Re: Wishlist for R2007b
- From: roberson@xxxxxxxxxxxxxxxxxx (Walter Roberson)
- Date: Tue, 4 Sep 2007 04:44:31 +0000 (UTC)
In article <fbhi9d$lck$1@xxxxxxxxxxxxxxxxxx>,
Paul H. Hunter <dont_bug_me_999@xxxxxxxxx> wrote:
> [1] Command window unicode: allow fprintf-style outputs
> other than English. For example, Asian scripts.
I could see that as being important to some people, but as far
as I am aware, neither my co-workers nor I have ever had reason to
even think of this as a nicety. One of those cases,
I suspect, where "Those who need it, need it, and those who
don't need it tend to think the time would have been better
spent on something else".
> [2] Command window font control: fprintf-style outputs with
> colors, font control and so on. Combine with [1]?
If I have a user interaction that needs colours or fonts, then
I would put it into a GUI window. If I'm outputting something to
the command window, then my assumption would be that the command
window has at least a 1 in 3 chance of being run in a non-graphical
environment, -nodesktop, with the only available user interactions
those that are representable by character streams that are interpreted
by the user's terminal or terminal emulator. There is an ANSI X3.64
standard for a small number of foreground and background colours
via escape sequences (8 of each); those are about the closest you are
going to get to a standard way of representing colour in terminal
emulators. There is, however, no sequence within X3.64 for
representing font choice.
Hence, my alternate suggestion would be to modify the desktop
command window to recognize X3.64 colour sequences; if that
were to happen, then anything beyond that would be for ease of
use; e.g., terminalcolor('red') could be short-hand for
[char(27) '[31m']
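A minimal sketch of what such a short-hand might look like, written in Python rather than Matlab purely so that it can be run anywhere. The names terminal_color, colored and RESET are hypothetical and not part of any Matlab API; the codes 30 through 37 are the eight standard foreground colours, and ESC [ 0 m returns to the default rendition.

```python
ESC = chr(27)  # the escape character, i.e. char(27) in Matlab

# The eight standard foreground colours and their numeric codes.
FOREGROUND = {
    "black": 30, "red": 31, "green": 32, "yellow": 33,
    "blue": 34, "magenta": 35, "cyan": 36, "white": 37,
}

RESET = f"{ESC}[0m"  # back to the terminal's default rendition


def terminal_color(name):
    """Return the escape sequence selecting the named foreground colour."""
    return f"{ESC}[{FOREGROUND[name]}m"


def colored(text, name):
    """Wrap text in a colour sequence followed by a reset."""
    return terminal_color(name) + text + RESET


print(repr(terminal_color("red")))  # '\x1b[31m'
```

Because the result is an ordinary character string, it can go to a terminal, a diary, or a log file without any special handling.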
There would, I can see, be an immediate temptation to say,
"Ah, but if you code it as a routine call, then the routine could
check to see whether the desktop is in use or not and could
make the appropriate internal changes directly if the desktop
is active, or could use some other internal code sequences or
whatever was appropriate for the terminal (e.g., using
'curses' to fetch the colour sequences)." The problem with this
is that if one is using a script or otherwise making some kind
of log file, then if colour or font changes are done internally
for the desktop, or are done through some made-at-mathworks
control sequence, then the logged version would reflect some
non-standard method that would likely be a mess to recreate
(or remove) afterwards. If I write an ANSI X3.64 colour sequence
to a file, then it is easy to see that sequence in context;
just send the text to any of a number of terminal emulators and
it will be rendered. (On the other hand, text coloured with
escape sequences is a bit of a nuisance to parse automatically,
as it is common to mix colours within a single token, such as to
indicate the accelerator character for a menu.) Font changes
don't have that advantage; there is no standard for representing
those as escape sequences.
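To illustrate the "(or remove)" part: with standard colour sequences, cleaning a log afterwards is a single pattern substitution. This is only a sketch in Python; the name strip_colour is hypothetical, and the pattern covers only the colour/rendition sequences (ESC [ ... m), not other terminal control sequences.

```python
import re

# Colour/rendition sequences have the form ESC [ <digits and semicolons> m
SGR = re.compile(r"\x1b\[[0-9;]*m")


def strip_colour(text):
    """Remove colour escape sequences, leaving the plain text behind."""
    return SGR.sub("", text)


# A menu entry whose accelerator character "F" is coloured on its own:
logged = "\x1b[31mF\x1b[0mile"
print(strip_colour(logged))  # File
```

The example also shows the parsing nuisance mentioned above: until the sequences are stripped, the token "File" is split in two by the colour change.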
> [4] Fast kill option. Too many times matlab locks in a deep
> calculation I'd like to quickly kill (sometimes even
> taskmangler can't kill it). Use a separate process to
> monitor the 'fast-kill' debug button. If this is not
> realistic, allow matlab to be queried by a command window to
> determine running processes within it and allow them to be
> individually terminated. Without bringing the entire thing down.
I don't understand. Matlab always runs as a single process, unless
you are using the multiprocessor toolbox (MPI extensions). Newer
Matlab versions support multiple -threads- within the same process; being
able to examine individual -threads- is not, I think, particularly
useful. I guess it could happen that one of the threads happened to
get caught in a tight loop while the other calculation threads waited
for it to catch up, but has that proven to be a substantial problem?
If you'd talked about killing a Java thread or ActiveX control,
or closing a file or socket or forked process such as dos()
or unix() or system() or perl(), I would find that more obvious.
> [0] Optional stack and heap control. If a calculation will
> run out of memory, allow the ability to optionally use disk
> space as virtual memory **independent** of the OS. For
> example, inform Matlab that K:\ (windows) which has a 250 GB
> drive can be used for up to 50 GB (say) of virtual memory.
> Let matlab open memory-mapped file(s) to use as "last
> resort" memory.
Hmmm, as you specify "independent of the OS", then you
want this on the 32 bit Windows version as well as on the
64 bit versions.
Are you aware that when you memory-map a file, you must have enough
virtual address space in your process to address all of the
file? That if you tell Matlab to use a 50 GB memory-mapped file,
that would only work on systems with at least 50 GB unused in the
process virtual address space, which in turn can only happen if the
process address space is at least 36 bits?
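The 36-bit figure is just arithmetic, sketched here in Python. The helper name address_bits_needed is hypothetical, and GB is taken to mean 2^30 bytes.

```python
GB = 2**30  # assumption: "GB" means 2^30 bytes


def address_bits_needed(n_bytes):
    """Smallest pointer width, in bits, that can address n_bytes distinct bytes."""
    # The highest address used is n_bytes - 1; its bit length is the answer.
    return max(n_bytes - 1, 0).bit_length()


print(address_bits_needed(50 * GB))  # 36
print(address_bits_needed(4 * GB))   # 32
```

2^35 bytes is only 32 GB, so 35 bits fall short of 50 GB, and a 32-bit pointer tops out at exactly 4 GB.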
If you don't have 50 GB of virtual address space available, then
in order to use those additional objects, they must be brought into
the virtual address space (even if that virtual address space
is living on disk rather than in RAM.) That would require swapping
other things out of the virtual address space.
I'm not just talking about the traditional "swap to disk": traditional
swap to disk still has the data within the virtual address space. What
I'm talking about is running out of pointers -- for example, in 32 bit
Windows, you simply cannot point at more than 4 GB between *all* of
your pointers, because that's all that fits in 32 bits. And programs
cannot just tell the OS that they want to start using bigger pointers:
pointers live at the hardware level.
So if my virtual address space is full and I need to access (say) a 1
GB object, then Matlab would have to go further than just writing
objects to disk: it would have to actually invalidate the pointers to
those objects, by recording the data on disk and then in the object
descriptor scribbling over the pointer (a memory address) with some
kind of disk-memory locator. And Matlab shares pointers ("copy on
write"), so it would have to track down all the objects that refer to
that memory block and switch them all to disk-memory locators (known in
computing as "cursors").
Data locked up in handle graphics cannot be exempt, as I might
have plotted a million or more points, and if it is time
to do an arithmetic calculation and I need the memory, then
the arithmetic calculation has to take priority over the
graphics; the graphics might not need to be refreshed for minutes
or hours yet.
But if I have Windows 32 bit and I have two 1 GB matrices to multiply
together producing a 1 GB answer, then that needs 3 GB of
virtual address space made available, 1 GB for each of two
sources, and 1 GB for the result. That's a problem on an OS that
only allows 3 GB to be addressed. So to deal with that, Matlab would
have to invest heavily in routines to do things like "blocked matrix
multiply" -- routines that aren't needed on Windows 64 bit or
Sun OS or Linux 64. Okay, routines that aren't needed -as much-,
since there are a lot of routines that can be much more efficient
if written to do blocking specifically rather than counting on
the user having more than 4 or 8 or whatever GB of RAM when they
want to work with large objects. But having customers buy more RAM
if they happen to need large objects is -cheap- compared to Mathworks
rewriting all of Matlab to handle blocking.
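For anyone who has not met the term, here is a rough sketch of a "blocked matrix multiply" in plain Python. It is not how Matlab or any BLAS library implements it; the name blocked_matmul and the block parameter are hypothetical. Everything here lives in memory, so only the loop structure is illustrated. In an out-of-core version, each tile of A, B and C would be fetched from disk and written back as the outer loops advance, so only three tiles would need address space at any one time.

```python
def blocked_matmul(A, B, block=2):
    """Multiply A (n x k) by B (k x m), working on block x block tiles."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    # Outer loops walk over tiles; an out-of-core version would
    # load the tiles of A and B needed for this step right here.
    for i0 in range(0, n, block):
        for j0 in range(0, m, block):
            for p0 in range(0, k, block):
                # Inner loops do an ordinary multiply within the tiles.
                for i in range(i0, min(i0 + block, n)):
                    for p in range(p0, min(p0 + block, k)):
                        a = A[i][p]
                        for j in range(j0, min(j0 + block, m)):
                            C[i][j] += a * B[p][j]
    return C


A = [[1, 2, 3], [4, 5, 6]]
B = [[7, 8], [9, 10], [11, 12]]
print(blocked_matmul(A, B))  # [[58.0, 64.0], [139.0, 154.0]]
```

Multiply that rewrite by every numeric routine in the product and the cost comparison with "buy more RAM" becomes clear.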
Since all of this is supposed to be OS independent, and you
mentioned Windows specifically, then you want to be able to work with
objects whose individual sizes exceed 2^32 B (4 GB) of contiguous
memory. And that in turn would require not only that Matlab
be rewritten to be able to swap out complete objects (or
columns of data): it would require that Matlab be rewritten so
that columns were no longer "single pointer, and column dimension":
instead, all columns would have to be transformed to have a list
of memory descriptors that -together- span the column data.
A list, because Matlab would only be able to load fragments of a
column at a time, and it might have to create and merge those
fragments together on the fly. "Okay, there's a reference to
column pi*1E8.... okay I have room for 256 KB of that at the
moment, so that segment will have to be marked in-memory. Oh, now
they skipped backwards by a million... let's see, I'll swap the editor
out of virtual memory and that'll free up 128 KB, so I'll
fragment the on-disk descriptor before that, marking that object
as being out for the first few hundred megs, then present for
128 KB, then on disk for the 3/4 MB after that, then in memory
for 256 KB, then on disk for the rest..."
This. Would. Be. A. Mess.
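To give a feel for the bookkeeping alone, here is a toy sketch in Python of a column described by a list of segments, each either resident or on disk. All of the names (Segment, locate, mark_resident) are hypothetical; nothing here reflects Matlab internals, and the sketch does not even attempt the actual I/O, eviction, or merging of adjacent resident segments.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: int       # first element index covered by this segment
    length: int      # number of elements covered
    in_memory: bool  # True if resident, False if only on disk


def locate(segments, index):
    """Return the segment covering element `index` of the column."""
    for seg in segments:
        if seg.start <= index < seg.start + seg.length:
            return seg
    raise IndexError(index)


def mark_resident(segments, start, length):
    """Split segments so that [start, start + length) is marked in-memory."""
    end = start + length
    out = []
    for seg in segments:
        s, e = seg.start, seg.start + seg.length
        if e <= start or s >= end:
            out.append(seg)  # no overlap, keep as-is
            continue
        if s < start:        # piece before the resident range
            out.append(Segment(s, start - s, seg.in_memory))
        lo, hi = max(s, start), min(e, end)
        out.append(Segment(lo, hi - lo, True))
        if e > end:          # piece after the resident range
            out.append(Segment(end, e - end, seg.in_memory))
    return out


column = [Segment(0, 1000, False)]         # everything on disk
column = mark_resident(column, 300, 100)   # bring elements 300..399 in
print(len(column))                         # 3
print(locate(column, 350).in_memory)       # True
```

Every element access in every routine would have to go through something like locate() first, which is the real cost.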
Now, imagine the changes that would have to be made to all those
existing MEX routines to handle this -- they'd have to be
rewritten for a completely different processing model. They'd have
to work with those arrays bigger than virtual memory could possibly
fit, so they'd all be hit with the same hassles, each and every
one of them.
Considering the technical problems involved, and considering the
alternatives (use a 64 bit version; buy as much RAM as you need
for efficient use of the objects you need; let the OS handle
the swapping), I think that I would have to consider this
proposal to be ill-advised.
--
Is there any thing whereof it may be said, See, this is new? It hath
been already of old time, which was before us. -- Ecclesiastes