Re: FLAC or other uncompressed formats, which is best?



Hi Richard,

The asterisks spoke, because I've configured the screen
reader to speak them, as people use them to set text apart,
or the dreaded footnotes of course.

OK, so my use of them for emphasis detracts from your comprehension
instead of adding to it!

But, the all caps didn't, at least when just reading.

Now, were I editing, capital letters are spoken with a bit
of a raised inflection, but, when read as words they're not.

When just in the reader, not editing, for example, reading
usenet articles, a book that's text, or similar, I have most
punctuation disabled so sentences sound normal.

Understood. But, would you have any cues that there might
be some punctuation that you might want to see which has
been silenced? For example, the cartoonish way of showing
a pejorative as a jumbled sequence of ad hoc punctuation
marks like $(&*^@#$!

On a different note, I assume you are also victimized by
spelling errors? For example, I tend to end up transposing
pairs of letters simply because one finger finds its way
to a key before the other -- which should have preceded it.
Like teh instead of the.

Cutting to the chase on some of this, colors aren't spoken
at all. I can have the screen reader, and most screen
readers, monitor a portion of the screen, say a status line,
for either a change in text displayed there, or a change in
attributes.

Does it simply speak each change encountered? What if the
time to speak the information exceeds the time between
changes? For example, a timer counting down seconds remaining
until a task's completion. But, "one minute forty nine seconds"
takes longer to speak than the time for the display to change
to "one minute forty eight seconds". Likewise, is the
screen reader preoccupied with this task or can it also
let you wander around to other parts of the screen while
it is monitoring that section?
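
A minimal sketch of one possible answer to the countdown case, assuming a "newest update wins" policy. This is not how any particular screen reader is known to behave; the class and method names are hypothetical. The monitor posts every change, the speech side takes the next item only when it is ready, and anything that changed in between is dropped rather than queued.

```python
import threading


class LatestValueSlot:
    """Holds only the newest unspoken text; older pending text is dropped."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = None
        self.dropped = 0  # count of updates that were never spoken

    def post(self, text):
        """Called by the monitor each time the watched region changes."""
        with self._lock:
            if self._pending is not None:
                self.dropped += 1  # previous update is overwritten unspoken
            self._pending = text

    def take(self):
        """Called by the speech side when it is ready for the next utterance."""
        with self._lock:
            text, self._pending = self._pending, None
            return text


# Three changes arrive while the synthesizer is still busy talking.
slot = LatestValueSlot()
for display in ("1:49", "1:48", "1:47"):
    slot.post(display)

spoken = slot.take()  # only the most recent display text is left to speak
```

Because the slot is guarded by a lock, the monitor can run in its own thread, which would leave the user free to move around other parts of the screen.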

That's another reason I like ASAP. If, for
example, I want a bit of a different configuration which is
more compatible with a drop down menu in an app, we watch
for the status line to change. If it changes to x, load y
configuration, etc.
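
The "if it changes to x, load y" rule can be sketched as a small lookup. This is plain Python for illustration, not the actual configuration syntax of any screen reader; the status strings and file names are made up.

```python
# Hypothetical rules: status-line text -> configuration to load.
CONFIG_FOR_STATUS = {
    "MENU": "menu.cfg",
    "EDIT": "edit.cfg",
}


def config_on_change(previous, current, rules=CONFIG_FOR_STATUS):
    """Return the configuration to load, or None if nothing should change."""
    if current == previous:
        return None  # status line did not change, keep the current setup
    return rules.get(current)  # None when no rule matches the new text
```

For example, `config_on_change("EDIT", "MENU")` selects `"menu.cfg"`, while an unchanged or unrecognized status line leaves the loaded configuration alone.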

Ah, OK.

You're right, it takes a bit of extra work to make screen
access technology play with what a person might just
download or use off the shelf. That's why I like things
with textual configuration scripting or guidance, and the
ability to make the program operate as I want it to operate
<g>.

Understood.

Oh, OK. Some software will note the "email context"
and try to actually keep track of this for you -- using
different voices for each party, etc. Though not their
ACTUAL voices!

No different voices, I just have the greater than symbol in
my punctuation exceptions for anything that's a mail or
usenet reader app, so it says "greater" then the line of
text.

OK. Obviously even using different voices has a small upper
limit. Keeping track of three different parties in
quoted text would probably leave you distracted by those
voices instead of aided.

So, folks who just quote entire posts and bottom post their
replies are just as bad as folks who top post. Each
is equally hard for you to put back into context (you have to
REMEMBER what was said and remember the reply and then thread them
together in your mind).

Yep, you read through all that, or turn off the quote
filtering and see three or four screens of quoted material for
a two-line reply <g>. One reason braille will always be
superior: the ability to skim.

This is directly analogous to reading printed text.
You can cherry pick through large amounts of information
with relatively little effort.

[jaws copy protection]

Yep, see my other post. My stumbling block came with it
mainly when I wanted to install a copy on mom's machine so I
could help her maintain it, a copy on the studio control
room machine, and one in the office. I'd *never* be using
all three simultaneously. Then when I had a system crash on
one system my install key floppy didn't play. That copy
protection has been hacked, but I refuse to play that game
for obvious ethical reasons. Others may, but I respect
intellectual property rights. Ted Henter worked a long time
to develop it, and though I may not like his protection
scheme, that doesn't give me the right. You know the drill.

Yup. This is the dark side of illegal copying. It forces
authors to waste effort protecting their works. And, screws
legitimate users out of the ability to use the product
"fairly".

For example, if I have three computers but only use
one at a time, the morally correct thing is to have one
license. But, how does the author ensure that I really
*am* using just one at a time? How does the author
ensure that the "second computer" isn't a friend's
computer?

The Mercator project (now defunct) tried to layer a speech
interface *under* (not ON TOP OF!) the GUI in UNIX. I.e.,
it replaced the standard GUI libraries with speech-enabled
ones with which the "screen reader" could interact. So,
it knew that "these buttons are part of a group of RADIO
BUTTONS governing this particular option choice", and
"this text box expects a numeric value that specifies
the age of the person", etc.

Yeah, there were a couple like that; I had heard of that one,
and the Speaqualizer project. Both were failures in the
marketplace. Mercator may have never made it to market, but
they tried with Speaqualizer for a while.

Mercator was an academic project. Yet another example of
people thinking that there would be a "simple" way to
address this problem.

The only simple way to address the problem is NOT to provide
a visual interface! So, applications ALL have to rely on
the same non-visual interface to interact with their users!

If you provide a nonvisual OPTION, then applications will
only give token support -- if any -- to it. On the other
hand, if the only way to get information out of a device is
via that option, then they don't have a choice! This is
the approach I have been taking, lately. Pick an output
modality that addresses everyone in the target audience
and force everything to use that single mechanism!

Festival -- a free package probably available under Linux -- has a
lot of "context modules" that try to alter the rules for
pronunciation based on context. I.e., so email addresses
are pronounced as "Richard dot Webb dot my dot foot at ..."
instead of some unpronounceable jumble of letters and symbols. For
example, the C C header would be pronounced as "carbon
copy", etc.
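
To make the idea concrete, here is a rough sketch of that kind of context-dependent rewriting. It is plain Python, not Festival's own code or API; the regular expression, function names, and header table are assumptions chosen for illustration, and real address syntax is considerably messier.

```python
import re

# Simplified pattern for an email address (an assumption, not a full parser).
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")

# Hypothetical table of header names and their spoken forms.
HEADER_NAMES = {"cc": "carbon copy", "bcc": "blind carbon copy"}


def speak_email(address):
    """Spell out an address: dots become 'dot', the @ sign becomes 'at'."""
    local, domain = address.split("@", 1)
    spoken_local = " dot ".join(local.split("."))
    spoken_domain = " dot ".join(domain.split("."))
    return f"{spoken_local} at {spoken_domain}"


def normalize(text):
    """Replace every address found in the text with its spoken form."""
    return EMAIL.sub(lambda match: speak_email(match.group(0)), text)


def speak_header_name(name):
    """Expand a known header name; unknown names pass through unchanged."""
    return HEADER_NAMES.get(name.lower(), name)
```

With these rules, `normalize("Mail jane.doe@example.com now")` yields "Mail jane dot doe at example dot com now", and `speak_header_name("CC")` yields "carbon copy".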

Yep, which was I think why so many complained when the
National Weather Service went with DECtalk speech synths
for their VHF radio forecasts.

The backup speech synthesizer in one of my products has similar
quality issues as Klatt's DECtalk (he wrote it while a student
and DEC commercialized it). Its biggest advantage is that
it is pretty lean when it comes to resources -- which translates
directly to implementation costs and reliability.

I have a DECtalk DTC01 and a DECtalk Express. Plus a few
of the Artic/Votrax-based synthesizers. All have the same
basic advantage -- and the same robotic speech quality!

On the other hand, Festival has a huge footprint. And, is
considerably easier to crash than DECtalk. Where DECtalk
and the other "simple" synthesizers will take a stab at
pronouncing damn near anything you throw at them, Festival
will chew on it for a fair bit of time before committing
to a pronunciation -- which can be just as wrong as the
other products!

OK. Any particular reason why you're married to that machine?

I'd like to have the RAID array for the server, and yes, once
we've relocated, a net connected server is part of the battle plan.
RAID would be nice. I've got another box which is going to
be dedicated to firewall/router duties, but would like to
keep that one as the server, which was what it did in its former life.

Does the chassis force you to use a certain type of
disk drives? E.g., because of disk carriers? Many
older RAID offerings require SCSI disks. Would you
be happy with RAID in some other form?

Interesting that you learned braille. I'd be lost without
it.

When I worked for Kurzweil, I was dealing with visually
impaired customers AT BEST! Seems disrespectful not to
learn to communicate in the form that THEY require.
I.e., doesn't do me much good to leave a handwritten note
telling them "I'll be back after lunch"!

[sightless V U meter]

Ah, OK. Clever.

Yep, for some plans and simple designs, go to ski.org and
download sktf.zip. It's about a 2 MB zip file with multiple
directories of text files on lots of things: home brewing
adaptive VU solutions, soldering jigs, all sorts of stuff.

OK.

OK. So, this is "yet another DEVICE" that you have.
Like a tactile wristwatch, braille slate, talking
calculator, etc. I.e., it is designed for ONE PURPOSE.

Yep, used to have the talking calculator, but now just use a
little command line calculator I found on the net some years
ago. Or do a lot of math in my head when out and about.

Yeah, I had given a lot of thought to how you provide
a means for letting folks review their calculations
without being able to view a "tape".

Understood. But, this can be done with different approaches!
For example, one approach is to always reset things to
"the beginning" -- or some other known state. Another
approach is to leave things where you last left them
on the assumption that you will want to do the same sort
of thing, again.

Maybe, but some devices, such as Roland's sound modules, like
to remember where you were last time, and heck, it might be
a week before I want to delve into its menus again, and I
might not remember where I was last time.

Ah, OK. No, I think a device should remember what
you did "last time" -- but, only while you are actively
and continuously using it. If you want to be able
to return to a certain set of options some days later,
you should be able to save those options and explicitly
restore them. If you turn the device off and start
over tomorrow, then everything should revert to some
default -- perhaps even one that YOU have defined
instead of that which the manufacturer has defined.
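
That policy can be sketched in a few lines, assuming three kinds of state: the live settings for the current session, explicitly saved presets, and a power-on default the user may override. The class and method names are hypothetical and not taken from any real device.

```python
class DeviceSettings:
    """Session state, named presets, and a user-definable power-on default."""

    def __init__(self, factory_defaults):
        self._factory = dict(factory_defaults)
        self._user_default = None  # set only if the user defines one
        self._presets = {}
        self.current = dict(factory_defaults)

    def set(self, key, value):
        """Change a setting; it persists while the device stays on."""
        self.current[key] = value

    def save_preset(self, name):
        """Explicitly save the current settings for a later day."""
        self._presets[name] = dict(self.current)

    def restore_preset(self, name):
        """Explicitly bring back a saved set of options."""
        self.current = dict(self._presets[name])

    def define_default(self):
        """Make the current settings the user's own power-on default."""
        self._user_default = dict(self.current)

    def power_cycle(self):
        """Turning off and on again returns to a known state."""
        base = self._user_default if self._user_default is not None else self._factory
        self.current = dict(base)
```

The point of the split is that nothing survives a power cycle by accident: a week later the device starts from a known place, and last week's setup is one explicit restore away.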

[computer interface with speech in a live environment]

What can you suggest as an alternative? Is the problem
the quality of the voice? Or the masking effects of
all that music in the background?

Yep, the music, and I'm supposed to be giving my ears to the
audio. Also, you often can't amplify speech in an earbud loud
enough unless you're doing bad things to the ear
canal.

Understood. You want a different communication channel
to interact with the device instead of having to share
the audio channel that you are devoting to the task at
hand.

I've used the same rationale to argue in favor of using
non-visual channels for visual tasks! I.e., those cases
where your eyes are busily engaged in some activity and
shouldn't have to be pulled away just so you could see
which virtual button you were pressing on your iPhone!