FLAC or other uncompressed formats, which is best?



On Fri 2012-May-25 20:46, Don Y writes:
The asterisks were spoken because I've configured the screen
reader to speak them, as people use them to set text apart,
or for the dreaded footnotes, of course.

OK, so my use of them for emphasis detracts from your comprehension
instead of adding to it!


But, the all caps didn't, at least when just reading.

Now, were I editing, capital letters would be spoken with a bit
of a raised inflection, but when read as words they're not.

When just in the reader, not editing -- for example, reading
usenet articles, a book in plain text, or similar -- I have most
punctuation disabled so sentences sound normal.

Understood. But would you have any cue that there might
be some punctuation you might want to know about which has
been silenced? For example, the cartoonish way of showing
a pejorative as a jumbled sequence of ad hoc punctuation
marks like $(&*^@#$!

I note I saw those in my reader when reading, not editing,
this reply. So obviously I had at some point made them
exceptions, probably because of the common use of many of
those symbols elsewhere, such as @ in email addresses, # as
the pound symbol, etc. I probably have much more enabled
in a usenet/mail/BBS reader application than I would in a
straight text reader that I'd use to read a novel <g>.
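
As an aside, the shape of that logic is simple enough to sketch.
Something like the following -- a hypothetical sketch, not any
actual screen reader's code -- is roughly what sits between the
text and the synthesizer:

  # Symbols this application should announce by name; a novel
  # reader would likely have an empty table here.
  SPOKEN_NAMES = {"@": " at ", "#": " pound "}

  def prepare_for_tts(text: str) -> str:
      """Expand exception symbols into words; leave the remaining
      punctuation in place so it can shape prosody without being
      named aloud."""
      return "".join(SPOKEN_NAMES.get(ch, ch) for ch in text)

  print(prepare_for_tts("reach me at don#42@foo.bar"))
  # -> "reach me at don pound 42 at foo.bar"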

On a different note, I assume you are also victimized by
spelling errors? For example, I tend to end up transposing
pairs of letters simply because one finger finds its way
to a key before the other -- which should have preceded it.
Like teh instead of the.

I'm victimized more by makign them <g>.

Cutting to the chase on some of this: colors aren't spoken
at all. I can have the screen reader -- and most screen
readers can do this -- monitor a portion of the screen, say a
status line, for either a change in the text displayed there
or a change in attributes.

Does it simply speak each change encountered? What if the
time to speak the information exceeds the time between
changes? For example, a timer counting down the seconds
remaining until a task's completion: "one minute forty nine
seconds" takes longer to speak than the time it takes the
display to change to "one minute forty eight seconds".
Likewise, is the screen reader preoccupied with this task,
or can it also let you wander around to other parts of the
screen while it is monitoring that section?

Usually I'll tell the screen reader app to remain silent,
and park the speech cursor over that display. If that display
is going to be changing a lot I tell the screen reader to
ignore that line entirely, and only force it to go there
when I want to look at it. Usually I'll use monitoring of a
status line to tell it when to change configurations, e.g.
when the status line changes from y to x, load a configuration
which tracks a light bar as your focus of where you are,
etc. Rather complex, and probably extremely boring to
the folks here. We should probably take this line to email <g>.
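
For the curious, Don's countdown problem -- updates arriving
faster than they can be spoken -- is usually handled by
coalescing: speak only the most recent value and silently drop
anything stale. A rough sketch, not any particular screen
reader's code:

  import threading

  class StatusLineMonitor:
      """Speak only the newest value of a watched region; updates
      that arrive while speech is busy are dropped, so a fast
      countdown never builds a backlog."""

      def __init__(self, speak):
          self.speak = speak                # blocking TTS call
          self.latest = None
          self.cond = threading.Condition()
          threading.Thread(target=self._worker, daemon=True).start()

      def update(self, text):               # called on every change
          with self.cond:
              self.latest = text            # overwrite unspoken value
              self.cond.notify()

      def _worker(self):
          while True:
              with self.cond:
                  while self.latest is None:
                      self.cond.wait()
                  text, self.latest = self.latest, None
              self.speak(text)              # may outlast several updates

A countdown that ticks every second but takes two seconds to
speak then skips values instead of falling behind, and the
reader stays free for other work.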

<snip>
<big snip>
one reason braille will always be
superior: the ability to skim.

This is directly analogous to reading printed text.
You can cherry pick through large amounts of information
with relatively little effort.

Indeed, and, believe it or not, I retain what I read better,
as well as read fast, using it. In most cases, synthesized
speech is the most cost-effective compromise that can be
achieved, and the most effective in other ways too. Braille
displays are clunky, hard to maintain, and don't achieve good
reading speed or an efficient work flow, unless you're a
customer service rep dealing with both the computer and
customers on the phone. After all, they take your hands
away from the keyboard, they can display a very limited
amount of text at one shot, all them little solenoid springs
and mechanical parts ... <aaargh>

Yup. This is the dark side of illegal copying. It forces
authors to waste effort protecting their works. And, screws
legitimate users out of the ability to use the product
"fairly".

For example, if I have three computers but only use
one at a time, the morally correct thing is to have one
license. But, how does the author ensure that I really
*am* using just one at a time? How does the author
ensure that the "second computer" isn't a friend's
computer?

Exactly what I ran into with it. My mom didn't want a
screen reader, but was glad to have my help maintaining her
system, often without her having to stand over my shoulder
and play screen reader. Of the two machines at the studio, I'd
only be using one of them at a time, and the owner of said
studio didn't want a screen reader either. I didn't use the
product at all at home; I was beta testing a competitor's
screen reader for the GUI environment, in fact. I just
didn't think it was fair to my employer to use a beta at
work.

The Mercator project (now defunct) tried to layer a speech
interface *under* (not ON TOP OF!) the GUI in UNIX. I.e.,
it replaced the standard GUI libraries with speech-enabled
ones with which the "screen reader" could interact.
<snip>
Yeah, there were a couple like that; I had heard of that one,
and the Speaqualizer project. Both were failures in the
marketplace. Mercator may have never made it to market, but
they tried with the Speaqualizer for a while.

Mercator was an academic project. Yet another example of
people thinking that there would be a "simple" way to
address this problem.
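
For what it's worth, the architectural idea reduces to something
like this toy sketch (mine, not Mercator's): the toolkit's own
widgets announce themselves, so nothing has to reverse-engineer
the display afterward.

  # Toy illustration of speech *under* the GUI: the widget library
  # itself reports state changes, so every application built on it
  # is speech-enabled without a screen reader bolted on top.
  class SpeakingButton:
      def __init__(self, label, speak):
          self.label = label
          self.speak = speak            # injected TTS callback

      def focus(self):                  # invoked by the focus manager
          self.speak(f"button, {self.label}")

      def press(self):
          self.speak(f"{self.label} pressed")

  btn = SpeakingButton("Save", speak=print)
  btn.focus()                           # -> button, Save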

Yep, and the Speaqualizer tried to do it as an integral part
of the hardware: you get speech as soon as the machine boots,
giving you access to the BIOS, etc.

The only simple way to address the problem is NOT to provide a
visual interface! So, applications ALL have to rely on
the same non-visual interface to interact with their users!

If you provide a nonvisual OPTION, then applications will
only give token support -- if any -- to it. On the other
hand, if the only way to get information out of a device is
via that option, then they don't have a choice! This is
the approach I have been taking, lately. Pick an output
modality that addresses everyone in the target audience
and force everything to use that single mechanism!

Indeed, and this is what we're finding with a lot of web
portals that do things that are only usable with vision.
Anyone who's doing web development should look at a series
of articles discussing just this issue in this month's
Braille Monitor, available in text, no doubt, from
www.nfb.org.

Festival -- a free package probably available under Linux -- has a
lot of "context modules" that try to alter the rules for
pronunciation based on context. I.e., so email addresses
are pronounced as "Richard dot Webb dot my dot foot at ..."
<snip>
Yep, which I think is why so many complained when the
National Weather Service went with DECtalk speech synths
for their VHF radio forecasts.
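
The kind of context rule Don describes is easy to picture; a toy
version (mine, not Festival's) that rewrites email addresses
before the pronunciation rules run might look like:

  import re

  EMAIL = re.compile(r"\b([\w.]+)@([\w.]+)\b")

  def expand_email(text: str) -> str:
      """Rewrite user@host as speakable words before the
      letter-to-sound rules ever see it."""
      def speak(m):
          user = m.group(1).replace(".", " dot ")
          host = m.group(2).replace(".", " dot ")
          return f"{user} at {host}"
      return EMAIL.sub(speak, text)

  print(expand_email("write Richard.Webb.my.foot@example.org"))
  # -> "write Richard dot Webb dot my dot foot at example dot org"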

The backup speech synthesizer in one of my products has
quality issues similar to Klatt's DECtalk (he wrote it while
a student and DEC commercialized it). Its biggest advantage is
that it is pretty lean when it comes to resources -- which
translates directly to implementation costs and reliability.

Did you ever check out that kid a few years ago who made a
DECtalk sing? He spent some serious time coding that; IIRC
the kid was only 16 years old or so when he did it. I used
to have a URL for it, but it disappeared in Katrina. I can't
even recall his name, it's been so long.


I have a DECtalk DTC01 and a DECtalk Express. Plus a few
of the Artic/Votrax-based synthesizers. All have the same
basic advantage -- and the same robotic speech quality!

Yep, as does my DoubleTalk. I had, before Katrina, two
DoubleTalk internal cards, a DoubleTalk Lite external, and
an Audapter.

On the other hand, Festival has a huge footprint. And it is
considerably easier to crash than DECtalk. Where DECtalk
and the other "simple" synthesizers will take a stab at
pronouncing damn near anything you throw at them, Festival
will chew on it for a fair bit of time before committing
to a pronunciation -- which can be just as wrong as the
other products'!

This is why a lot of the screen reader developers and speech
synth developers sort of "shared the load", you might say.
Common pronunciation at the phoneme level is often handled
by ROM within the synthesizer itself; exceptions and the
like are handled by the software on your hard disk.
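
That division of labor is basically a two-stage lookup. A hedged
sketch, with invented entries rather than any vendor's actual
tables:

  # The host software's exceptions dictionary is consulted first;
  # anything it doesn't know falls through to the synthesizer's
  # ROM letter-to-sound rules (stubbed here as a callback).
  EXCEPTIONS = {
      "SCSI": "skuzzy",            # hypothetical entries
      "TIA": "T I A",
  }

  def pronounce(word, rom_rules):
      return EXCEPTIONS.get(word) or rom_rules(word)

  # stand-in for the engine in the synthesizer's ROM
  rom = lambda w: w.lower()
  print(pronounce("SCSI", rom))    # -> skuzzy  (exception wins)
  print(pronounce("hello", rom))   # -> hello   (ROM rules fall back)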

OK. Any particular reason why you're married to that machine?

I'd like to have the RAID array for the server, and yes, once
we've relocated, a net-connected server is part of the battle
plan. RAID would be nice. I've got another box which is going
to be dedicated to firewall/router duties, but I would like to
keep that one as the server, which was what it did in its
former life.

Does the chassis force you to use a certain type of
disk drive? E.g., because of disk carriers? Many
older RAID offerings require SCSI disks. Would you
be happy with RAID in some other form?

I doubt the chassis does; I'd have to look inside the box.
The most I did with it when it was given to me was boot it up
once. If we could get RAID in some other form, that would
be cool too. We'd lose those two big SCSI hard drives then and
have to do something else with them, but ... Just trying to
use what's existing in the box with minimal $$$ outlay,
if possible. Still, it might be worth doing that to get
totally away from Windows as the server app. I'll have to
weigh the pros and cons of that one when we get to it. Right
now that machine is sitting in a storage unit with some of
those little desiccant packages inside the case <g>.
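
If that box ends up running Linux instead of Windows, "RAID in
some other form" can simply be software RAID on whatever drives
the chassis will take. A rough example with mdadm (device names
are placeholders):

  # mirror two disks (RAID 1) in software; /dev/sdb1 and /dev/sdc1
  # stand in for whatever drives actually end up in the box
  mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1
  mkfs.ext4 /dev/md0
  mdadm --detail /dev/md0      # check that the array is healthy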

DY> [sightless VU meter]

Ah, OK. Clever.

Yep. For some plans and simple designs, go to ski.org and
download sktf.zip. It's about a 2 MB zip file, multiple
directories, but text files on lots of things: home-brewed
adaptive VU solutions, soldering jigs, all sorts of stuff.

OK.

OK. So, this is "yet another DEVICE" that you have.
Like a tactile wristwatch, braille slate, talking
calculator, etc. I.e., it is designed for ONE PURPOSE.

<snip>
Yeah, I had given a lot of thought to how you provide
a means for letting folks review their calculations
without being able to view a "tape".

Believe it or not, I've done a lot of this in batch scripts
<g>.
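
The "tape" is really just a log of every step that can be
re-read (or re-spoken) on demand; in any language it reduces to
a sketch like this:

  # Toy "paper tape" calculator: every operation is appended to a
  # log the user can review at leisure.
  tape = []

  def calc(op, a, b):
      result = {"+": a + b, "-": a - b, "*": a * b}[op]
      tape.append(f"{a} {op} {b} = {result}")
      return result

  calc("+", 12, 30)
  calc("*", 42, 2)
  for line in tape:            # reviewing the tape
      print(line)
  # 12 + 30 = 42
  # 42 * 2 = 84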

Re menus in devices ...
Maybe, but some devices, such as Roland's sound modules, like
to remember where you were last time, and heck, it might be
a week before I want to delve into a module's menus again, and I
might not remember where I was last time.

Ah, OK. No, I think a device should remember what
you did "last time" -- but, only while you are actively
and continuously using it. If you want to be able
to return to a certain set of options some days later,
you should be able to save those options and explicitly
restore them. If you turn the device off and start
over tomorrow, then everything should resort to some
default -- perhaps even one that YOU have defined
instead of that which the manufacturer has defined.

Yeah, sounds like a good compromise. If I go back in there
during the same "session" it remembers where I last was.
Otherwise, it goes to the start.
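
That compromise is easy to model: position is remembered within
a power-on session, a user-chosen default applies across
sessions, and named setups are saved and restored explicitly. A
hypothetical sketch:

  import json

  class MenuState:
      """Remember the menu position during a session; start from
      the user's chosen default at power-on; allow explicit
      snapshots for "a week later"."""

      def __init__(self, user_default="main"):
          self.position = user_default     # power-on: user's default

      def visit(self, menu):               # remembered within a session
          self.position = menu

      def save_preset(self, path):         # explicit save
          with open(path, "w") as f:
              json.dump({"position": self.position}, f)

      def load_preset(self, path):         # explicit restore
          with open(path) as f:
              self.position = json.load(f)["position"]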

DY> [computer interface with speech in a live environment]

What can you suggest as an alternative? Is the problem
the quality of the voice? Or the masking effects of
all that music in the background?

Yep, the music, and I'm supposed to be giving my ears to the
audio. Also, you often can't amplify speech in an earbud loud
enough without doing bad things to the ear canal.

Understood. You want a different communication channel
to interact with the device instead of having to share
the audio channel that you are devoting to the task at
hand.

That's it exactly. I'm accustomed, as I said in another
post in this thread, to not having to do anything but directly
communicate with the device. If I want that channel strip
assigned to a certain bus or a certain VCA group, I push the
button. If I want audio from that channel on aux bus 3, I
adjust that aux send. I don't have to play "where the f*$@
am I?" It's automatic, like buttoning your coat, zipping your
pants, or tying your shoes.

I've used the same rationale to argue in favor of using
non-visual channels for visual tasks! I.e., those cases
where your eyes are busily engaged in some activity and
shouldn't have to be pulled away just so you could see
which virtual button you were pressing on your iPhone!

Uh huh! Just my point with some of these complex devices
that are made for people to operate while driving, etc.
Give them an auditory channel for the info; keep the eyes on
the road and the hands upon the wheel, please.


Regards,
Richard
--
| Remove .my.foot for email
| via Waldo's Place USA Fidonet<->Internet Gateway Site
| Standard disclaimer: The views of this user are strictly his own.
.