Re: String trim (was JavaScript Functions)
- From: Dr J R Stockton <jrs@xxxxxxxxxxxxxxxxxx>
- Date: Wed, 18 Feb 2009 19:22:17 +0000
In comp.lang.javascript message <k4ydnY53BoUhyAbUnZ2dnUVZ_umWnZ2d@giganews.com>,
Tue, 17 Feb 2009 19:26:36, kangax <kangax@xxxxxxxxx> posted:
Dr J R Stockton wrote:
In comp.lang.javascript message <XOGdnfc7I6_uoAfUnZ2dnUVZ_g4LAAAA@giganews.com>,
Mon, 16 Feb 2009 23:30:43, kangax <kangax@xxxxxxxxx> posted:
Dr J R Stockton wrote:
On a 3GHz PC, XP sp3, FF3, the following takes perceptible but
insignificant time to list all non-matches to \S : it could perhaps be
done better.
Why not just use Richard's test, posted earlier in this thread? It
tests client's \s against all of the whitespace characters (including
Unicode "space separators"). Doesn't it clearly demonstrate above
mentioned oddities?
Richard's test considers only the characters that he thinks should be
treated by \s as spaces, etc. Mine, much quicker to write, found all
characters that don't match \S in the current browser (it now uses
S.match(/\s/g)). The tests are logically distinct.
That list seems very logical to me. /\s/ (CharacterClassEscape :: s) is
clearly defined in ES3's 15.10.2.12. WhiteSpace (7.2), which /\s/
references, clearly lists all of the character code points. It also
mentions Unicode space separators. Those space separators are also
clearly defined in Unicode [1] under the White_Space section.
AFAICS, Richard's test says nothing about whether \s or \S matches
\u3000. Therefore, Richard's test cannot tell whether a browser is
fully compliant. Mine can, except for handling any character coding
outside 0x0000 to 0xFFFF.
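For readers who want to try the exhaustive approach, here is a minimal sketch of such a scan. It is not the code referred to above (that was not posted in this message); the function names scanWhitespace, scanInconsistent and toHex are made up for illustration. It walks every code unit from 0x0000 to 0xFFFF, records those that the engine's \s matches, and also records any code unit that is matched by both or by neither of \s and \S.

```javascript
// Sketch only: helper names are hypothetical, not taken from the thread.

// Every code unit in 0x0000..0xFFFF that this engine's \s matches.
function scanWhitespace() {
  var hits = [];
  for (var cp = 0; cp <= 0xFFFF; cp++) {
    if (/\s/.test(String.fromCharCode(cp))) {
      hits.push(cp);
    }
  }
  return hits;
}

// Every code unit matched by both, or by neither, of \s and \S.
// An engine whose two classes are exact complements returns an empty list.
function scanInconsistent() {
  var odd = [];
  for (var cp = 0; cp <= 0xFFFF; cp++) {
    var ch = String.fromCharCode(cp);
    if (/\s/.test(ch) === /\S/.test(ch)) {
      odd.push(cp);
    }
  }
  return odd;
}

// Format a code unit as 0xHHHH for display.
function toHex(cp) {
  var h = cp.toString(16).toUpperCase();
  while (h.length < 4) {
    h = "0" + h;
  }
  return "0x" + h;
}

var found = scanWhitespace();
console.log("\\s matches: " + found.map(toHex).join(" "));
console.log("inconsistent: " + scanInconsistent().map(toHex).join(" "));
```

The output depends on the engine it runs in, which is the point of the exercise. Characters outside 0x0000 to 0xFFFF are not covered.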
If there is a character, such as
cp:"6158", codePoint:"0x180E", character :"\u180E",
name:"MONGOLIAN VOWEL SEPARATOR", group:"Zs"
that NO browser recognises, that's not much of a worry for coders
(unless handling Mongolian) since testing on any browser will give the
same result.
Doesn't it make more sense to base tests on specs, rather than on some
vague subset of browsers? We can't really assert that "NO browser
recognizes" "MONGOLIAN VOWEL SEPARATOR"; neither can we test "all
browsers", can we?
You missed the stress in "if ... NO browser".
Test fully against specs to find out whether the tested systems are
compliant. Test browsers covering most of the market for Windows
browsers to find out what most (Windows) users will have in their
browsers. The tests are quite distinct.
However, after using my test, one only has to read the list of Unicode
whitespace characters to see how it compares with the result of my test.
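The comparison can also be done mechanically. The sketch below is an illustration under stated assumptions, not code from the thread: the EXPECTED list was assembled by hand from ES3's WhiteSpace and LineTerminator productions plus the Unicode "Zs" characters as they stood at the time, and the names EXPECTED, scan and compare are invented here. Later Unicode versions moved U+180E out of Zs, and ES5 added U+FEFF to WhiteSpace, so adjust the list to whichever spec you are testing against.

```javascript
// ASSUMPTION: hand-assembled list (ES3 WhiteSpace + LineTerminator + Zs
// as of the Unicode version current in 2009). Verify against the spec you target.
var EXPECTED = [
  0x0009, 0x000A, 0x000B, 0x000C, 0x000D, 0x0020, 0x00A0,
  0x1680, 0x180E,
  0x2000, 0x2001, 0x2002, 0x2003, 0x2004, 0x2005,
  0x2006, 0x2007, 0x2008, 0x2009, 0x200A,
  0x2028, 0x2029, 0x202F, 0x205F, 0x3000
];

// Code units in 0x0000..0xFFFF matched by the given regular expression.
function scan(re) {
  var hits = [];
  for (var cp = 0; cp <= 0xFFFF; cp++) {
    if (re.test(String.fromCharCode(cp))) {
      hits.push(cp);
    }
  }
  return hits;
}

// Report which expected code units were not found, and which found
// code units were not expected.
function compare(expected, found) {
  return {
    missing: expected.filter(function (cp) { return found.indexOf(cp) === -1; }),
    extra: found.filter(function (cp) { return expected.indexOf(cp) === -1; })
  };
}

var report = compare(EXPECTED, scan(/\s/));
console.log("expected but not matched by \\s:", report.missing);
console.log("matched by \\s but not expected:", report.extra);
```

Empty "missing" and "extra" lists mean the engine agrees with the hand-assembled list; anything else needs checking against the spec text, since the list itself may be the wrong one for a given engine.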
I'm not : ) As it stands now, Firefox's \s is simply not ES3-compliant
and its deficiencies affect native `trim` (as that `trim` relies on \s).
But whether that is important for a particular page depends on whether
any incorrectly-classed characters can appear within it, and (if they
do) whether the difference really matters.
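Where the difference does matter, one way to avoid depending on the engine's \s is to spell the character class out. The following is a sketch only: the name trimExplicit and the WS class are made up for illustration, and the class is a hand-assembled assumption following the ES3 list plus the "Zs" characters of the time (so it includes U+180E and omits U+FEFF).

```javascript
// ASSUMPTION: hand-assembled whitespace class; check it against the spec
// you care about before relying on it.
var WS = "[\\t\\n\\v\\f\\r \\u00A0\\u1680\\u180E\\u2000-\\u200A" +
         "\\u2028\\u2029\\u202F\\u205F\\u3000]";

var LEADING = new RegExp("^" + WS + "+");
var TRAILING = new RegExp(WS + "+$");

// Strip leading and trailing whitespace using the explicit class above,
// so the result does not depend on what the engine's \s matches.
function trimExplicit(s) {
  return String(s).replace(LEADING, "").replace(TRAILING, "");
}

console.log("[" + trimExplicit("\u3000 \t hello world \u00A0\n") + "]");
```

The trailing pattern can be slow on long strings with many internal runs of whitespace; for short form-field values that should not be noticeable.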
Consider reading an ISO 8601 date-and-time, found in the text of a
document. ISO 8601:2000 required a 'T' in the middle; it does not allow
't', but that should generally be tolerated. ISO 8601:2004 allows a
space instead, without (AFAIR, ICBW) actually specifying \x20 or \xA0.
In practice, the text may get paragraph-packed, so a reader should
accept a newline followed by spaces and HTabs. But perhaps not two
newlines. But maybe a form feed surrounded by newlines should count as
a newline. And one should ignore page headers and footers. But the
chances of finding a Mongolian character (which might look like a space)
are, in non-Mongolian contexts, negligible.
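A tolerant reader along those lines might look like the sketch below. It is an illustration under assumptions of my own: the name findDateTime is invented, only the YYYY-MM-DD and hh:mm:ss forms are handled, field ranges are not validated, and time zones, form feeds, page headers and footers are all ignored. The separator accepts 'T' or 't', a run of spaces, no-break spaces or tabs, or a single newline with optional spaces and tabs around it; two newlines are rejected.

```javascript
// ASSUMPTION: separator rules chosen for illustration only.
var SEP = "(?:[Tt]" +                              // 'T', tolerating 't'
          "|[ \\u00A0\\t]+" +                      // spaces, NBSPs, tabs
          "|[ \\u00A0\\t]*\\r?\\n[ \\u00A0\\t]*)"; // exactly one newline

var DATE_TIME = new RegExp(
  "(\\d{4})-(\\d{2})-(\\d{2})" + SEP + "(\\d{2}):(\\d{2}):(\\d{2})"
);

// Return the first date-and-time found in text as numeric fields,
// or null if none is found. No range checking is done.
function findDateTime(text) {
  var m = DATE_TIME.exec(text);
  if (!m) {
    return null;
  }
  return {
    year: +m[1], month: +m[2], day: +m[3],
    hour: +m[4], minute: +m[5], second: +m[6]
  };
}

console.log(findDateTime("Posted 2009-02-18\n    19:22:17 +0000"));
```

Because the separator is an explicit list, a character such as U+180E between the date and the time is simply not accepted, whatever the engine's \s does.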
--
(c) John Stockton, Surrey, UK. ?@merlyn.demon.co.uk Turnpike v6.05 MIME.
Web <URL:http://www.merlyn.demon.co.uk/> - FAQish topics, acronyms, & links.
Proper <= 4-line sig. separator as above, a line exactly "-- " (SonOfRFC1036)
Do not Mail News to me. Before a reply, quote with ">" or "> " (SonOfRFC1036)