Re: regex/replace white list



RobG wrote:

Thomas 'PointedEars' Lahn wrote:
However, using the RegExp constructor removes and introduces a
maintenance problem. It removes the problem that Regular Expressions
cannot span lines because string concatenation serves the purpose. It
introduces the problem that one has to escape the expression twice: one
time to avoid escape sequences in the string literal, and again to have
RegExp special characters parsed as expression atoms instead.

Escaping characters is always an issue, especially if multi-line input
is accepted. Should new lines & line feeds be allowed?

You misunderstood. This was not about matching newline in the input.

The solution is for the OP to learn about matching characters and apply
that to their particular circumstance.

My point was that

var rx = /very_long_Regular_Expression.a.b.c.d.e.f.g.h.i.j.k.l.m.n.o.p.
r.s.t.u.v.w.x.y.z.\..#.#.4.2.1.3.3.7./

is not possible (consider the above a _hard_ line break to avoid crossing
the 80-columns border), but

var rx = new RegExp(
"very_long_Regular_Expression.a.b.c.d.e.f.g.h.i.j.k.l.m.n.o.p."
+ "r.s.t.u.v.w.x.y.z.\\..#.#.4.2.1.3.3.7.");

(and the like) is. The latter introduces the maintenance problem that the
literal "." must be escaped twice, but it removes the maintenance problem
that literals are not allowed to span lines (in the source code).
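
To illustrate the trade-off with a shorter pattern, here is a minimal sketch. The variable names (rxLiteral, rxCtor, rxWrong, samePattern) and the pattern itself are made up for this example and do not come from the thread.

```javascript
// Regular Expression literal: the dot is escaped once.
var rxLiteral = /a\.b/;

// RegExp constructor: the backslash must itself be escaped in the string
// literal, but concatenation lets the pattern span several source lines.
var rxCtor = new RegExp(
  "a"
  + "\\.b");

// Both objects end up with the same pattern text.
var samePattern = (rxLiteral.source === rxCtor.source);

// With a single backslash the string literal already yields "a.b",
// so the dot silently becomes "any character" again.
var rxWrong = new RegExp("a\.b");
```

The last case is the maintenance trap: rxWrong does not throw, it just matches more than intended (for example "axb").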

As a final note, I want to add that if special features of Regular
Expressions compared to strings are not used, it is probably more
efficient not to use Regular Expressions at all. Instead of writing

if (re.test(someString))

using the RegExp() constructor or the above RegExp object initializer,
it is probably more efficient to write

if (someString.indexOf("a") > -1)

If the need was a test for a specific character, then that would be
fine. Maybe you could use it with a loop to go through each character
in the black list, but how many characters/loops would it take before a
regular expression was faster?

I do not know. This was a general note.
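
For reference, the loop variant asked about could look like the sketch below. containsAny() is a hypothetical helper, not something posted in this thread, and no claim is made here about which approach is faster; that would have to be measured for the input sizes in question.

```javascript
// Hypothetical helper: returns true if inString contains at least one
// of the characters listed in blackList, using only indexOf().
function containsAny(inString, blackList)
{
  for (var i = 0; i < blackList.length; i++)
  {
    if (inString.indexOf(blackList.charAt(i)) > -1)
    {
      return true;
    }
  }

  return false;
}

// The equivalent test with a character class, for comparison.
var re = /[<>&]/;
var viaLoop = containsAny("a<b", "<>&");
var viaRegExp = re.test("a<b");
```

One point in favour of the loop is that the blacklist characters need no escaping at all, because they are never parsed as part of an expression.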

The following example may be better:

Maybe not :)

<script type="text/javascript">

function checkList(blID, strID)
{
var blackList = document.getElementById(blID).value;
var inString = document.getElementById(strID).value;

Passing a reference to the `form' element would have avoided the
inefficient and not backward-compatible referencing via
document.getElementById().

function checkList(f, blID, strID)
{
  var es;
  if (blID && strID
      && f && (es = f.elements)
      && es[blID] && es[strID])
  {
    var blackList = es[blID].value;
    var inString = es[strID].value;

    // ...
  }
  else
  {
    window.alert("foobar!");
  }

  // always cancel the submit; the check runs client-side only
  return false;
}

<form action="..."
onsubmit="return checkList(this, 'blackList', 'inputText');">
...
<input type="submit" value="Check input with blacklist">
</form>

var re = new RegExp('[' + blackList + ']');

What about the escaping part? You do not want the user to handle that,
do you?
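
One way to take the escaping off the user's hands is sketched below. escapeForCharClass() is a hypothetical helper introduced only for this illustration; it assumes the blacklist is meant as a plain list of literal characters and is placed between "[" and "]" as in the line above.

```javascript
// Hypothetical helper: prefixes the characters that are special inside
// a character class (\ ] [ ^ -) with a backslash so that each one is
// matched literally.
function escapeForCharClass(s)
{
  return s.replace(/[\\\]\[^-]/g, "\\$&");
}

// Without escaping, "a-z" is parsed as a range and matches "m".
var reRaw = new RegExp("[" + "a-z" + "]");

// With escaping, only the three characters a, - and z match.
var reSafe = new RegExp("[" + escapeForCharClass("a-z") + "]");

// A blacklist consisting of special characters only.
var reSpecial = new RegExp("[" + escapeForCharClass("]^-\\") + "]");
```

Note that an empty blacklist yields the expression "[]", which is valid in ECMAScript implementations but never matches, so re.test() then returns false for every input.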

document.getElementById('xx').innerHTML = re.test(inString);

Mixing standards compliant and proprietary DOM features unnecessarily.

es["xx"].style.fontStyle = "normal"; // I prefer setStyleProperty()[1]
es["xx"].value = re.test(inString);

<form ...>
...
<div>Result: <input id="xx"
value="no check done yet..."
style="border:0; font-weight:bold; font-style:italic"></div>
</form>

[...]


PointedEars
___________
[1] <URL:http://pointedears.de/scripts/dhtml.js>