Re: Strange behavior of "String[] java.lang.String.split(String regex)" - wrong split



"Alexander Seidl" wrote:

Thanks for the answers :)
As a summary, I'll quote this once more from
<http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/Pattern.html#split(java.lang.CharSequence, int)>:

"It is an error to use a backslash prior to any alphabetic character
that does not denote an escaped construct; these are reserved for future
extensions to the regular-expression language. A backslash may be used
prior to a non-alphabetic character regardless of whether that character
is part of an unescaped construct."

String str = "\(Hallo\)";
therefore leads to a compile error!

Almost. Not for that reason, but rather because of:

"It is therefore necessary to double backslashes in string literals that
represent regular expressions to protect them from interpretation by the
Java bytecode compiler."
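
The two rules are enforced at different stages, which a small sketch can make visible. The class RegexEscapeCheck below is a hypothetical helper, not something from the thread. It hands a string to Pattern.compile and reports whether the regex engine accepts it. A backslash before a letter with no assigned meaning (here \y, chosen as an example) gets past javac once it is doubled, and is then rejected by the regex engine at run time.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper, for illustration only.
final class RegexEscapeCheck {

    // Returns true if the regex engine accepts the pattern, false if
    // Pattern.compile throws PatternSyntaxException.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Regex \(Hallo\): backslash before a non-alphabetic character, allowed.
        System.out.println(isValidRegex("\\(Hallo\\)"));
        // Regex \y: backslash before a letter that denotes no escaped construct.
        System.out.println(isValidRegex("\\y"));
    }
}
```

The literal "\(Hallo\)" cannot be tested this way at all, because the source file containing it does not compile.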

The compiler accepts only a very small set of
backslash escape sequences, namely \b, \f, \n,
\r, \t, \", \', \\, plus octal numbers of up to
three digits (digits 0 ... 7, only up to \377),
and the Unicode escapes \uXXXX (with four
hexadecimal digits).

All other backslashes in string literals are not
allowed and lead to compiler errors.
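
As a rough illustration of that list, here is a sketch with a hypothetical class LiteralEscapes (the name and methods are made up for this example). It packs one of each accepted escape into a single literal and shows that each one collapses to exactly one character. The list is the one from this post; later Java versions added a few more escapes.

```java
// Hypothetical demo class, for illustration only; not part of the original post.
final class LiteralEscapes {

    // Every escape in this literal is on the list above, so javac accepts it.
    // Octal 101 and hexadecimal 0041 both denote the letter A.
    static String sample() {
        return "\b\f\n\r\t\"\'\\\101\u0041";
    }

    // Largest octal escape: 377 octal is 255 decimal.
    static int largestOctal() {
        return "\377".charAt(0);
    }

    public static void main(String[] args) {
        System.out.println(sample().length());  // prints 10: each escape is one char
        System.out.println(largestOctal());     // prints 255
        // Uncommenting the next line would stop compilation with an
        // "illegal escape character" error, because \( is not on the list:
        // String str = "\(Hallo\)";
    }
}
```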

If you want to escape parentheses, you have to write

"\\(Hallo\\)"

in the source code, which then becomes the regular expression

\(Hallo\)

once the compiler has processed the string literal.
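
Applied to split, which is what this thread started with, that could look like the following sketch. The class name ParenSplit and the sample input are assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical demo class, for illustration only.
final class ParenSplit {

    // Source text with doubled backslashes; the resulting string is \(Hallo\).
    static final String REGEX = "\\(Hallo\\)";

    // Splits the input around each literal occurrence of "(Hallo)".
    static String[] splitOnHallo(String input) {
        return input.split(REGEX);
    }

    public static void main(String[] args) {
        System.out.println(REGEX);           // prints \(Hallo\)
        System.out.println(REGEX.length());  // prints 9
        System.out.println(Arrays.toString(splitOnHallo("vor(Hallo)nach")));  // prints [vor, nach]
    }
}
```

The string holds nine characters, two of them single backslashes. The regex engine reads those as escapes, so the parentheses match literally instead of opening a group.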


Paul
--
Whoever cares about earning lasting respect; [...] should not incessantly spice
his conversations with slander, mockery, and malicious gossip, nor get used to
the hissing tone of persiflage.
Adolf Freiherr Knigge, Über den Umgang mit Menschen, 1.17