Re: mapcar to convert octals to ascii



On Sat, Jan 05 2008, David Golden wrote:

Reiner Steib wrote:
Which real-life files could trigger false positives for you?

I don't know, really. My comment was actually motivated by your and Kenichi
Handa's comments in the thread you linked! I was just taking Kenichi
Handa's word for it that there was increased risk. Actually, thinking
about it, in my experience, some (text-heavy in the opening and/or
closing stanzas) binary formats always or almost always end up
identified as some text encoding (ELF, MP3), [...]

I'd guess that null-byte detection could help for these files.
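
For illustration only, here is a rough sketch of what such a null-byte check could look like in Emacs Lisp. The function name `my-file-has-null-byte-p' and the 4096-byte default are made up for this example; this is not how Emacs itself implements detection, just the idea that a null byte near the start of a file hints at binary content.

```elisp
;; Hypothetical helper, not part of Emacs.
(defun my-file-has-null-byte-p (file &optional limit)
  "Return non-nil if the first LIMIT bytes of FILE contain a null byte.
LIMIT defaults to 4096 (an arbitrary choice for this sketch)."
  (with-temp-buffer
    ;; Read raw bytes, bypassing coding-system detection and decoding.
    (insert-file-contents-literally file nil 0 (or limit 4096))
    (goto-char (point-min))
    ;; Normalize the search result to t or nil.
    (and (search-forward "\0" nil t) t)))
```

Formats such as ELF normally contain null bytes within the first few bytes, so a check like this would probably flag them, but that is an expectation rather than something tested in this thread.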

some never or almost never (JPEG or PNG, though possibly because
they're specially treated by an emacs built with jpeg or png
support). Hadn't really looked into it beyond that, you can just set
the encoding you actually wanted in seconds after all.
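
As a small illustration of setting the encoding by hand, the following re-reads the current buffer's file with an explicitly chosen coding system. windows-1252 is only an example choice here; any coding system name known to your Emacs could be used instead.

```elisp
;; Re-read the file visited by the current buffer, decoding it as
;; windows-1252 instead of whatever was auto-detected.
;; Interactively: C-x RET r windows-1252 RET
(revert-buffer-with-coding-system 'windows-1252)
```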

If the file types are quite common, it might be useful for Emacs to DTRT
automatically.

BTW, looking at cvs logs, the addition of 1252 to the coding-priority of
the Latin-1 language environment (corresponding to main european.el cvs
rev 1.93->1.94) may simply not have made it into the emacs-unicode-2
branch european.el to date*.

No, emacs-unicode-2 (rev. 1.86.4.13 of `european.el') also has
windows-1252 in "Latin-1", but it is missing in "German" (thanks for
pointing this out):

(set-language-info-alist
 "Latin-1" '((charset iso-8859-1)
             (coding-system iso-latin-1 iso-latin-9 windows-1252)
             (coding-priority iso-latin-1)
             (nonascii-translation . iso-8859-1)
             (unibyte-display . iso-latin-1)
             (input-method . "latin-1-prefix")
             (sample-text
[...]
(set-language-info-alist
 "German" '((tutorial . "TUTORIAL.de")
            (charset iso-8859-1)
            (coding-system iso-latin-1 iso-latin-9)
            (coding-priority iso-latin-1)
            (nonascii-translation . iso-8859-1)
            (input-method . "german-postfix")
            (unibyte-display . iso-latin-1)
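
Until the "German" entry is fixed, a user could presumably patch it locally. The following is a minimal sketch, assuming it is evaluated (e.g. from an init file) after the language environments are defined and before "German" is selected; the list simply mirrors the "Latin-1" entry above. Whether this alone changes autodetection is an assumption, not something confirmed here.

```elisp
;; Assumption: mirror the `coding-system' list of "Latin-1" in "German".
(set-language-info "German" 'coding-system
                   '(iso-latin-1 iso-latin-9 windows-1252))

;; Inspect the entry to check that the change took effect.
(get-language-info "German" 'coding-system)
```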

That mightn't explain the 22.1 issue, but might well explain why
Haines and I hadn't seen emacs doing autodetection of windows-1252
(that and the fact that, er, I at least have apparently been using
the generic "UTF-8" language environment rather than any of the
local ones for ages anyway).

There's nothing wrong with using the "UTF-8" language environment.
You just don't "benefit" [1] from the defaults that e.g. "German"
sets.

[1] But maybe you don't like things like localized tutorial, etc.

Bye, Reiner.
--
,,,
(o o)
---ooO-(_)-Ooo--- | PGP key available | http://rsteib.home.pages.de/