Re: Reliable character encodings conversion
- From: James Gray <james@xxxxxxxxxxxxxxxxxxx>
- Date: Tue, 30 Sep 2008 08:58:35 -0500
On Sep 30, 2008, at 8:20 AM, Hubert Łępicki wrote:
2008/9/30 James Gray <james@xxxxxxxxxxxxxxxxxxx>:
On Sep 30, 2008, at 7:30 AM, Hubert Łępicki wrote:
I am using the Iconv library wrapper to convert text to UTF-8, but it's
throwing an "Iconv::IllegalSequence" exception.
You can add //TRANSLIT to the end of the "to" encoding to have Iconv
attempt to convert characters to reasonable equivalents in that
encoding. This is usually more helpful when your input is all one
encoding and just has some characters that won't translate well (like
a UTF-8 … going to ISO-8859-1).
Your case of mixed encodings is probably best handled with //IGNORE
instead, which asks Iconv to skip over any characters that cannot be
converted. You will lose some data with this, but it will convert what
it can.
You can also use //TRANSLIT//IGNORE to convert what can be converted
and skip the rest.
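To make the difference between the two behaviours concrete, here is a minimal sketch. It does not use Iconv: it assumes a Ruby where String#encode is available (1.9 or later) and uses its options as a rough analogue of //IGNORE and //TRANSLIT. The method names and the FALLBACKS table are made up for illustration, and Iconv's own transliteration choices may differ.

```ruby
# Sketch only: String#encode stands in for Iconv here.

# Hypothetical table of hand-picked "reasonable equivalents".
FALLBACKS = {
  "\u2026" => "...", # horizontal ellipsis
  "\u201C" => '"',   # left double quotation mark
  "\u201D" => '"'    # right double quotation mark
}

# Rough analogue of //IGNORE: drop bytes that are invalid in the source
# encoding and characters the target encoding cannot represent.
def convert_ignoring(text, to, from)
  text.encode(to, from, invalid: :replace, undef: :replace, replace: "")
end

# Rough analogue of //TRANSLIT: substitute an equivalent from the table,
# or "?" when the table has no entry for the character.
def convert_translit(text, to, from)
  text.encode(to, from, fallback: ->(ch) { FALLBACKS[ch] || "?" })
end

sample = "Wait\u2026 \u201Cok\u201D"
puts convert_ignoring(sample, "ISO-8859-1", "UTF-8").encode("UTF-8")
puts convert_translit(sample, "ISO-8859-1", "UTF-8").encode("UTF-8")
```

The first call loses the ellipsis and the quotation marks entirely; the second keeps an approximation of each. Neither approach can tell which encoding a given stretch of mixed input was really in, so some loss is expected either way.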
Thanks, //IGNORE//TRANSLIT seems to help a bit - but it's not perfect.
You listed those backwards. Is that really what you tried? Does
reversing them make any difference?
James Edward Gray II