Re: Premature end of regular expression with non-ascii chara
- From: Lars Broecker <lars.broecker@xxxxxxxxxxxxxxxxx>
- Date: Tue, 31 Jan 2006 13:35:37 +0100
Nick Snels wrote:
> Indeed, it isn't in UTF-8. It's in ISO-8859-1 (Latin1). The problem here
> is that I would like to work in UTF-8, but I have to read in files. And
> these files are often (almost always) in ISO-8859-1. And I haven't found
> a way of converting these strings to Unicode in Ruby. é and è etc. form
> part of ISO-8859-1.
I have to deal with similar problems when processing the infamous German
umlauts äöü. My solution has been to convert a string from Latin-1 (or
ISO-8859-15, for the characters the two share, which include the umlauts)
to UTF-8 via this:
  utf8_string = latin1_string.unpack("C*").pack("U*")
and the other way round with:
  latin1_string = utf8_string.unpack("U*").pack("C*")
This has worked so far and does not require any changes to the environment.
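The two one-liners above could be wrapped up roughly as follows. This is only a sketch: the helper names latin1_to_utf8 and utf8_to_latin1 are made up for illustration, and the range check is my own addition, since pack("C*") does not complain about code points above 0xFF and such characters would not survive the conversion.

```ruby
# Sketch of the unpack/pack round trip described above.
# Helper names are hypothetical, not part of any library.

# Read each Latin-1 byte as a number and write it back out as the
# Unicode code point with the same value (Latin-1 maps 1:1 onto U+0000..U+00FF).
def latin1_to_utf8(latin1_string)
  latin1_string.unpack("C*").pack("U*")
end

# Decode the UTF-8 string into code points and write each one as a single byte.
# Assumption: refusing code points above 0xFF is an added safety check.
def utf8_to_latin1(utf8_string)
  codepoints = utf8_string.unpack("U*")
  too_big = codepoints.find { |cp| cp > 0xFF }
  raise ArgumentError, "U+%04X has no Latin-1 equivalent" % too_big if too_big
  codepoints.pack("C*")
end

latin1 = [0xE4, 0xF6, 0xFC].pack("C*")  # the bytes for äöü in ISO-8859-1
utf8   = latin1_to_utf8(latin1)         # the same three characters as UTF-8
back   = utf8_to_latin1(utf8)           # and back to the original bytes
```

On Ruby 1.9 and later, where strings carry their own encoding, latin1_string.force_encoding("ISO-8859-1").encode("UTF-8") should give the same result and also handles encodings that do not map byte-for-byte onto Unicode.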
HTH,
Lars