Re: File.new and encoding
I'm doing something like:
  File.open("target","w") do |target|
    File.open("source","r") do |source|
      source.each_line do |line|
        # ... some processing ...
        target.write(line)
      end
    end
  end
Have you looked at 'iconv' in the standard library?
http://www.ruby-doc.org/stdlib/libdoc/iconv/rdoc/classes/Iconv.html
Assuming all your input files were ISO-8859-1, and you wanted your output file in UTF-8, your example might look something like (untested):
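  require 'iconv'

  File.open("target","w") do |target|
    # Iconv.open takes the target encoding first, then the source encoding
    Iconv.open('UTF-8', 'ISO-8859-1') do |converter|
      File.open("source","r") do |source|
        source.each_line do |line|
          # ... some processing ...
          target.write( converter.iconv(line) )
        end
      end
      # passing nil flushes any pending state left in the converter
      target << converter.iconv(nil)
    end
  end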
Iconv should deal with BOMs, stripping them out or adding them in where necessary. I'm not sure whether it will complain if it finds a BOM mid-stream (as it would when you open your second and subsequent input files). If so, you could just instantiate a new Iconv for each input.
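Here is a rough sketch of the one-converter-per-input idea, in case it helps. Note that it swaps Iconv for Encoding::Converter from core Ruby (1.9 and later), since Iconv was dropped from the standard library in Ruby 2.0. The method name concat_as_utf8 and its parameters are made up for illustration, and it assumes every input file is ISO-8859-1:

```ruby
# Hypothetical helper: appends several ISO-8859-1 files to one UTF-8 target,
# creating a fresh converter for each input so no state carries over.
def concat_as_utf8(source_paths, target_path)
  File.open(target_path, "wb") do |target|
    source_paths.each do |path|
      # Encoding::Converter.new takes the source encoding first, then the target
      converter = Encoding::Converter.new("ISO-8859-1", "UTF-8")
      File.open(path, "rb") do |source|
        source.each_line do |line|
          # ... some processing ...
          target.write(converter.convert(line))
        end
      end
      # flush anything the converter is still holding for this input
      target.write(converter.finish)
    end
  end
end
```

You would call it as something like concat_as_utf8(["source1", "source2"], "target"). Both files are opened in binary mode so that Ruby does no transcoding of its own and the converter sees the raw bytes.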
HTH
alex