Re: REXML memory consumption



On Nov 17, 2007 7:58 AM, Ray Chen <ray.c.chen@xxxxxxxxx> wrote:
My process memory usage has been increasing steadily, and some probing
pointed me to REXML. I created a test that consisted of feeding 10 xml
files ranging in size from 15kB to 270kB to REXML::Document.new(). The
files are fed smallest to largest. I would think that memory usage
should return back to ~8 MB since the REXML::Document should go out of
scope, and everything should get garbage-collected.

Is there something wrong with my understanding of Ruby or does REXML
hold onto memory?

You can get marginally better results by replacing
  #create the string
  f = File.open("/tmp/#{i}.xml", 'r')
  str = ''

  while line = f.gets
    str << line
  end
  f.close

with

  str = File.read("/tmp/#{i}.xml")

NB: Your version would be better written (with regard to
exception safety etc.) as:

  str = ''
  File.open("/tmp/#{i}.xml", 'r') do |f|
    while line = f.gets
      str << line
    end
  end

#construct the xml
xml = REXML::Document.new(str)
xml = nil

return nil
end

As Robert said, there are more things happening. One of them is that
Ruby allocates memory in heap blocks of increasing size.
If anything still in use lives inside a block, the block won't be
released to the system.
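
To watch this from inside the process, here is a minimal sketch. It assumes a Ruby that has ObjectSpace.count_objects (1.9 or later, so not the 1.8 line), and slot_snapshot is just a name made up for this example. It counts object slots, not bytes, so treat it as a rough indicator only: the total normally jumps while the garbage is alive, and after GC you can compare how many slots merely became free with how far the total actually dropped.

```ruby
# Hypothetical helper: how many object slots the interpreter currently
# owns (:total) and how many of them are unused (:free).
def slot_snapshot
  counts = ObjectSpace.count_objects
  { :total => counts[:TOTAL], :free => counts[:FREE] }
end

before = slot_snapshot

# Allocate a lot of short-lived objects, then drop the only reference.
garbage = Array.new(200_000) { |i| "string #{i}" }
during = slot_snapshot
garbage = nil
GC.start

after = slot_snapshot

puts "total slots: #{before[:total]} -> #{during[:total]} -> #{after[:total]}"
puts "free slots:  #{before[:free]} -> #{during[:free]} -> #{after[:free]}"
```

How much of the heap is handed back to the operating system afterwards depends on the Ruby version and platform, so the numbers are only useful for comparison within one run.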

I tried to reuse one string as a buffer for the file, but it didn't
help [see IO#read(length, buffer)]. Another thing I tried was to
pass the file itself to REXML::Document.new, but that was even worse [
File.open(...) {|f| REXML::Document.new(f) }].
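
For reference, the buffer-reuse variant looks roughly like this. It is a sketch, not the exact code I ran; each_file_contents is a made-up helper name. Passing nil as the length makes IO#read read to end of file, and the second argument is the String that receives the data (its previous contents are replaced).

```ruby
# Hypothetical helper: yield the contents of each file in +paths+,
# always through the same String object.
def each_file_contents(paths)
  buffer = String.new              # one mutable String reused for every file
  paths.each do |path|
    File.open(path, 'r') do |f|
      f.read(nil, buffer)          # nil length = read to EOF into buffer
    end
    yield buffer
  end
end
```

It would be called as each_file_contents(paths) { |str| REXML::Document.new(str) }. Note that the block gets the same String every time, so it must not keep a reference to it across iterations.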

This is on Windows XP SP2.

You can find some tools on the net for finding out what consumes the
memory, but most of them are in the
hacks category (no offense!). On Windows there is Ruby Memory
Validator, which does a similar job.
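
A hack of that kind can be put together with ObjectSpace.each_object alone. The following is a rough sketch under my own assumptions, not one of the tools mentioned above; live_counts and growth_by_class are made-up names. It counts live objects per class before and after a block and returns the classes whose count grew.

```ruby
# Hypothetical helper: number of live objects per class.
def live_counts
  counts = Hash.new(0)
  ObjectSpace.each_object(Object) { |obj| counts[obj.class] += 1 }
  counts
end

# Hypothetical helper: run the block and return [class, delta] pairs for
# every class that has more live instances afterwards, largest first.
def growth_by_class
  GC.start
  before = live_counts
  yield
  GC.start
  after = live_counts
  growth = after.map { |klass, n| [klass, n - before[klass]] }
  growth.select { |_, delta| delta > 0 }.sort_by { |_, delta| -delta }
end
```

Wrapping the REXML::Document.new call in growth_by_class { ... } and printing the first few pairs shows which classes are piling up. It only sees Ruby objects, so memory held by the allocator itself will not show up here.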
