Re: Tracking down a garbage collection problem
- From: "M. Edward (Ed) Borasky" <znmeb@xxxxxxxxxxx>
- Date: Sun, 22 Apr 2007 01:20:46 +0900
Wincent Colaiuta wrote:
I'm trying to work out ways to reduce the memory use of one of my
projects, but I don't know what methods are available to the Ruby
programmer for profiling memory use and tracking down garbage
collection problems.
The short version:
I have a project where processing a file can consume dozens of
megabytes of memory; if I process many files in a single run then
total memory usage can reach hundreds of megabytes or more than a gig.
I would expect garbage collection to kick in along the way but it
doesn't seem to be happening, memory usage grows and grows, and I
don't know where to start zeroing in on the problem.
The long version:
I've written an object-oriented templating system[1] that incorporates
a memoizing packrat parser. As each file is parsed the parser
"memoizes" the partial results for speed. In a lengthy file the size
of the memoizing cache can grow quite large (dozens of megabytes). But
I would expect the entire contents of the memoization cache to get
garbage collected when I move on to the next file; the cache itself
has definitely fallen out of scope by that time. But garbage
collection doesn't seem to be happening, as memory use grows linearly as
I batch process input files.
As this is a largish, complicated project I don't even know where to
begin to start investigating this. So really, I am looking for general
information on techniques for measuring and exploring memory use and
garbage collection in Ruby.
Thanks in advance for the advice!
Wincent
[1] http://walrus.wincent.com/
Well ... where to begin?? :)
1. First of all, get the notion that "premature optimization is the root
of all evil" out of your head. The only sense in which that maxim is
valid is when the word "premature" is strongly emphasized. Part of the
practice of software engineering, and what separates software
engineering from "mere coding", is knowing what the algorithms of choice
are for the problem you are trying to solve -- and their resource
requirements -- and using them. Knuth may have said the premature
optimization thing, but it's obviously been taken out of the context of
his *massive* output of practical computer science and software
engineering teachings. Read *everything* he wrote!
2. In general, to reduce memory usage, you must do one or both of two
things: recompute things rather than storing them in memory, or write
things explicitly out to "backing store" and read them back in.
3. Languages without explicit object destructors need to be fixed,
including Ruby. :) However, part of software engineering in the absence
of them is to make sure there are no references to objects you no longer
want, and then explicitly call the garbage collector. I do a lot of
coding in R, which is a dynamic, garbage collected language for
scientific and statistical computing. I've got a 1 GB workstation, and
still I have "normal sized problems" that can overflow memory. A simple
delete of unused objects (R has "rm", which will delete an object from
the workspace) followed by a call to the garbage collector usually gets
me going again.
4. Relational databases are your friend. They are designed and optimized
for dealing with large and complicated datasets, and object-relational
mappings like ActiveRecord and Og (Object Graph) exist in Ruby to make
working with them as simple as possible. How do you define "large"? For
a single-user system like a laptop or workstation, figure you have
something like half of the installed RAM to run your applications. At
least on Linux workstations, things like I/O buffers will take up the
other half. If you're only running this one application, anything bigger
than half of your installed RAM is too big and ought to be redesigned to
use a database.
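For what it's worth, point 3 can be sketched in Ruby roughly as follows.
This is only an illustration under assumptions: MemoEntry, build_cache,
process_file and live_count are hypothetical names made up for the
example, not anything from Walrus. ObjectSpace.each_object and GC.start
are core Ruby. Since Ruby's collector scans the stack conservatively, a
few stale instances may survive a GC.start, so look at the trend rather
than expecting an exact zero.

```ruby
# Hypothetical stand-in for one memoized partial parse result.
class MemoEntry
  def initialize(offset)
    @offset  = offset
    @payload = "x" * 1024 # pretend each entry holds about 1 KB
  end
end

# Hypothetical stand-in for the per-file memoization cache.
def build_cache(entries)
  cache = {}
  entries.times { |i| cache[i] = MemoEntry.new(i) }
  cache
end

# Count the instances of klass that the interpreter still considers live.
# ObjectSpace.each_object returns the number of objects it visited.
def live_count(klass)
  ObjectSpace.each_object(klass) {}
end

# Process one "file": the cache is a local, so nothing references it
# once the method returns.
def process_file(entries)
  cache = build_cache(entries)
  cache.size
end

before = live_count(MemoEntry)
3.times { process_file(10_000) }
GC.start # explicitly ask for a collection between batches
after = live_count(MemoEntry)

puts "MemoEntry instances: before=#{before}, after GC=#{after}"
```

If the "after" figure stays close to the total number of entries
created, something is still holding a reference to the old caches (a
constant, a class variable, a closure, or some longer-lived collection),
and that is where to start looking.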
--
M. Edward (Ed) Borasky, FBG, AB, PTA, PGS, MS, MNLP, NST, ACMC(P)
http://borasky-research.net/
If God had meant for carrots to be eaten cooked, He would have given rabbits fire.