Can Anyone Explain This Memory Leak?

Hi Folks,

Sorry to get your attention. :slight_smile:

There's a very strange problem with Mongrel where, if Threads are created
because of the Mutex around Rails dispatching, lots of ram gets
allocated that never seems to go away.

I boiled the problem down to this:

http://pastie.caboo.se/10194

It's a graph of the "leak" and the base code that causes it (nothing
Mongrel in it at all). This code kind of simulates how Mongrel is
managing threads and locking Rails.

What this code does is create threads until there are 1000 in a
ThreadGroup waiting on a Mutex. Inside the guard, 30000 integers are put
into an Array. Don't let this distract you, since it can be strings, or
even nothing, and you'll see the same thing. It's just to simulate Rails
creating all the stuff it creates, and to demonstrate that while these
objects should go away, they do not.

Then it waits in 10 second increments for these threads to go away,
calling GC.start each time.
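(For readers without the pastie: a scaled-down sketch of the loop described above might look like the following. This is a hypothetical reconstruction, not the actual pastie code; the thread and array counts are reduced from the original 1000 threads / 30000 integers, and `Mutex` is built in on modern Rubies, while 1.8 needed `require 'thread'`.)

```ruby
mutex = Mutex.new
group = ThreadGroup.new

mutex.lock  # hold the lock so the spawned threads pile up waiting on it
threads = Array.new(100) do
  t = Thread.new do
    mutex.synchronize { Array.new(300) { |i| i } }  # stand-in for Rails' per-request objects
  end
  group.add(t)
  t
end
puts "waiting in group: #{group.list.size}"
mutex.unlock

threads.each(&:join)  # the original polled in 10-second increments instead
GC.start              # nudge the collector, as the original did
puts "threads finished: #{threads.count { |t| !t.alive? }}"
```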

And what happens is the graph you see (samples of mem usage of the ruby
process, 1/second, after 3 cycles of create/destroy threads). Rather than
the memory for the threads and the array of integers going away, it
sticks around. It'll dip a little, but not much; it just tops out there
and doesn't die, even though all the threads are clearly gone and none of
their contents should be around.

In contrast, if you remove the Mutex then the ram behaves as you'd
expect, with it going up and then going away.

I'm hoping people way smarter with Ruby than myself can tell me why this
happens, what is wrong with this code, and how to fix it.

Thanks.


--
Zed A. Shaw
http://www.zedshaw.com/
http://mongrel.rubyforge.org/
http://www.lingr.com/room/3yXhqKbfPy8 -- Come get help.

hi zed-

i don't think you have a leak. try running under electric fence (ef). when i
do i clearly see the memory rise from 1->20% on my desktop, and then decline
back down to 1%, over and over with no reported leaks. the cycle matches
the logging of the script perfectly.

here's the thing though, when i don't run it under electric fence i see the
memory climb to about 20% and then stay there forever. but this too does not
indicate a leak. it just shows how calling 'free' in a process doesn't really
release memory to the os, only to the process itself. the reason you see the
memory vary nicely under ef is that it replaces the standard malloc/free with
its own voodoo - details of which i do not understand or care to. the
point, however, is that it's 'free' which is doing the 'leaking' - just at the
os level, not the process (ruby) level. we have tons of really long running
processes that exhibit the exact same behaviour - basically the memory image
will climb to maximum and stay there. oddly, however, when you tally them all
up the usage exceeds the system capacity plus swap by miles.

i think this is just illustrating reason 42 why i prefer Kernel.fork to
Thread.new - the only real way to return memory to the os is to exit! :wink:

regards.

-a


--
to foster inner awareness, introspection, and reasoning is more efficient than
meditation and prayer.
- h.h. the 14th dalai lama

Have you tried to run it with the latest 1.8.5 prerelease?
from changelog:

Thu Dec 29 23:59:37 2005 Nobuyoshi Nakada <nobu@ruby-lang.org>

        * eval.c (rb_gc_mark_threads): leave unmarked threads which won't wake
          up alone, and mark threads in the loading table. [ruby-dev:28154]

        * eval.c (rb_gc_abort_threads), gc.c (gc_sweep): kill unmarked
          threads. [ruby-dev:28172]


--
Kent
---
http://www.datanoise.com

Zed Shaw <zedshaw@zedshaw.com> writes:

I boiled the problem down to this:

http://pastie.caboo.se/10194

And this one, from the other thread:

    http://pastie.caboo.se/10317

It's a graph of the "leak" and the base code that causes it (nothing
Mongrel in it at all). This code kind of simulates how Mongrel is
managing threads and locking Rails.

Try out these graphs:

    http://pastie.caboo.se/10550
    http://pastie.caboo.se/10551

On an otherwise quiet GNU/Linux system, I ran each of your scripts for half an
hour in a ruby process into which I loaded hooks for malloc(), realloc(),
free(), and brk() [1]. The red line is the amount of memory ruby is consuming
from the heap. The green line is the amount of memory the heap implementation
is consuming from the system.

As Ara described in another message in this thread, Unix-y heap implementations
(including glibc's) typically get new process memory from the kernel by calling
brk() to extend the process's data segment. This can increase the apparent
memory usage of code because the heap implementation can only move the end of
the data segment backwards (releasing memory to the system) as far as the last
page containing in-use memory.
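That ratchet effect can be reproduced from inside Ruby itself. The following is a sketch (it assumes a Linux /proc filesystem and a reasonably modern Ruby; the allocation sizes are arbitrary): the interpreter frees its objects, but the process's resident size tends to stay near its peak because the allocator can only shrink the heap from the end.

```ruby
# Read this process's resident set size, in kB, from /proc (Linux only).
def rss_kb
  File.read("/proc/self/status")[/VmRSS:\s+(\d+)/, 1].to_i
end

before = rss_kb
junk = Array.new(200_000) { "x" * 50 }  # many small malloc'd strings
peak = rss_kb
junk = nil
GC.start
after = rss_kb

puts "before=#{before}kB peak=#{peak}kB after_gc=#{after}kB"
# Typically 'after_gc' stays much closer to 'peak' than to 'before':
# freed chunks below the top of the data segment can't be returned.
```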

In the graphs I've created we can clearly see that in both cases ruby's use of
memory from the heap drops back around the baseline with each iteration. In
the 'sync.rb' case, the heap implementation's use of memory from the system
tracks user code heap usage pretty closely. In the 'mutex.rb' case, however,
we see two anomalies: (a) the heap implementation's use of system memory stays
right near the maximum amount taken by ruby from the heap, and (b) the maximum
memory used is around twice that of 'sync.rb'.

Ara also mentioned in another message that Mutex is faster than Synchronize,
and hypothesized that this was leading to a pathological interaction with the
ruby garbage collector when creating as many threads as your example does. My
data suggest that the observed "Mutex memory leak" /is/ caused by Mutex's
relative speed, but that it is due to an interaction with the system heap
implementation rather than with ruby's garbage collector. I haven't tracked down
the exact factors at work, but I'd guess that 'mutex.rb' is (a) allocating new
memory within the time window the heap implementation holds onto unused system
memory, and (b) under some circumstances sufficiently fragments memory to force
the heap implementation to allocate more system memory than required,
eventually bringing the process under the gaze of the OOM killer.

I hope this helps.

[1] I used glibc's provided hooks for malloc/realloc/free and wrote a
    quick-and-dirty custom trampoline for brk(). Yes, I know about ltrace, but
    it only caught brk() as a system call, which slowed things down by a factor
    of 1000 (!). And yes, Solaris DTrace would have made this easy, but I made
    my graphs while waiting for the 2.5Gb Solaris Express DVD to finish
    downloading. :slight_smile: I can provide my code if anyone wants it.

-Marshall

And why Apache with pre-fork is good for long-running applications :slight_smile:

Greetings,
JS


On Fri, 2006-08-25 at 14:33 +0900, ara.t.howard@noaa.gov wrote:

i think this is just illustraing reason 42 why i prefer Kernel.fork to
Thread.new - the only real way to return memory to the os is to exit! :wink:

So, to test this hypothesis, the OP could try to instantiate a large number of objects, and see if there is no effect on the vmsize reported by the OS, right? Because those objects should be able to use the memory that is owned by the process, but not used by ruby objects.


--
vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

Nope, I can't agree with this because the ram goes up, the OS will kill
it eventually, and if I remove the guard the ram doesn't do this.

And where are you getting your information that free doesn't free
memory? I'd like to read that, since all my years of C coding say that
is dead wrong. Care to tell me how malloc/free would report 80M with
Mutex but properly show the ram go down when there is no-Mutex?

And why would Linux kill processes if the ram gets too high? Why do
whole VPS servers crash? I mean, if this ram was just "fake" reporting
(which is very hard to believe), then why are all these things happening?

So please, point me at where in the specifications for malloc/free on
Linux it says that the memory reported will stay high even though free and
malloc are called on 80M of ram slowly cycled out, and that linux will
still kill your process even though this ram is not really owned by the
process.


You gotta be kidding me. A damn bug? Oh no, according to ara.t.howard
it's because free doesn't actually free.

Man, two days wasted for nothing.

I'll try 1.8.5 tomorrow. I'm kind of tired of this to be honest.


Marshall T. Vandegrift wrote:

As Ara described in another message in this thread, Unix-y heap implementations
(including glibc's) typically get new process memory from the kernel by calling
brk() to extend the process's data segment. This can increase the apparent
memory usage of code because the heap implementation can only move the end of
the data segment backwards (releasing memory to the system) as far as the last
page containing in-use memory.

Patient: Doctor, it hurts when I do that!
Doctor: OK ... don't do that!

Patient: I have this pain right here.
Doctor: Have you ever had this before?
Patient: No.
Doctor: Well, you have it now!


[1] I used glibc's provided hooks for malloc/realloc/free and wrote a
    quick-and-dirty custom trampoline for brk(). Yes, I know about ltrace, but
    it only caught brk() as a system call, which slowed things down by a factor
    of 1000 (!). And yes, Solaris DTrace would have made this easy, but I made
    my graphs while waiting for the 2.5Gb Solaris Express DVD to finish
    downloading. :slight_smile: I can provide my code if anyone wants it.

I'm more interested in what these tests do on Solaris :). We have
Windows, Linux, Mac OS and someone I think ran it on BSD. Just out of
curiosity, will these test cases run on YARV?

amen!

fork + drb takes it every time :wink:

-a


On Fri, 25 Aug 2006, Srinivas JONNALAGADDA wrote:

On Fri, 2006-08-25 at 14:33 +0900, ara.t.howard@noaa.gov wrote:

i think this is just illustrating reason 42 why i prefer Kernel.fork to
Thread.new - the only real way to return memory to the os is to exit! :wink:

And why Apache with pre-fork is good for long-running applications :slight_smile:


This is actually what I refer to in the no-Mutex situation. Create a
ton of threads without a mutex in them and the ram goes away. The
evidence doesn't support the claims at all.

Also the fact that the OS is killing these processes and swap is getting
used indicates that this is real memory being lost.

But, a reply from Kent Sibilev says this could be a bug in 1.8.4. So
there's even more evidence that it is a leak.


Nope, I can't agree with this because the ram goes up, the OS will kill it
eventually, and if I remove the guard the ram doesn't do this.

that makes perfect sense. with the guard only one thread at a time can be
initializing the huge array, and that takes time; because of this the number
of threads grows quite large - thus the large amount of memory consumed as
they are added to the thread group. without the mutex the threads simply
race right through their work - creating the array and quickly exiting.

so, when the threads can be created and die quickly the maximum memory used by
the process simply never gets that big. with the mutex the maximum memory is
larger simply because the threads take longer to run.

run this on your system:

     harp:~ > cat a.rb
     pid = Process.pid
     a = []
     new_array = lambda{|size| Array.new(size){ 42 }.map}
     eat_memory = lambda{|n| n.times{ a << new_array[4242] }}
     free_memory = lambda{ a.clear and GC.start }

     eat_memory[4242]
     puts
     puts "after malloc"
     system "ps v #{ pid }"

     free_memory[]
     puts
     puts "after free"
     system "ps v #{ pid }"

     harp:~ > ruby a.rb

     after malloc
       PID TTY      STAT  TIME MAJFL TRS   DRS   RSS %MEM COMMAND
     31264 pts/10   R+    0:24     0 595 27164 26364 15.6 ruby a.rb

     after free
       PID TTY      STAT  TIME MAJFL TRS   DRS   RSS %MEM COMMAND
     31264 pts/10   S+    0:24     0 595 27164 26368 15.6 ruby a.rb

it shows the sort of behaviour i'm talking about

And where are you getting your information that free doesn't free
memory? I'd like to read that since all my years of C coding says that
is dead wrong.

http://groups.google.com/group/comp.unix.programmer/browse_frm/thread/44a5705312cc5df9/829197acdf6651ac?lnk=gst&q=memory+not+really+freed&rnum=4#829197acdf6651ac
http://groups.google.com/group/comp.unix.programmer/browse_frm/thread/23e7be26dd21434a/2f84b3dc080c7519?lnk=gst&q=memory+not+really+freed&rnum=9#2f84b3dc080c7519

the thing is that this has nothing to do with c and everything to do with the os

   http://www.linuxjournal.com/article/6390

"When a process needs memory, some room is created by moving the upper bound of
the heap forward, using the brk() or sbrk() system calls. Because a system call
is expensive in terms of CPU usage, a better strategy is to call brk() to grab
a large chunk of memory and then split it as needed to get smaller chunks. This
is exactly what malloc() does. It aggregates a lot of smaller malloc() requests
into fewer large brk() calls. Doing so yields a significant performance
improvement. The malloc() call itself is much less expensive than brk(),
because it is a library call, not a system call. Symmetric behavior is adopted
when memory is freed by the process. Memory blocks are not immediately returned
to the system, which would require a new brk() call with a negative argument.
Instead, the C library aggregates them until a sufficiently large, contiguous
chunk can be freed at once.

For very large requests, malloc() uses the mmap() system call to find
addressable memory space. This process helps reduce the negative effects of
memory fragmentation when large blocks of memory are freed but locked by
smaller, more recently allocated blocks lying between them and the end of the
allocated space. In this case, in fact, had the block been allocated with
brk(), it would have remained unusable by the system even if the process freed
it."

Care to tell me how malloc/free would report 80M with Mutex but properly show
the ram go down when there is no-Mutex?

it's a sort of race
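(that race can be made concrete with a timing sketch - hypothetical numbers, with a sleep standing in for the work each thread does inside the guard. under the mutex the threads serialize, so many more of them are alive - and holding memory - at any one instant.)

```ruby
require 'benchmark'

# Spawn n threads; each does a fixed amount of "work" (a sleep here),
# optionally serialized through a mutex, then we join them all.
def run_threads(n, mutex = nil)
  work = lambda { sleep 0.01 }
  n.times.map do
    Thread.new { mutex ? mutex.synchronize(&work) : work.call }
  end.each(&:join)
end

locked   = Benchmark.realtime { run_threads(20, Mutex.new) }
unlocked = Benchmark.realtime { run_threads(20) }

puts format("locked=%.3fs unlocked=%.3fs", locked, unlocked)
# locked runs for roughly 20 * 0.01s while unlocked takes about one
# sleep's worth, so far more threads coexist in the locked case.
```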

And why Linux would kill processes if the ram get too high? Why whole
VPS servers crash? I mean if this ram was just "fake" reporting (which
is very hard to believe) then why are all these things happening?

all memory is fake. that's why you can do this

   harp:~ > free -b
                total        used        free    shared   buffers      cached
   Mem:   1051602944  1032327168    19275776         0 145215488   563167232
                                    ^^^^^^^^

   harp:~ > ruby -e' way_too_big = "42" * 19275776; p way_too_big.size '
   38551552

:wink:

regarding the crashes - there are still limits which something is apparently
exceeding.

So please, point me at where in the specifications for malloc/free on
Linux it says that the memory reported will stay high even though free and
malloc are called on 80M of ram slowly cycled out, and that linux will
still kill your process even though this ram is not really owned by the
process.

http://groups.google.com/group/comp.unix.programmer/search?group=comp.unix.programmer&q=memory+not+really+freed&qt_g=1&searchnow=Search+this+group

in any case, try running under electric fence, which will use a cleaner
malloc/free and show the 'real' behaviour. it really does seem fine.

regards.

-a


a nice explanation:

"A couple of things: Some `malloc' implementations use mmap() to allocate
large blocks (sometimes the threshold is a page or two, sometimes more), so
this might be part of what you're seeing. Some programs have allocation
patterns that interact badly with the way certain allocators work. Often, for
example, when some number of objects of a certain size have been allocated, a
future allocation cuts up a page into chunks of that size, gives you one, and
throws the rest onto a free list. If the allocation pattern of a program is to
allocate many chunks of a certain size, free them, and then allocate many
chunks of a somewhat larger size, the allocator can't satisfy the latter
requests (as bunches of somewhat-too-small chunks are on the free lists)
without grabbing more address space from the OS (via sbrk()). This is not
necessarily a bad thing, even though it makes it look like the overall size of
the program is expanding; though the address space may have grown, the pages
containing the `somewhat-too-small' chunks that have been freed are eventually
swapped out; unless they're touched again their only downside is consumption
of swap space. It's really only a problem for programs with _very_ large
footprints; even in cases like that, at some point most malloc implementations
will `unslice' space from previously sliced pages."

http://groups.google.com/group/comp.unix.programmer/browse_frm/thread/23e7be26dd21434a/2f84b3dc080c7519?lnk=gst&q=memory+not+really+freed&rnum=9#2f84b3dc080c7519

the paper referenced is also good.

ftp://ftp.cs.utexas.edu/pub/garbage/allocsrv.ps

regards.

-a



I've had a lot of experience with the "bizarre" behavior of the Linux
memory manager. First of all, which kernel do you have? Second, the
Linux out-of-memory killer can be turned off (also kernel-dependent).
Finally, Linux has this philosophy that "free memory is wasted memory",
and "I am Linux, you are the user, I know what's good for me, if that
works for you, great!" The combination of these means that things that
make sense to you or to a performance engineer might not actually be
happening. :slight_smile:

Oh, yeah, how much physical RAM do you have? If you have a recent
kernel, which of the dozens of memory configuration options are you
using? Have you considered switching to another OS? :slight_smile:


This isn't considered bizarre behavior; in fact, it's a feature. Even after you shut down an application, the kernel keeps shared libs and such in memory (cache). That way the system as a whole is much quicker when you access applications that call these shared libs. No point in having the overhead of re-loading things into memory over and over again. If you get to the point where you use additional applications and there isn't free ram, it flushes some of the cached objects out so that you don't have to swap.


> And where are you getting your information that free doesn't free
> memory? I'd like to read that since all my years of C coding says that
> is dead wrong. Care to tell me how malloc/free would report 80M with
> Mutex but properly show the ram go down when there is no-Mutex?

a nice explanation:

<snip>

http://groups.google.com/group/comp.unix.programmer/browse_frm/thread/23e7be26dd21434a/2f84b3dc080c7519?lnk=gst&q=memory+not+really+freed&rnum=9#2f84b3dc080c7519

the paper referenced is also good.

ftp://ftp.cs.utexas.edu/pub/garbage/allocsrv.ps

So, a posting to a news group from some guy in 2001 (5 years old) and a
paper written in 1995 (11 years old) that has references to papers as
old as 1964, none of which say that measurements of RAM will behave as
I've demonstrated *today*. Then more usenet postings (still no C code
that demonstrates this magic), and finally a recent article from
linuxjournal describing Linux memory management, but nothing that really
says memory will stay around at 80M levels even 20-30 seconds after all
the ram has supposedly been freed.

Riiight. Sounds like I'll just go back to my now-working program with
its fancy "Sync" instead of "Mutex" technology, since obviously there's
no memory leak (even though we consistently demonstrate this fixes it in
several situations).

http://pastie.caboo.se/10317

http://pastie.caboo.se/10194
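[editor's note: in case the pasties rot, here is a shrunken sketch of the reproduction described at the top of the thread - not the actual pastie code. the original used 1000 threads each pushing 30000 integers into an array inside the guard; the numbers here are reduced so the sketch finishes quickly.]

```ruby
# many threads serialize on a Mutex, each allocating a pile of objects
# inside the guard; afterwards everything they made is unreachable,
# so a GC "should" let the memory go
guard   = Mutex.new
group   = ThreadGroup.new
threads = []

100.times do
  t = Thread.new do
    guard.synchronize do
      Array.new(3_000) { |i| i }  # stand-in for what rails allocates
    end
  end
  group.add(t)
  threads << t
end

threads.each(&:join)  # all threads done; their garbage is unreachable
GC.start              # the thread's claim: with the Mutex the rss stays
                      # high anyway; swap in Sync and it reportedly drops
puts "alive threads: #{threads.count(&:alive?)}"
```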

···

On Fri, 2006-08-25 at 16:10 +0900, ara.t.howard@noaa.gov wrote:

On Fri, 25 Aug 2006, Zed Shaw wrote:


convincing zed of this might be hard :wink:

-a

···

On Sat, 26 Aug 2006, Cliff Cyphers wrote:

This isn't considered bizarre behavior and in fact is a feature. Even after
you shut down an application it keeps shared libs and such in memory (cache).
That way the system as a whole is much quicker when you access applications
that call these shared libs. No point in having the overhead of re-loading
things into memory over and over again. If you get to the point where you
use additional applications and there isn't free ram it flushes some of the
cached objects out so that you don't have to swap.

--
to foster inner awareness, introspection, and reasoning is more efficient than
meditation and prayer.
- h.h. the 14th dalai lama

Cliff Cyphers wrote:

> This isn't considered bizarre behavior and in fact is a feature. Even
> after you shut down an application it keeps shared libs and such in
> memory (cache). That way the system as a whole is much quicker when you
> access applications that call these shared libs. No point in having the
> overhead of re-loading things into memory over and over again. If you
> get to the point where you use additional applications and there isn't
> free ram it flushes some of the cached objects out so that you don't
> have to swap.

As a performance engineer who works with managers and capacity planners
of large systems, I consider a memory manager that can't be tuned to a
workload bizarre. Performance comes in many dimensions, and we ask our
Linux systems to serve numerous roles.

Sometimes we want them to process large batch jobs for rapid turnaround,
sometimes we want them to provide rapid interactive response to
thousands of people logged on to an application, sometimes we want them
to manage a huge DBMS, sometimes we want them to serve up web pages,
sometimes we want them to participate in a high-performance cluster, and
sometimes we want them to be scientific workstations. The idea that a
single memory management algorithm/code without tuning parameters can
serve all of those needs is ludicrous.

I don't know whether the 2.6 kernel is genuinely better than 2.4, whether
I've gotten smarter about the Linux memory manager, or both. :slight_smile: What I *do* know is
that just about all forms of the 2.4 Linux kernel, even the carefully
tuned ones in RHEL 3, behave in unpredictable and less than useful ways
when people do even moderately stupid things. I think a server ought to
be able to deal with a memory leak in a web server application in a more
productive way than the out of memory killer!

brk and sbrk are still used, for example, phkmalloc:

http://www.freebsd.org/cgi/cvsweb.cgi/src/lib/libc/stdlib/malloc.c?rev=1.90.2.1&content-type=text/x-cvsweb-markup&only_with_tag=RELENG_6

phkmalloc returns memory to the OS only when an entire page is clean. In Ruby a page may never become clean, because live ruby heap slots may still be sitting on it. See add_heap in gc.c:

http://www.ruby-lang.org/cgi-bin/cvsweb.cgi/ruby/gc.c?rev=1.168.2.45;content-type=text%2Fx-cvsweb-markup;only_with_tag=ruby_1_8

A good way to test this theory would be either to increase the number of items per ruby heap slot by editing gc.c and recompiling, or to use many more threads. (Maybe I will do that.)
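[editor's note: GC.stat did not exist in the 1.8 rubies under discussion, but later MRI exposes the page-level picture this theory is about. key names below are from MRI 2.2+ and vary by version - this is a sketch for checking the idea, not the 1.8 code linked above.]

```ruby
# a heap page can only be handed back when *no* live object remains on
# it; any single survivor pins the whole page (the "not clean" case).
GC.start
stat = GC.stat
puts "allocated pages: #{stat[:heap_allocated_pages]}"
puts "eden pages (have live objects): #{stat[:heap_eden_pages] || 'n/a'}"
puts "tomb pages (empty, reclaimable): #{stat[:heap_tomb_pages] || 'n/a'}"
```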

The real difference may be that Sync and Mutex have different memory usage profiles.

···

On Aug 26, 2006, at 5:14 AM, Zed Shaw wrote:

> On Fri, 2006-08-25 at 16:10 +0900, ara.t.howard@noaa.gov wrote:
>
> On Fri, 25 Aug 2006, Zed Shaw wrote:
>
> > And where are you getting your information that free doesn't free
> > memory? I'd like to read that since all my years of C coding says that
> > is dead wrong. Care to tell me how malloc/free would report 80M with
> > Mutex but properly show the ram go down when there is no-Mutex?
>
> a nice explanation:
>
> <snip>
>
> http://groups.google.com/group/comp.unix.programmer/browse_frm/thread/23e7be26dd21434a/2f84b3dc080c7519?lnk=gst&q=memory+not+really+freed&rnum=9#2f84b3dc080c7519
>
> the paper referenced is also good.
>
> ftp://ftp.cs.utexas.edu/pub/garbage/allocsrv.ps
>
> So, a posting to a news group from some guy in 2001 (5 years old) and a
> paper written in 1995 (11 years old) that has references to papers as
> old as 1964, none of which say that measurements of RAM will behave as
> I've demonstrated *today*.

--
Eric Hodel - drbrain@segment7.net - http://blog.segment7.net
This implementation is HODEL-HASH-9600 compliant

http://trackmap.robotcoop.com

Run your code under a process trace program such as ktrace (Mac OS X).
You should be able to easily grep the resulting dump for system calls
that *actually* return memory to the OS. Just because free is called
doesn't mean that the memory allocator has actually notified the kernel
that it no longer needs that huge hunk of memory.

I'm guessing you'll look for calls to munmap or something similar.
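[editor's note: on linux the usual tool is strace rather than ktrace. a tiny hypothetical helper for the suggestion above - the method name and flag choices are this sketch's, not from the thread - showing the command line you'd run and then grep.]

```ruby
# wrap a ruby script in a syscall trace so you can grep for the calls
# that actually return memory to the kernel: strace on linux, ktrace
# on os x / bsd (inspect the ktrace output with kdump afterwards)
def trace_memory_syscalls(script, tool = :strace)
  case tool
  when :strace  # linux: log only the memory-related syscalls
    %W[strace -f -e trace=brk,mmap,munmap -o trace.log ruby #{script}]
  when :ktrace  # os x / bsd: trace to a file
    %W[ktrace -f trace.out ruby #{script}]
  end
end

puts trace_memory_syscalls("leak_test.rb").join(" ")
# then: grep munmap trace.log -- a munmap (or a shrinking brk) is the
# moment memory really goes back to the os
```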

Gary Wright

···

On Aug 26, 2006, at 8:14 AM, Zed Shaw wrote:

> So, a posting to a news group from some guy in 2001 (5 years old) and a
> paper written in 1995 (11 years old) that has references to papers as
> old as 1964, none of which say that measurements of RAM will behave as
> I've demonstrated *today*. Then more usenet postings (still no C code
> that demonstrates this magic), and finally a recent article from
> linuxjournal describing Linux memory management, but nothing that really
> says memory will stay around at 80M levels even 20-30 seconds after all
> the ram has supposedly been freed.