Notes from So, You've Got A Memory Leak

These are my notes from Danny Coates's talk on July 3, 2012 at NodeConf. You can follow Danny on Twitter at @antiserf.

Memory leaks can be almost anywhere in node: native code, your JavaScript code, modules... I'm going to talk about finding memory leaks in V8. V8 has some really great tools to help you find these memory leaks. There is the V8 profiler API, which gives you CPU profiling and heap snapshots. I wrote a little wrapper around the V8 profiler and put it on npm a long time ago just because nobody had done it yet. Several people have now contributed to it without my knowledge.

A heap snapshot is basically a graph of your entire memory heap. Everything looks the same, because they're just objects. With one snapshot, you can't really find anything.

A node has a size, children, retained size, retainers and a type. Retainers are the things that hold pointers to you; if I stick an object in an array, that array is a retainer of the object. The types of nodes are generally:

  • kHidden
  • kArray
  • kString
  • kObject
  • kCode
  • kClosure
  • kRegExp
  • kHeapNumber (numbers that don't fit in a small integer, so they are stored on the heap)
  • kNative (native C++ references)
  • kSynthetic (used to group snapshot items)

All these nodes link up using edges. Each edge has a to, from, name, and type. Variable names live on the edge. Types of edges include:

  • kContextVariable
  • kElement
  • kProperty
  • kInternal
  • kHidden
  • kShortcut
  • kWeak
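
To make the node and edge vocabulary concrete, here is a minimal sketch of a toy heap graph. The layout (a `nodes` map and an `edges` array), the object names, and the sizes are all made up for illustration; this is not the real V8 snapshot format. It shows how retainers fall out of the edges, and one simple way to estimate retained size: count the bytes that stop being reachable from the root when a node is taken away.

```javascript
// Hypothetical, simplified heap graph. Names and sizes are invented.
const nodes = {
  root:  { type: 'kSynthetic', size: 0 },
  cache: { type: 'kArray',     size: 32 },
  user:  { type: 'kObject',    size: 24 },
  name:  { type: 'kString',    size: 16 },
  temp:  { type: 'kObject',    size: 8 },
};

// Each edge has a from, to, name, and type. Variable names live on the edge.
const edges = [
  { from: 'root',  to: 'cache', name: 'cache', type: 'kContextVariable' },
  { from: 'root',  to: 'temp',  name: 'temp',  type: 'kContextVariable' },
  { from: 'cache', to: 'user',  name: '0',     type: 'kElement' },
  { from: 'user',  to: 'name',  name: 'name',  type: 'kProperty' },
  { from: 'temp',  to: 'name',  name: 'label', type: 'kProperty' },
];

// Retainers of a node: every node that has an edge pointing at it.
function retainersOf(id) {
  return edges.filter((e) => e.to === id).map((e) => e.from);
}

// Ids reachable from the root, optionally pretending one node is gone.
function reachable(excluded) {
  const seen = new Set();
  const stack = ['root'];
  while (stack.length > 0) {
    const id = stack.pop();
    if (id === excluded || seen.has(id)) continue;
    seen.add(id);
    for (const e of edges) {
      if (e.from === id) stack.push(e.to);
    }
  }
  return seen;
}

// Retained size: total size of the nodes that become unreachable
// from the root once the given node is removed (including itself).
function retainedSize(id) {
  const before = reachable(null);
  const after = reachable(id);
  let total = 0;
  for (const n of before) {
    if (!after.has(n)) total += nodes[n].size;
  }
  return total;
}

console.log(retainersOf('name'));   // [ 'user', 'temp' ]
console.log(retainedSize('cache')); // 56 (cache + user; name is still held by temp)
console.log(retainedSize('user'));  // 24 (only user itself)
```

Recomputing reachability for every node is far too slow for a real snapshot with thousands of nodes; it is only meant to show what the number means.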

If you run v8-profiler using profiler.takeSnapshot().nodesCount you'll get something like 18,000 nodes. So how can you get data without doing a bare, traverse-the-graph-yourself approach? Don't use node-inspector. It's old and won't be updated. However, fortunately for everybody else, webkit-devtools-agent is available at https://github.com/c4milo/node-webkit-agent. If you've used the profiler in Chrome, it's the same thing. You can take snapshots and diff them; you can look in an object and see the retainers graph; you can see the delta between two snapshots. It's pretty useful, but it's still kind of hard. If I'm looking for a leak, it's still hard to tell where my leak is coming from.

You can't take just one snapshot. One is never enough; even two won't help you pinpoint a leak, because you need more data. You need to take a bunch of snapshots to see when something was created and how long it lives, and get an idea where the leak is. A leak tends to be "standing still" in the snapshot. Take as many snapshots as you can and keep track of object age. How do we do this in node?
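
As a rough illustration of tracking object age, here is a small sketch that does not use any real profiler API. It assumes each snapshot has already been reduced to a plain array of object ids that were alive at that moment; that input format and the helper names `trackAges` and `suspects` are hypothetical. The idea is to record when each id first appeared and report how many snapshots the still-living ones have survived.

```javascript
// Assumption: a snapshot is just an array of ids of live objects.
// Real heap snapshots carry much more; this is a simplification.
function trackAges(snapshots) {
  const firstSeen = new Map();
  const lastSeen = new Map();
  snapshots.forEach((ids, index) => {
    for (const id of ids) {
      if (!firstSeen.has(id)) firstSeen.set(id, index);
      lastSeen.set(id, index);
    }
  });

  // Keep only objects still alive in the latest snapshot,
  // with age measured in number of snapshots survived.
  const latest = snapshots.length - 1;
  const ages = [];
  for (const [id, first] of firstSeen) {
    if (lastSeen.get(id) === latest) {
      ages.push({ id, age: latest - first });
    }
  }
  // Oldest first: long-lived objects are the ones "standing still".
  return ages.sort((a, b) => b.age - a.age);
}

// Ids of live objects that have survived at least minAge snapshots.
function suspects(snapshots, minAge) {
  return trackAges(snapshots)
    .filter((o) => o.age >= minAge)
    .map((o) => o.id);
}

const snapshots = [
  [1, 2, 3],
  [1, 3, 4, 5],
  [1, 3, 5, 6, 7],
  [1, 3, 5, 8, 9],
];

console.log(suspects(snapshots, 2)); // [ 1, 3, 5 ]
```

An old object is not automatically a leak, since long-lived application state looks the same. The list of suspects is a starting point for looking at retainers, not a verdict.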

I'm not a rockstar like some other node people. My apologies for calling people rockstars.
