How fast can Erlang create processes?

(This page is a mirrored copy of an article originally posted on the LShift blog.)

Very fast indeed.

1> spawntest:serial_spawn(1). 

That’s telling me that Erlang can create and tear down processes at a rate of roughly 350,000 Hz. The numbers change slightly - things slow down - if I’m running the test in parallel:

2> spawntest:serial_spawn(10).
3> spawntest:serial_spawn(10).

4> spawntest:serial_spawn(100).
5> spawntest:serial_spawn(100).

[Update: I forgot to mention earlier that the system seems to spend 50% CPU in user and 50% in system time. Very odd! I wonder what the Erlang runtime is doing to spend so much system time?]

Here’s the code for what I’m doing:


-module(spawntest).
-export([serial_spawn/1]).

serial_spawn(M) ->
    N = 1000000,
    NpM = N div M,
    Start = erlang:now(),
    dotimes(M, fun () -> serial_spawn(self(), NpM) end),
    dotimes(M, fun () -> receive X -> X end end),
    Stop = erlang:now(),
    (NpM * M) / time_diff(Start, Stop).

serial_spawn(Who, 0) -> Who ! done;
serial_spawn(Who, Count) ->
    %% Each process spawns its successor and then exits.
    spawn(fun () ->
          serial_spawn(Who, Count - 1)
      end).

dotimes(0, _) -> done;
dotimes(N, F) ->
    F(),
    dotimes(N - 1, F).

time_diff({A1,A2,A3}, {B1,B2,B3}) ->
    (B1 - A1) * 1000000 + (B2 - A2) + (B3 - A3) / 1000000.0 .

This is all on an Intel Pentium 4 running at 2.8 GHz, with 1 MB cache, on Debian Linux, with erlang_11.b.0-3_all.deb.
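For anyone repeating the measurement on a current Erlang release: erlang:now/0 has since been deprecated, so here is a minimal sketch of the same test timed with erlang:monotonic_time/1 instead. It assumes OTP 20 or later (for the `microsecond` unit atom). The module name `spawntest_mono`, the helper `chain/2`, and the extra `serial_spawn/2` arity that takes the total process count are illustrative names of mine, not part of the original code.

```erlang
-module(spawntest_mono).
-export([serial_spawn/1, serial_spawn/2]).

%% Same default as the original test: one million processes in total.
serial_spawn(M) ->
    serial_spawn(M, 1000000).

%% M = number of chains run in parallel, N = total number of processes.
serial_spawn(M, N) when is_integer(M), M > 0, is_integer(N), N >= 0 ->
    NpM = N div M,                          % processes per chain
    Parent = self(),
    Start = erlang:monotonic_time(microsecond),
    %% Start M chains; within a chain each process spawns the next one.
    _ = [chain(Parent, NpM) || _ <- lists:seq(1, M)],
    %% Wait for one 'done' message per chain.
    _ = [receive done -> ok end || _ <- lists:seq(1, M)],
    Stop = erlang:monotonic_time(microsecond),
    %% Guard against a zero-length interval on very small runs.
    Elapsed = max(Stop - Start, 1),
    %% Processes created (and torn down) per second.
    (NpM * M) / (Elapsed / 1000000).

chain(Who, 0) -> Who ! done;
chain(Who, Count) ->
    spawn(fun () -> chain(Who, Count - 1) end).
```

Calling `spawntest_mono:serial_spawn(1).` should correspond to the first measurement above, and `spawntest_mono:serial_spawn(100).` to the last; the figures will of course depend on the machine and the runtime version.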


On 11 September, 2006 at 8:49 am, matthew wrote:

My guess is that the kernel time is syscalls to malloc and free for the allocation and deallocation of stack and heap for each process.

On 11 September, 2006 at 11:16 am, Paul Crowley wrote:

malloc and free aren’t syscalls - malloc calls sbrk or similar if it needs more memory, but otherwise it manages the heap in userspace. It should have a certain amount of hysteresis before it tries to give any memory back to kernel space, so all the malloc and free stuff should be happening entirely in userspace.

Of course being a garbage-collected language it won’t be calling malloc and free - it’ll be directly calling sbrk or similar (mmap in fact AIUI) and managing the memory returned itself.

On 13 September, 2006 at 1:22 am, SM Smithfield wrote:

Just ran your test on a Mac (10.4 G5 2.1GHz 512kB cache):
7> c(spawntest).
8> spawntest:serial_spawn(1).
9> spawntest:serial
10> spawntest:serial_spawn(10).
11> spawntest:serial
12> spawntest:serial_spawn(100).

I think these fellows might be onto something.

On 13 September, 2006 at 7:00 am, ratatask wrote:

strace the Erlang process if you want to see what it’s doing with the system time.
I wouldn’t be surprised if it was time reading, or a select/poll loop :-|

On 13 September, 2006 at 11:42 am, tonyg wrote:

@ratatask: yes, that’d probably give some idea. I don’t just yet fancy the job of trawling through the megabytes of strace output looking for patterns, though :-)

On 13 September, 2006 at 1:36 pm, Samuel Tardieu wrote:

It does make a lot of poll() calls, but also many, many more times() and gettimeofday() calls (700 times more than poll()).

On 2 June, 2007 at 1:26 pm, def ZA wrote:

“I don’t just yet fancy the job of trawling through the megabytes of strace output looking for patterns”

You could maybe use strace’s summary flag, “strace -c”, to get a breakdown.