Understand CPU profiling graph

The CPU profiler provides very interesting graphs. However, they contain more information than it seems at first sight.

How does it works ? About sampling

The golang CPU profiling produce metrics by sampling.
It means that your software is randomly stopped every X ms (it’s the frequency), and the profiler save in which function you are.
And you can output as a graph the information collected :

By doing this, the profiler is able to produce a statistically accurate representation of how many percent/time of CPU computing you pass in this function.

Make it really accurate

As a statistic measure, the computation time is the limit of the data accuracy.
It’s why the more you send job to do to your software, the more the result is accurate.

Your output is empty ?
The output files are empty or the generate graph is blank :
Your software didn’t compute enough, and there is not enough samples.
Don’t forget that CPU profiling doesn’t sample when your software is just waiting (ie waiting network).
-> Send more data to process !

What information gives a node


encoding/json is the package.
(*encodeState).reflectValue is the function represented by the node.

The prefix (*encodeState) let us know :
1. reflectValue is a method of encodeState
2. the caller is a pointer (*) on reflectValue
3. the function is called from an interface : encodeState is between parenthesis (as Golang syntax for cast from an interface to underlying type)

We got 2 times :
“00.3s” is the time passed in the function itself, without called functions.
“5.33s” is the total time passed in the function (function itself + called functions)

Each time is followed by a percent ; representing the time passed in the function on the global execution time.

The sum is not 100% !

The graph is not complete : only the top consuming nodes/edges are represented.
On the top left corner you can see more informations about ignored data :


By the way, did you notice that some edges are dashed lines, but others are solide lines ?


The dashed lines represent calls with intermediate calls dropped : the time passed in theses hidden intermediate functions is negligible.

Five interesting runtime nodes

1- Garbage Collector

runtime.gc[..] prefixed functions are Garbage Collector functions (sourcecode).

2- Scheduler

runtime.schedule is the golang scheduler, in charge to manage the goroutines. Never heard about that ? Take a look !

3- Malloc

runtime.mallocgc is the memory allocator.

4- SigProf

runtime.sigprof is the profiler itself.

5- Goexit

The root node of the graph is named “goexit”.

128	// goexit is the return stub at the top of every goroutine call stack.
129	// Each goroutine stack is constructed as if goexit called the
130	// goroutine's entry point function, so that when the entry point
131	// function returns, it will return to goexit, which will call goexit1
132	// to perform the actual exit.

From source code documentation.

The CPU profiler is a useful tool to find the consuming parts of your code.
Have fun with it !

Entrepreneur – Cofounder at Golem.ai (Paris, France)

I enjoy sharing Golang interesting patterns, experiments and tips.

Leave a Reply

Your email address will not be published. Required fields are marked *