What’s a Tracer, and How Is It Different from a Profiler?

Tian Gao
Python in Plain English
4 min read · Jun 4, 2021


trace capture from VizTracer

Programmers think of profilers when their program is slow, and debuggers when their program does unexpected things. However, there’s a tool that combines aspects of both: a tracer.

A tracer taps on certain events and records them on a timeline. With proper visualization, you can see what your program is doing at specific timestamps, just like watching a movie.
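To make that concrete, here is a minimal sketch that taps Python’s built-in profiling hook to record function entries and exits with timestamps. This is only an illustration of the idea, not how VizTracer works internally (VizTracer is implemented in C for much lower overhead):

import sys
import time

events = []

def hook(frame, event, arg):
    # Record every Python function entry and exit with a timestamp,
    # which is conceptually what a function tracer does.
    if event in ("call", "return"):
        events.append((time.perf_counter(), event, frame.f_code.co_name))

def work():
    time.sleep(0.01)

sys.setprofile(hook)   # install the hook
work()
sys.setprofile(None)   # remove the hook

for ts, event, name in events:
    print(f"{ts:.6f} {event:>6} {name}")

A real tracer records the same kind of events much more efficiently and renders them on an interactive timeline instead of printing them.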

Different tracers tap on different events. For example, ftrace in the Linux kernel is most commonly used as a function tracer, while strace traces system calls.

Today I want to talk about a Python tracer - VizTracer.

Like ftrace, VizTracer mostly taps on function entries and exits, but it can also trace other kinds of events.

Let’s start with something simple. g() sleeps for a while and then calls f(), which sleeps for a while too.

import time

def g():
    time.sleep(0.005)
    f()

def f():
    time.sleep(0.01)

g()

I saved this file as scrabble.py and ran viztracer:

viztracer scrabble.py
# a result.json file is generated by viztracer
vizviewer result.json

This is the result captured from the commands above. Everything is visualized on a timeline, and the whole program takes about 15 ms, which is expected.
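By the way, the command line isn’t the only way to run it. If you only want to trace part of a program, VizTracer can also be used inline as a context manager; here is a minimal sketch (the output file name is arbitrary):

import time
from viztracer import VizTracer

def work():
    time.sleep(0.01)

# Only the code inside the with-block is traced; the result is
# saved to result.json when the block exits.
with VizTracer(output_file="result.json"):
    work()

Either way, you get a JSON file that vizviewer can open.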

With VizTracer, you can visualize what’s going on in your program at the function level.

For example, I was running a gradient descent algorithm and traced it.

Every executed function is recorded and visualized on the timeline. If you are familiar with the algorithm, you can probably tell from the captured result that it runs the “gradient descent” function repeatedly.

If you want details, you can zoom in on a single function.

This is how one iteration ran.

VizTracer is often considered a profiler, but it actually provides much more information than a profiler does. A profiler normally generates a summary of your program’s execution, reporting the time spent in each function, which is very helpful for optimization in many cases. But it does not provide a timeline: how long each individual call took, or the order in which the calls happened.
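To make the contrast concrete, here is roughly what that summary view looks like, using cProfile on the same kind of workload as the scrabble.py example (a minimal sketch):

import cProfile
import time

def f():
    time.sleep(0.01)

def g():
    time.sleep(0.005)
    f()

# Prints call counts and cumulative time per function,
# but nothing about when each call happened or in what order.
cProfile.run("g()", sort="cumulative")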

Actually, VizTracer provides a profiler visualization as well, using the very helpful flamegraph:

vizviewer --flamegraph result.json

This is the flamegraph of the same gradient descent run. It makes it clearer which function (or call stack) takes more time, but it also lacks a lot of the information a tracer can provide.

A unique feature of VizTracer is its support for concurrency, including multiprocessing, subprocess, threading, and async. It’s significantly harder to debug or profile your program when concurrency is in the picture: you need to think about synchronization, race conditions, communication, and many other things that you don’t have to worry about otherwise.

VizTracer is very helpful in these cases because it can show different threads, processes, or async tasks on the same timeline.

Let’s use an example slightly modified from the official Python documentation:

from multiprocessing import Pool
import os

def f(x):
    return x ** x

if __name__ == "__main__":
    process_num = 5
    with Pool(processes=process_num) as pool:
        print(pool.map(f, range(10)))

        for i in pool.imap_unordered(f, range(10)):
            print(i)

        res = pool.apply_async(f, (20,))  # runs in *only* one process
        print(res.get(timeout=1))  # prints 20 ** 20

        res = pool.apply_async(os.getpid, ())  # runs in *only* one process
        print(res.get(timeout=1))  # prints the PID of that process

        multiple_results = [pool.apply_async(os.getpid, ()) for i in range(process_num)]
        print([res.get(timeout=1) for res in multiple_results])

You also need to tell VizTracer that it’s tracing a multiprocessing program:

viztracer --log_multiprocess multi_process_pool.py

On the same timeline, you can see how the main process spawns the pool workers, and how each pool worker spends most of its time on communication rather than actual work. That’s easy to notice when you can visualize your workers, instead of asking “why is my multiprocessing program slower?” on Stack Overflow.

In fact, you can even detect deadlocks or unexpected blocking using this feature, because everything is on the SAME timeline.

Of course, extra information comes with a price.

First of all, because a tracer records more information, it has a larger overhead than a sampling profiler, which only samples the call stack at a certain frequency. VizTracer’s overhead is similar to cProfile’s, about 2x in the worst case.

And because it stores detailed information rather than a summary, the result takes more RAM and disk space, and it’s also challenging to visualize when it gets large.

However, thanks to Perfetto, visualization is very smooth even with millions of entries. VizTracer can generally handle at least 4 to 5 million entries with about 1GB RAM. VizTracer also uses a circular buffer so you don’t need to worry about overflow. It will throw out the earlier records and keep the latest ones.
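The buffer size is also configurable. Below is a minimal sketch assuming the tracer_entries argument (check viztracer --help or the docs for the exact name in your version):

import time
from viztracer import VizTracer

def work():
    time.sleep(0.01)

# tracer_entries sets the size of the circular buffer; once it is full,
# the earliest events are dropped and the latest ones are kept.
tracer = VizTracer(tracer_entries=5_000_000, output_file="result.json")
tracer.start()
work()
tracer.stop()
tracer.save()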

Sampling profilers like py-spy or Scalene are great in many situations, but you should definitely have a tracer in your toolbox for the cases that are more complicated or irreproducible.

