gh-138122: Add differential flame graph by ivonastojanovic · Pull Request #145785 · python/cpython

ivonastojanovic · 2026-03-10T19:46:59Z

Differential flame graphs compare two profiling runs and highlight where performance has changed. This makes it easier to detect regressions introduced by code changes and to verify that optimizations have the intended effect.

The visualization renders the current profile with frame widths representing current time consumption. Color is then applied to show the difference relative to the baseline profile: red gradients indicate regressions, while blue gradients indicate improvements.

Some call paths may disappear entirely between profiles. These are referred to as elided stacks and occur when optimizations remove code paths or when certain branches stop executing. When elided stacks are present, an "Elided" toggle is displayed, allowing the user to switch between the main differential view and a view showing only the removed paths.

In the diff view, slow_function improved (blue) and fast_function regressed (red) compared to the baseline. medium_function was removed (shown in the elided stacks toggle) and new_function was added (purple).
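The categories in this example can be sketched as a comparison of two per-function self-time maps. This is a minimal illustration, not the PR's implementation: the `classify` helper and the numeric values are hypothetical, and only the function names come from the example above.

```python
def classify(baseline, current):
    """Label each function by how its self time changed between two profiles.

    `baseline` and `current` map function name -> self time in the same unit.
    """
    labels = {}
    for name in baseline.keys() | current.keys():
        if name not in current:
            labels[name] = "elided"      # present in baseline only
        elif not baseline.get(name):
            labels[name] = "new"         # present in current only
        elif current[name] > baseline[name]:
            labels[name] = "regressed"   # would be drawn in red
        elif current[name] < baseline[name]:
            labels[name] = "improved"    # would be drawn in blue
        else:
            labels[name] = "unchanged"
    return labels


# Hypothetical self times chosen to reproduce the example above.
baseline = {"slow_function": 400, "fast_function": 50, "medium_function": 200}
current = {"slow_function": 250, "fast_function": 120, "new_function": 80}
labels = classify(baseline, current)
```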

Differential view

Elided view

CC: @pablogsal @lkollar

Issue: Implement PEP 799 – A dedicated profiling package for organizing Python profiling tools #138122

📚 Documentation preview 📚: https://cpython-previews--145785.org.readthedocs.build/

ivonastojanovic · 2026-03-10T19:51:41Z

I’m a bit stuck on what colors we should use for new vs deleted functions.

Right now:

* New functions (not in baseline, present in current) are purple.
* Deleted functions (shown in elided view, were in baseline, gone in current) are deep red, mostly so they stand out and because red = “removed”.

But I’m not sure if we should even treat them differently from other functions visually.

From a perf perspective it’s kind of confusing:

* New functions are technically a regression (the code path now exists and adds cost).
* Deleted ones are technically an improvement (code path removed).

So I’m not sure what the right visual semantics are here. Curious what you think 🙂

pablogsal · 2026-03-25T01:12:27Z

> I’m a bit stuck on what colors we should use for new vs deleted functions.
>
> Right now:
>
> * New functions (not in baseline, present in current) are purple.
> * Deleted functions (shown in elided view, were in baseline, gone in current) are deep red, mostly so they stand out and because red = “removed”.
>
> But I’m not sure if we should even treat them differently from other functions visually.
>
> From a perf perspective it’s kind of confusing:
>
> * New functions are technically a regression (the code path now exists and adds cost).
> * Deleted ones are technically an improvement (code path removed).
>
> So I’m not sure what the right visual semantics are here. Curious what you think

Perhaps use purple for both new and elided functions, with different shades to distinguish them: bright/saturated purple for new functions (present in current, not in baseline), which is the current color kept as-is, and a muted/desaturated purple for elided functions?

This unifies the two "out-of-band" categories under a single visual language that reads as "no direct comparison is available", keeping them distinct from the red/blue performance axis entirely. The legend becomes simpler too: instead of explaining four special cases, you just say "purple = this function has no counterpart in the other profile."

Use four gradients consistently for both regressions and improvements. Also avoid calling getComputedStyle(document.documentElement) on every frame; cache the values as we do for the heatmap, since calling it per frame can be expensive for large profiles.

Since baseline and current self time are shown in ms, the diff should also be displayed in ms instead of samples.
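A minimal sketch of the conversion this implies, assuming both profiles were sampled at the same fixed interval. The helper names and the `interval_us` parameter are hypothetical and are not taken from the PR.

```python
def samples_to_ms(samples, interval_us):
    """Convert a sample count to milliseconds for a fixed sampling interval."""
    return samples * interval_us / 1000


def format_self_time_diff(baseline_samples, current_samples, interval_us):
    """Format the change in self time as a signed millisecond string."""
    diff_ms = samples_to_ms(current_samples - baseline_samples, interval_us)
    return f"{diff_ms:+.2f} ms"
```

Showing the diff in the same unit as the baseline and current values lets the three numbers in a tooltip be compared directly.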

We were determining the selected data and applying filtering/processing inside each toggle handler, which led to inconsistencies (e.g. thread filtering) and missed cases. This approach also won’t scale well as we add more toggles. Instead, introduce a single centralized function that returns the active flamegraph data and use it consistently for all updates and processing.
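The real change lives in flamegraph.js; the pattern itself can be sketched in Python as follows. All names here (`ViewState`, `filter_by_thread`, `get_active_data`) and the node layout are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ViewState:
    """Hypothetical UI state: which tree is shown and which filters apply."""
    main_tree: dict
    elided_tree: dict
    show_elided: bool = False
    thread_id: Optional[int] = None


def filter_by_thread(node, thread_id):
    """Return a pruned copy of `node` keeping only frames seen on `thread_id`."""
    if thread_id not in node["threads"]:
        return None
    kept = [filter_by_thread(child, thread_id) for child in node["children"]]
    return {**node, "children": [child for child in kept if child is not None]}


def get_active_data(state):
    """Single place that decides which data every toggle handler renders."""
    tree = state.elided_tree if state.show_elided else state.main_tree
    if state.thread_id is None:
        return tree
    filtered = filter_by_thread(tree, state.thread_id)
    # Fall back to an empty root when the thread has no samples in this tree.
    return filtered or {**tree, "children": [], "threads": []}
```

With this shape, each toggle handler only mutates the state and then re-renders whatever `get_active_data` returns, so a new toggle cannot forget to apply the thread filter.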

Clean up tests by extracting repeated logic (resolving function names and finding child nodes by name) into helper functions. Also add a test that doesn’t mock BinaryCollector to cover the full round trip.

Use purple (with gradients) for both removed and new functions to unify these “out-of-band” cases under a single visual language, meaning no direct comparison is available. This keeps them clearly separate from the red/blue performance axis and simplifies the legend: “purple = this function has no counterpart in the other profile.”

Opcodes from multiple call paths were silently dropped: only the first path's opcodes were kept. Now they're summed correctly when nodes merge.
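A minimal sketch of summing per-opcode counts when two nodes merge, assuming each node carries a `value` and an `opcodes` mapping. That layout, the `merge_nodes` helper, and the use of opcode names as keys are hypothetical.

```python
from collections import Counter


def merge_nodes(target, source):
    """Merge `source` into `target`, summing sample counts and opcode counts."""
    target["value"] += source["value"]
    opcodes = Counter(target.get("opcodes", {}))
    # Counter.update adds counts, so opcodes from every call path are kept.
    opcodes.update(source.get("opcodes", {}))
    target["opcodes"] = opcodes
    return target


first_path = {"value": 3, "opcodes": {"LOAD_FAST": 2, "CALL": 1}}
second_path = {"value": 5, "opcodes": {"CALL": 4}}
merged = merge_nodes(first_path, second_path)
```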

picnixz · 2026-03-29T19:16:48Z

Lib/profiling/sampling/_flamegraph_assets/flamegraph.js

+  }
+
+  // Neutral zone: small percentage change
+  if (Math.abs(diff_pct) < 15) {


This may be a stupid question, but do you plan to make the sections configurable? I can see some value for people who are only interested in improvements above a certain threshold, and the notion of what counts as a deep / medium / light / negligible improvement is quite subjective.

For instance, if I needed to weigh the cost of a change against its readability, I would be happy to catch only 3x improvements when the change is complex.

Good question! Let's keep it as-is for this PR and explore making them configurable separately. An interactive slider in the HTML toolbar would probably be the most useful approach since you'd want to experiment without re-running the profiler.
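As a rough sketch of what configurable sections could mean, the bucketing can take its cut-offs as a parameter. Only the 15% neutral zone comes from the diff above; the other default thresholds, the bucket names as code values, and the `bucket` helper are hypothetical.

```python
from bisect import bisect_right

INTENSITIES = ("negligible", "light", "medium", "deep")


def bucket(diff_pct, thresholds=(15.0, 30.0, 50.0)):
    """Return (direction, intensity) for a percentage change in self time.

    `thresholds` holds three ascending cut-offs on abs(diff_pct).
    """
    intensity = INTENSITIES[bisect_right(thresholds, abs(diff_pct))]
    if intensity == "negligible":
        return ("neutral", intensity)
    direction = "regression" if diff_pct > 0 else "improvement"
    return (direction, intensity)
```

A slider in the toolbar would then only need to pass different `thresholds` and re-color the frames, without re-running the profiler.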

I like the idea as well, I can experiment with it in a separate PR

Yes, that's what I had in mind. I would suggest a slider with multiple markers that you can move left/right to define the regions of what counts as an improvement. I found https://brechtdr.github.io/enhanced-range-slider-poc/, which shows a rangegroup in Vite, but since it's neither a standard element nor implemented natively, we'll need to create a pure HTML/CSS alternative.

pablogsal · 2026-03-29T21:15:52Z

@ivonastojanovic Great job. I have pushed a small commit fixing some small stuff:

* Elided node value overwrite that dropped self-samples
* Baseline scaling now accounts for sample interval differences
* Empty current profiles produce differential metadata instead of silently dropping baseline
* Source line collection suppressed for elided nodes (files may not exist)
* Dark theme CSS overrides for differential colors
* Docs corrected: elided stacks are purple, not "deep red"
* Removed redundant baseline_strings key
* Early path validation in DiffFlamegraphCollector.__init__
* CLI cleanup: proper set_defaults, direct attribute access
* diff_pct no longer misleadingly shows -100% on elided internal nodes with zero self-time
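Two of the fixes above can be illustrated with a small sketch. It assumes baseline counts are rescaled into the current profile's sample units and that a zero baseline yields no percentage at all; the helper names and that exact behavior are assumptions, not the committed code.

```python
def scale_baseline(baseline_samples, baseline_interval_us, current_interval_us):
    """Express baseline samples in units of the current sampling interval."""
    return baseline_samples * baseline_interval_us / current_interval_us


def diff_pct(baseline_self, current_self):
    """Percentage change in self time, or None when there is no baseline."""
    if baseline_self == 0:
        # A node with zero baseline self time has nothing to compare against,
        # so reporting -100% (or any percentage) would be misleading.
        return None
    return (current_self - baseline_self) / baseline_self * 100.0
```

For example, 100 baseline samples taken every 200 µs cover the same time as 200 samples taken every 100 µs, so the two counts only become comparable after scaling.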

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>

pablogsal · 2026-03-30T11:31:29Z

Amazing job @ivonastojanovic 🖤

ivonastojanovic requested a review from pablogsal as a code owner March 10, 2026 19:46

bedevere-app bot added the awaiting review label Mar 10, 2026

ivonastojanovic changed the title ~~Add differential flame graph~~ gh-138122: Add differential flame graph Mar 10, 2026

bedevere-app bot mentioned this pull request Mar 10, 2026

Implement PEP 799 – A dedicated profiling package for organizing Python profiling tools #138122

Open

Add news

2d97700

ivonastojanovic mentioned this pull request Mar 16, 2026

Improve Tachyon UX #142927

Open

11 tasks

pablogsal reviewed Mar 25, 2026

ivonastojanovic added 7 commits March 26, 2026 19:52

Fix colors

6e642e4

Show diff in ms

37289cb

Improve tests

c2203d6

Fix opcode merging in inverted flamegraph view

c0d937f

fixup! Improve tests

080d2c8

ivonastojanovic requested a review from pablogsal March 29, 2026 15:03

Merge branch 'main' into differential_flamegraph

440d333

picnixz reviewed Mar 29, 2026

Update Lib/profiling/sampling/stack_collector.py

f4928b9

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>

picnixz reviewed Mar 29, 2026

picnixz and others added 3 commits March 29, 2026 23:21

Update Lib/profiling/sampling/stack_collector.py

5556396

Random fixes

d515f73

Small fix to cli

f753071

pablogsal merged commit f4d3c61 into python:main Mar 30, 2026
48 checks passed

bedevere-app bot removed the awaiting review label Mar 30, 2026
