-
-
Notifications
You must be signed in to change notification settings - Fork 33.7k
gh-143004: Fix UAF in _collections Counter update fast path #143044
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
StanFromIreland
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please create an issue and add a blurb.
@StanFromIreland, it's #143004. @Kaushalt2004, while technically this seems to be correct, this is not something that solves real-world problem. Could you please provide benchmarks to show how this will impact more typical use-cases for Counter()? |
|
I ran a small microbenchmark to estimate the impact of the Py_INCREF(oldval) / Py_DECREF(oldval) pair added around PyNumber_Add() in the dict fast-path. Setup Windows x64 Before: origin/main (b8d3fdd) subst X: "C:\Users\...\New folder (12)" update(unique) from empty: 26218.0 → 26438.0 (+0.84%) Deltas are within ~±1% on this machine; no meaningful regression observed, including the “preseeded” case that exercises the new INCREF/DECREF on every update. Setup Windows x64 Before: origin/main (b8d3fdd) subst X: "C:\Users\...\New folder (12)" update(unique) from empty: 26218.0 → 26438.0 (+0.84%) Deltas are within ~±1% on this machine; no meaningful regression observed, including the “preseeded” case that exercises the new INCREF/DECREF on every update. |
|
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
I still see no benchmark code. PS: @Kaushalt2004, do not click Update branch without a good reason because it notifies everyone watching the PR that there are new changes, when there are not, and it uses up limited CI resources. |
| @@ -0,0 +1 @@ | |||
| gh-143004: Fix a potential use-after-free in collections.Counter.update() when user code mutates the Counter during an update. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| gh-143004: Fix a potential use-after-free in collections.Counter.update() when user code mutates the Counter during an update. | |
| Fix a potential use-after-free in collections.Counter.update() when user code | |
| mutates the Counter during an update. |
|
Thanks for the feedback. I’ve removed the benchmark script from the PR as requested. Benchmark methodology and results were shared earlier in comments; Please let me know if you’d like the benchmark rerun using pyperf |
| def __new__(cls): | ||
| return int.__new__(cls, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is intentional.
The custom new ensures the instance starts at 0 and that add is exercised during Counter.update(), which is required to reproduce the re-entrant mutation scenario reliably.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does not it have zero value by default? What happens when remove the user constructor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While int() defaults to zero, the explicit new ensures a deterministic starting value for the subclass instance and guarantees that add is exercised during Counter.update().
This makes the re-entrant mutation scenario reliable; removing it can make the test fragile or fail to reproduce the issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is it possible? What prevent __add__() to be called if the object use the default constructor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this test, the intent is indeed that the Counter stores an instance of Evil.
The explicit_new is not because other values are expected, but to make that guarantee explicit and robust.
Without it, CPython is free to materialize or replace the value with a plain int during Counter. update(), which would bypass the overridden_add and make the test implementation-dependent.
The custom constructor ensures the value stored is definitively an Evil instance and
that PyNumber_Add() dispatches to
Evil._add reliably.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please show how the Evil instance is replaced with a plain int and how the test without a custom __new__ is not robust?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the current implementation, the test intends that the value stored is an instance of Evil, and that Evil.add is invoked during Counter.update().
The explicit new is not because a plain int is known to be substituted today, but to make this invariant explicit and robust.
Without it, the test would rely on incidental behavior of how the initial value is materialized and preserved, which CPython does not guarantee and which could change with future optimizations.
Defining new ensures the value is definitively an Evil instance and that PyNumber_Add() reliably dispatches to the overridden add.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without it, the test would rely on incidental behavior of how the initial value is materialized and preserved, which CPython does not guarantee
Why do you think that Python (not even CPython!) don't guarantee this? Subclass constructor per default will return a subclass instance, not some base class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You’re right — Python guarantees that constructing a subclass returns an instance of that subclass, and the test does not currently demonstrate any loss of subclass identity.
The custom new is therefore unnecessary, and the test can be simplified without losing coverage.
| @@ -0,0 +1,2 @@ | |||
| Fix a potential use-after-free in collections.Counter.update() when user code | |||
| mutates the Counter during an update. (gh-143004) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue number is already included in the file name.
Misc/NEWS.d/next/Library/2025-12-22-00-00-00.gh-issue-143004.uaf-counter.rst
Outdated
Show resolved
Hide resolved
…af-counter.rst Co-authored-by: Serhiy Storchaka <[email protected]>
| def __new__(cls): | ||
| return int.__new__(cls, 0) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| def __new__(cls): | |
| return int.__new__(cls, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is intentional.
Overriding new ensures the value starts at 0 and that add is invoked during Counter.update(), which is required to reliably trigger the re-entrant mutation scenario covered by this test.
Misc/NEWS.d/next/Library/2025-12-22-00-00-00.gh-issue-143004.uaf-counter.rst
Outdated
Show resolved
Hide resolved
…af-counter.rst Co-authored-by: Serhiy Storchaka <[email protected]>
Problem
_collections._count_elements has a fast-path for dict that reads oldval via _PyDict_GetItem_KnownHash() (borrowed reference) and then calls PyNumber_Add(oldval, one). PyNumber_Add() can run arbitrary Python code (e.g. add), allowing re-entrant mutation such as Counter.clear(). That can drop the last reference to oldval while the C code still holds a borrowed pointer, resulting in a use-after-free (ASAN).
Reproducer
Fix
In the fast dict path, keep oldval alive across PyNumber_Add() by temporarily owning a reference:
Py_INCREF(oldval); newval = PyNumber_Add(oldval, one); Py_DECREF(oldval);
This prevents oldval from being freed if user code re-enters and clears/mutates the mapping during the add.
Changes
_collectionsmodule.c: INCREF/DECREF oldval around PyNumber_Add() in the fast path.
test_collections.py: add regression test test_update_reentrant_add_clears_counter.
Tests
./python -m test test_collections -k reentrant_add -v
Counter.updatevia re-entrant__add__#143004