Skip to content

Conversation

@yihong0618
Copy link
Contributor

@yihong0618 yihong0618 commented Dec 21, 2025

This patch fix all re-entrant flush problem
by adding an attr flushing
and. refactor the file check to _textiowrapper_flush

all crash problem in 143008 can be fix cc @jackfromeast can you help to check?
thank you very much!

Copy link
Contributor

@cmaloney cmaloney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like adding a new flushing variable here. To me it doesn't solves the core issue: Code assumes self->buffer is non-NULL because of checks which become invalid by the time it's actually dereferenced/used (TOCTOU general bug class). I think TextIOWrapper already has the information we need to be looking at / checking for (CHECK_ATTACHED macro / self->detached, self->buffer == NULL, and self->ok, ...). Those seem to be tracking very close to the same thing and should be usable to solve this group of issues.

I think it would be better here to use CHECK_ATTACHED just before self->buffer is used / passed to a call / returned; making sure there aren't any other calls to interpreter methods betwen the CHECK_ATTACHED + self->buffer usage / dereference.

I suspect BufferedIO will need a similar set of fixes: It also has a concept of an underlying stream that can be "detached" and which user code may detach during various operations.

Some comments on specifics inline, the test_textio ones I think are important in further changes, implementation details less so if not adding/using the flushing member.

# instead of crashing.
wrapper = None

class BadRaw(self.RawIOBase):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a number of mock helpers defined in test_io/utils.py to make it so can reduce this to just the methods that matter. I think self.MockRawIO will work here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks

with self.subTest(name):
wrapper = self.TextIOWrapper(EvilBuffer(BadRaw()), encoding='utf-8')
self.assertRaisesRegex(RuntimeError, "reentrant", method)
wrapper = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we make the gc / closing here more explicit? del wrapper?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks

static inline int
_textiowrapper_flush(textio *self)
{
self->flushing = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't this be in _io_TextIOWrapper_flush_impl?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it can, the design is cover it in _PyFile_Flush

PyObject *ret;
self->flushing = 1;
do {
ret = PyObject_CallMethodOneArg(self->buffer, &_Py_ID(write), b);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

every loop iteration / write call could execute arbitrary interpreter code that could invalidate self->buffer so likely need to check on each loop iteration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will check, but I think maybe not need

@yihong0618
Copy link
Contributor Author

I don't like adding a new flushing variable here. To me it doesn't solves the core issue: Code assumes self->buffer is non-NULL because of checks which become invalid by the time it's actually dereferenced/used (TOCTOU general bug class). I think TextIOWrapper already has the information we need to be looking at / checking for (CHECK_ATTACHED macro / self->detached, self->buffer == NULL, and self->ok, ...). Those seem to be tracking very close to the same thing and should be usable to solve this group of issues.

I think it would be better here to use CHECK_ATTACHED just before self->buffer is used / passed to a call / returned; making sure there aren't any other calls to interpreter methods betwen the CHECK_ATTACHED + self->buffer usage / dereference.

I suspect BufferedIO will need a similar set of fixes: It also has a concept of an underlying stream that can be "detached" and which user code may detach during various operations.

Some comments on specifics inline, the test_textio ones I think are important in further changes, implementation details less so if not adding/using the flushing member.

I think you are right but CHECK_ATTACHED in the middle of code is not a good name and easy to forget add it.

@yihong0618
Copy link
Contributor Author

And I can change it to my first solution or waiting for other discuss here?

@cmaloney
Copy link
Contributor

I agree CHECK_ATTACHED is easy to forget and I'd argue this bug is the result of some cases missing it.. It is a relatively small change though and matches the rest of the textio.c code style / convention. I have been working towards changing textio.c more generally, but that's too big for a likely back-ported bugfix.

Can you explain what your first solution is? I'm not understanding what you're referring to.

@yihong0618
Copy link
Contributor Author

I agree CHECK_ATTACHED is easy to forget and I'd argue this bug is the result of some cases missing it.. It is a relatively small change though and matches the rest of the textio.c code style / convention. I have been working towards changing textio.c more generally, but that's too big for a likely back-ported bugfix.

Can you explain what your first solution is? I'm not understanding what you're referring to.

#143008 (comment)

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will break the case when flush() was intentionally overridden.

I am not also sure that it fixes the core issue, it just makes a particular reproducer no longer working.

@yihong0618
Copy link
Contributor Author

This will break the case when flush() was intentionally overridden.

I am not also sure that it fixes the core issue, it just makes a particular reproducer no longer working.

will dig it, thank you

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants