Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-44446: set lineno in generator instructions #26782

Closed
wants to merge 1 commit into from

Conversation

FFY00
Copy link
Member

@FFY00 FFY00 commented Jun 18, 2021

When generator instructions were added to bytecode, lineno was always set to -1,
instead of the line number. This patch sets the lineno, if there is one.

Signed-off-by: Filipe Laíns [email protected]

https://bugs.python.org/issue44446

@FFY00 FFY00 requested a review from markshannon as a code owner June 18, 2021 00:02
@FFY00 FFY00 changed the title bpo-44446: set lineno in prefix instructions bpo-44446: set lineno in generator instructions Jun 18, 2021
@FFY00 FFY00 force-pushed the bpo-44446-set-lineno branch 7 times, most recently from d078463 to 640f0a0 Compare June 18, 2021 00:51
@FFY00 FFY00 marked this pull request as draft June 18, 2021 02:11
When generator instructions were added to bytecode, lineno was always set to -1,
instead of the line number. This patch sets the lineno, if there is one.

Signed-off-by: Filipe Laíns <[email protected]>
@FFY00
Copy link
Member Author

FFY00 commented Jun 19, 2021

@markshannon I do not understand why this patch is causing the behavior that it is, I would have guessed that this is a fairly innocuous change, do you have any idea of what is happening?
With that said, I am not sure this is the correct approach. It seems to me like we should be propagating the i_lineno in propogate_line_numbers, but b_nofallthrough is set, which is preventing that. I don't have a great understanding of this code, so I thought it was probably easier, and perhaps even better anyway, to fill in i_lineno when adding the instruction. But it seems I am doing something wrong.
So, if you could give me some pointers, it would be great!

@pablogsal
Copy link
Member

See #26781 (comment)

@pablogsal
Copy link
Member

pablogsal commented Jun 20, 2021

See #26781 (comment)

Indeed, I am right, this is an ownership issue with the locals dictionary:

#11 0x0000555555796cc9 in PyFrame_FastToLocalsWithError (f=f@entry=0x7ffff72a4a50) at Objects/frameobject.c:1009
1009                if (PyObject_SetItem(locals, name, value) != 0) {
(gdb) p locals
$1 = (PyObject *) 0x7ffff6a8a930
(gdb) p locals->ob_refcnt
$2 = 0
(gdb)

@markshannon, please take a look at this seems to be a regression either of PEP 626 or the latest changes in the frames/code object. I am not fully sure if you are accounting on this but I honestly don't see how we could have dropped the ownership of the locals dict when PyFrame_FastToLocalsWithError is called from the trace function.

@FFY00
Copy link
Member Author

FFY00 commented Jun 21, 2021

I think the issue might be in the new 3.11 bytecode.

Before falling out to PyFrame_FastToLocalsWithError, we are triggering

cpython/Python/ceval.c

Lines 2473 to 2476 in a317778

_PyErr_Format(tstate, PyExc_TypeError,
"can't send non-None value to a "
"just-started %s",
gen_kind[oparg]);

After some digging, it seems like we are poping from an empty stack. What we end up popping in the test that triggers the assert happens to be the locals dict, which then gets decref'ed, and blows up afterwards.

--- a/Python/ceval.c
+++ b/Python/ceval.c
@@ -2457,6 +2457,7 @@ _PyEval_EvalFrameDefault(PyThreadState *tstate, PyFrameObject *f, int throwflag)
         }

         case TARGET(GEN_START): {
+            assert(!EMPTY());
             PyObject *none = POP();
             Py_DECREF(none);
             if (!Py_IsNone(none)) {
python: Python/ceval.c:2460: _PyEval_EvalFrameDefault: Assertion `!EMPTY()' failed.

I have also managed to trigger another issue.

--- a/Lib/test/test_sys_settrace.py
+++ b/Lib/test/test_sys_settrace.py
@@ -1447,6 +1447,7 @@ def test_jump_backwards_out_of_with_block(output):
     async def test_jump_backwards_out_of_async_with_block(output):
         output.append(1)
         async with asynctracecontext(output, 2):
+            global test
             output.append(3)

     @jump_test(2, 5, [5])
python: Python/ceval.c:2248: _PyEval_EvalFrameDefault: Assertion `EMPTY()' failed.

assert(EMPTY());

@FFY00
Copy link
Member Author

FFY00 commented Jun 21, 2021

If you have any tips on how to use lltrace, that'd be very helpful, I haven't managed to figure that out 😅

@pablogsal
Copy link
Member

If you have any tips on how to use lltrace, that'd be very helpful, I haven't managed to figure that out 😅

Yeah, wait for #26822 to land and then just set __ltrace__ = True in the scope of the function you want to check. You need to run in debug mode.

@markshannon
Copy link
Member

This seems to be obsolete.

@pablogsal Can you produce that issue with low refcounts for locals on main, or just here?
Setting the line number on a frame is horribly fragile. Thankfully it can be executed in a debugger.

@pablogsal
Copy link
Member

This seems to be obsolete.

@pablogsal Can you produce that issue with low refcounts for locals on main, or just here?
Setting the line number on a frame is horribly fragile. Thankfully it can be executed in a debugger.

Only here, but the changes in this PR shouldn't manifest that way, so something is clearly going on.

@markshannon
Copy link
Member

Only here, but the changes in this PR shouldn't manifest that way, so something is clearly going on.

Why? If you mess around with the line numbers, it hardly seems surprising that frame.f_lineno = ... will break.

@markshannon markshannon removed their request for review July 12, 2021 09:22
@pablogsal
Copy link
Member

Only here, but the changes in this PR shouldn't manifest that way, so something is clearly going on.

Why? If you mess around with the line numbers, it hardly seems surprising that frame.f_lineno = ... will break.

But what breaks I'd not frame.f_lineno, what breaks is a invalid reference count because the eval loop pops an incorrect object and decrefs it too much. This should not happen because you mess with the line numbers.

Check the rest of the conversation in this issue

@markshannon
Copy link
Member

If you change the line number, you are changing the instruction pointer. If the stack depth is wrong, anything can happen.

@FFY00
Copy link
Member Author

FFY00 commented Jul 19, 2021

Sorry to bother, is there any documentation on how this works?

@FFY00 FFY00 closed this Jul 19, 2021
@markshannon
Copy link
Member

I'm afraid not.
It's held together with sticky tape.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants