Specialization of generator iteration and coroutine await #457

Closed
markshannon opened this issue Sep 8, 2022 · 1 comment
@markshannon
Member

Currently, iterating over a generator or awaiting a coroutine goes through several layers of C code and performs many wasteful transformations, all to do little more than make a jump in the bytecode.

By specializing FOR_ITER for generators and SEND for coroutines, we can remove this overhead.
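To see where these two instructions show up, here is a minimal sketch using the standard `dis` module. The helper `opnames` and the two sample functions are illustrative names only, not part of the proposal. On interpreters older than 3.11 an `await` compiles to YIELD_FROM rather than SEND, so the check allows for both.

```python
import dis


def loop_over(gen):
    # A plain for loop; the compiler emits FOR_ITER for it.
    for item in gen:
        pass


async def wait_for(coro):
    # An await expression; recent CPython versions emit SEND for it.
    return await coro


def opnames(func):
    """Return the set of unspecialized opcode names used by func."""
    return {instr.opname for instr in dis.get_instructions(func)}


loop_ops = opnames(loop_over)
await_ops = opnames(wait_for)

print("FOR_ITER" in loop_ops)
print("SEND" in await_ops or "YIELD_FROM" in await_ops)
```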

However, we will need either trampolines to fix up returns or a change to the behavior of RETURN_VALUE in generators and coroutines.

The following assumes that python/cpython#96319 has been merged.

Iterating over a generator

The FOR_ITER bytecode pushes the yielded value when __next__ returns a value, so that's simple enough; YIELD_VALUE already does that. The complication is that RETURN_VALUE pushes a value, but we actually need to pop the generator. So we need two additional pops after the return: one for the returned value and one for the generator.
We can change the way return works for generators by adding a new instruction GEN_RETURN, change the way FOR_ITER works, use some combination of those, or insert a trampoline.
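The mismatch between yielded and returned values is visible at the Python level. The sketch below (with an illustrative generator named `numbers`) shows that a for loop only ever sees yielded values, while the value passed to `return` travels separately on StopIteration and is discarded by the loop.

```python
def numbers():
    yield 1
    yield 2
    return "done"  # surfaces as StopIteration.value, not as a loop item


# A for loop consumes only the yielded values.
seen = []
for n in numbers():
    seen.append(n)

# Driving the generator by hand exposes the returned value.
gen = numbers()
next(gen)
next(gen)
try:
    next(gen)
except StopIteration as stop:
    returned = stop.value

print(seen, returned)
```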

Inserting a trampoline is relatively expensive, so I'd like to do this without one.
First, we can implement GEN_RETURN, which would clean up the generator and replace the caller's TOS with the returned value.
Then we change FOR_ITER to not pop the iterator on completion.
A for loop will now compile to:

  FOR_ITER end
  body
  ...
end:
  POP_TOP

This costs one more POP_TOP per loop, but simplifies FOR_ITER a bit.

We can then specialize FOR_ITER for generators in a straightforward fashion, as no cleanup shim will be needed.
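As a rough illustration of the proposed loop shape for an ordinary iterator, the toy model below keeps an explicit value stack. It is only a sketch: `run_for_loop` is a hypothetical helper, and the loop body is assumed to simply store each item.

```python
def run_for_loop(iterable):
    """Toy model of `for x in iterable: body` under the proposed scheme."""
    stack = [iter(iterable)]  # GET_ITER has already pushed the iterator
    collected = []
    while True:
        # FOR_ITER end
        try:
            value = next(stack[-1])
        except StopIteration:
            # Proposed behavior: leave the iterator on the stack, jump to `end`.
            break
        stack.append(value)
        # body: pop the item and store it
        collected.append(stack.pop())
    # end: POP_TOP removes the exhausted iterator
    stack.pop()
    return collected, stack


items, final_stack = run_for_loop([1, 2, 3])
print(items, final_stack)
```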

Awaiting a coroutine

SEND operates much like FOR_ITER, but the transformation is simpler, as we don't need to POP the result.
await compiles exactly as before, as GEN_RETURN leaves the result on the caller's stack.
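For comparison, the Python-level behavior of an await can be observed by driving a coroutine by hand with `send`. In this sketch `Suspend`, `inner`, and `outer` are illustrative names; the point is that the awaited result is left for the awaiting code to use rather than being discarded.

```python
class Suspend:
    """Illustrative awaitable that suspends once and returns what is sent in."""

    def __await__(self):
        sent = yield "paused"
        return sent


async def inner():
    got = await Suspend()
    return got * 2


async def outer():
    # The result of the await stays available to the awaiting expression.
    return await inner() + 1


coro = outer()
first = coro.send(None)  # runs until Suspend yields
try:
    coro.send(20)  # resumes inner, which then returns 40
except StopIteration as stop:
    final = stop.value

print(first, final)
```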

The new bytecodes

GEN_RETURN

Does the following:

  • Pops the TOS from the caller (will be the generator)
  • Pushes the result to the caller's stack
  • Pops and destroys the current frame
  • Resumes the caller at next_instr + gen_return_offset

FOR_ITER_GENERATOR

Does the following:

  • Deopts if iterator is not a generator
  • Deopts if the generator is not suspended
  • Sets the current frame's gen_return_offset to oparg
  • Pushes the generator's frame
  • Pushes None to the generator's stack
  • Resumes execution of the generator
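The GEN_RETURN and FOR_ITER_GENERATOR steps listed above can be sketched as a tiny interpreter. This is a toy model under stated assumptions, not the CPython implementation: `Frame`, `ToyGenerator`, `run`, and the STORE_OUT and JUMP helper instructions are all hypothetical, `pc` stands in for next_instr, and an assertion stands in for deoptimization.

```python
class Frame:
    def __init__(self, code):
        self.code = code
        self.pc = 0                 # plays the role of next_instr
        self.stack = []
        self.gen_return_offset = 0
        self.previous = None        # caller frame while running


class ToyGenerator:
    def __init__(self, code):
        self.frame = Frame(code)
        self.suspended = True


def run(code, generator):
    out = []
    frame = Frame(code)
    frame.stack.append(generator)   # as if GET_ITER had run
    while frame.pc < len(frame.code):
        op, arg = frame.code[frame.pc]
        frame.pc += 1
        if op == "FOR_ITER_GENERATOR":
            target = frame.stack[-1]
            # A real interpreter would deopt here instead of asserting.
            assert isinstance(target, ToyGenerator) and target.suspended
            frame.gen_return_offset = arg
            target.suspended = False
            target.frame.previous = frame       # push the generator's frame
            target.frame.stack.append(None)     # value "sent" into the generator
            frame = target.frame                # resume the generator
        elif op == "YIELD_VALUE":
            value = frame.stack.pop()
            caller = frame.previous
            caller.stack[-1].suspended = True
            frame.previous = None
            caller.stack.append(value)
            frame = caller
        elif op == "GEN_RETURN":
            result = frame.stack.pop()
            caller = frame.previous
            caller.stack.pop()                  # pop the generator
            caller.stack.append(result)         # push the result
            caller.pc += caller.gen_return_offset
            frame = caller                      # generator frame is dropped
        elif op == "LOAD_CONST":
            frame.stack.append(arg)
        elif op == "POP_TOP":
            frame.stack.pop()
        elif op == "STORE_OUT":                 # stand-in for the loop body
            out.append(frame.stack.pop())
        elif op == "JUMP":
            frame.pc = arg
    return out, frame.stack


# Generator body: yield 1; yield 2; return "done"
gen_code = [
    ("POP_TOP", None), ("LOAD_CONST", 1), ("YIELD_VALUE", None),
    ("POP_TOP", None), ("LOAD_CONST", 2), ("YIELD_VALUE", None),
    ("POP_TOP", None), ("LOAD_CONST", "done"), ("GEN_RETURN", None),
]
# Caller: FOR_ITER end; body; ...; end: POP_TOP
loop_code = [
    ("FOR_ITER_GENERATOR", 2), ("STORE_OUT", None), ("JUMP", 0),
    ("POP_TOP", None),
]

yielded, caller_stack = run(loop_code, ToyGenerator(gen_code))
print(yielded, caller_stack)
```

In this model the single POP_TOP at `end` discards the returned value that GEN_RETURN put in place of the generator, so the caller's stack ends up empty.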

SEND_COROUTINE

Does the following:

  • Deopts if awaitable is not a coroutine
  • Deopts if the coroutine is not suspended
  • Sets the current frame's gen_return_offset to oparg
  • Pops the value from the caller's stack
  • Pushes the coroutine's frame
  • Pushes the value to the coroutine's stack
  • Resumes execution of the coroutine
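The SEND_COROUTINE steps above can be sketched as a single function over toy frames. Everything here is an assumption for illustration: `ToyFrame` and `send_coroutine` are hypothetical names, the coroutine is represented directly by its frame, the awaitable is assumed to sit just below the value on the caller's stack, and an exception stands in for deoptimization.

```python
class ToyFrame:
    def __init__(self):
        self.stack = []
        self.gen_return_offset = 0
        self.previous = None
        self.suspended = True


def send_coroutine(caller, oparg):
    """Apply the listed steps and return the frame that runs next."""
    coro_frame = caller.stack[-2]  # assumed layout: [..., awaitable, value]
    if not isinstance(coro_frame, ToyFrame) or not coro_frame.suspended:
        raise RuntimeError("deopt")  # stand-in for falling back to SEND
    caller.gen_return_offset = oparg
    value = caller.stack.pop()       # pop the value from the caller's stack
    coro_frame.previous = caller     # push the coroutine's frame
    coro_frame.suspended = False
    coro_frame.stack.append(value)   # push the value to the coroutine's stack
    return coro_frame                # resume execution of the coroutine


caller = ToyFrame()
coro = ToyFrame()
caller.stack = [coro, "hello"]
running = send_coroutine(caller, 5)
print(running is coro, coro.stack, caller.gen_return_offset)
```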
@markshannon
Member Author

Having GEN_RETURN pop the stack of the caller is a bit weird and doesn't play nicely with localized optimizations, or PyEval_EvalFrame(), as the latter would have a potentially surprising side-effect.

So:

  FOR_ITER end
  body
  ...
end:
  POP_TOP

will become

  FOR_ITER end
  body
  ...
end:
  END_FOR

where END_FOR is equivalent to POP_TOP; POP_TOP, but allows us to handle pushing NULL to the stack.

Having a special instruction also tells the bytecode compiler to leave it alone, which lets FOR_ITER skip it and so retain the same efficiency as the old FOR_ITER.
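A minimal sketch of the END_FOR idea, under assumptions not spelled out in the thread: the generator path is assumed to leave the returned value on top of the generator, and a plain-iterator path that does not skip END_FOR is assumed to push a NULL placeholder (modelled here as None). The function `end_for` is illustrative only.

```python
def end_for(stack):
    """Toy END_FOR: equivalent to POP_TOP; POP_TOP."""
    stack.pop()
    stack.pop()


# Assumed generator path: the returned value sits above the generator.
gen_path = ["<generator>", "returned value"]
end_for(gen_path)

# Assumed plain-iterator path: a NULL placeholder sits above the iterator.
iter_path = ["<iterator>", None]
end_for(iter_path)

print(gen_path, iter_path)
```

Either way the caller's stack is balanced after the loop, without GEN_RETURN having to touch the caller's stack.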
