-
-
Notifications
You must be signed in to change notification settings - Fork 30.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-32591: Add native coroutine origin tracking #5250
Conversation
This is fixing an old bug, that for some reason wasn't causing test suite failures until the previous commit. The issue was: Sometimes, for some reason, the tests are calling loop.set_debug() from a random thread, different from the one where the event loop is running. When set_debug() is called and the event loop is running, it attempts to immediately enable or disable debug stuff (previously registering/unregistering the coroutine wrapper, now toggling the coroutine origin tracking). However, it used to do this *in the thread where set_debug() was called*. Since the toggling uses a save/restore pattern, we'd end up saving the value in one thread and then restoring it in another, while never restoring it in the first thread, etc. Now we always enable/disable the debugging mode inside the event loop thread, which makes much more sense. And now the tests are passing again.
Lib/warnings.py
Outdated
msg_lines.append("Coroutine created at (most recent call last)\n") | ||
msg_lines += traceback.format_list(list(extract())) | ||
msg = "".join(msg_lines).rstrip("\n") | ||
warn(msg, category=RuntimeWarning, stacklevel=2, source=coro) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small, easy to miss point here: passing source=coro
here means that if the user happens to have tracemalloc
enabled, its origin traceback will be included in the warning. Kinda neat! It does mean if you have both set_coroutine_origin_tracking_depth
and tracemalloc
enabled you'll get two at-least-partially-redundant tracebacks in the same warning. I don't know that I care that much about this but FYI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a redundant warning is OK. tracemalloc
is enabled quite rarely. I also think what you wrote in this comment should be reflected in a code comment in case someone debugs the double warnings issue :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall it looks good.
A couple more requests:
-
Let's raise a
DeprecationWarning
insys.set_coroutine_wrapper
. And we need to document that it has been deprecated and is scheduled to be removed in 3.8. -
In what's new in 3.7, we need to add an entry about the deprecation of sys.set_coroutine_wrapper.
-
Minor nit: please fix C code style, mainly, add braces for single-statement "if" branches.
Doc/library/sys.rst
Outdated
|
||
Returns the old value of *depth*. | ||
|
||
This setting is thread-local. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thread-local -> thread-specific
Doc/library/inspect.rst
Outdated
| | cr_origin | where coroutine was | | ||
| | | created, if coroutine | | ||
| | | origin tracking is enabled| | ||
+-----------+-------------------+---------------------------+ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a link to the set_coroutine_origin_tracking_depth
documentation snippet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, that was my original thought. The problem is that
:func:`set_coroutine_origin_tracking_depth`
is too long to fit in the ascii-art box. So... either we need a shorter name for the function, or we need to redraw this whole giant table, and I couldn't think of a satisfactory way to do either in the 2 minutes I spent thinking about it :-). Any suggestions?
I guess we could use that weird ReST substitution thing? I'm not sure how that works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doc/library/sys.rst
Outdated
enabled, the ``cr_origin`` attribute on coroutine objects will | ||
contain a list of (filename, line number, function name) tuples | ||
describing the traceback where the coroutine object was created. | ||
When disabled, ``cr_origin`` will be None. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to specify how the list is ordered.
Include/warnings.h
Outdated
@@ -56,6 +56,10 @@ PyErr_WarnExplicitFormat(PyObject *category, | |||
#define PyErr_Warn(category, msg) PyErr_WarnEx(category, msg, 1) | |||
#endif | |||
|
|||
#ifndef Py_LIMITED_API | |||
PyAPI_FUNC(void) _PyErr_WarnUnawaitedCoroutine(PyObject *coro); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strictly speaking we don't need PyAPI_FUNC
here, as warnings.c|h
will be linked with genobject.c
and we are not going to use it in extensions.
Lib/asyncio/base_events.py
Outdated
enabled = bool(enabled) | ||
if self._coroutine_wrapper_set == enabled: | ||
def _set_coroutine_origin_tracking(self, enabled): | ||
if enabled == self._coroutine_origin_tracking_enabled: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if enabled and self._coroutine_origin_tracking_enabled:
?
I don't like using ==
for bools.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, that would be different – the no-==
equivalent would be:
if ((enabled and self._coroutine_origin_tracking_enabled) or (not enabled and not self._coroutine_origin_tracking_enabled)):
...
I find the ==
version easier to read. ("If the requested state equals the current state, return without doing anything.")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, then I'd cast them both to bool
: if bool(enabled) == bool(self._coroutine_origin_tracking_enabled):
Objects/genobject.c
Outdated
|
||
if (depth == 0) { | ||
((PyCoroObject *)coro)->cr_origin = NULL; | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, per PEP 7:
if ( ... ) {
}
else {
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I'd like this whole else
branch to be in a separate helper function.
Objects/genobject.c
Outdated
} else { | ||
PyObject *origin = PyList_New(depth); | ||
/* Immediately pass ownership to coro, so on error paths we don't have | ||
to worry about it separately. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is neat, but it makes the code harder to read and it's not how we usually do it in CPython. While reviewing the code I've already added 2 comments asking you to add Py_DECREF(origin)
. Please do it the long way: create, populate, cleanup on errors, and finally assign it to cr_origin
. After all it's just two places where an extra decref is needed.
Objects/genobject.c
Outdated
frame = frame->f_back; | ||
} | ||
/* Truncate the list if necessary */ | ||
if (PyList_SetSlice(origin, i, depth, NULL) < 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we call PyList_SetSlice
only when needed?
Python/_warnings.c
Outdated
*/ | ||
fn = get_warnings_attr(&PyId__warn_unawaited_coroutine, 1); | ||
if (fn) { | ||
res = PyObject_CallFunctionObjArgs(fn, coro, NULL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just add Py_DECREF(fn)
after this line.
Python/ceval.c
Outdated
int | ||
_PyEval_SetCoroutineOriginTrackingDepth(int new_depth) | ||
{ | ||
PyThreadState *tstate = PyThreadState_GET(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add assert(new_depth >= 0)
for cases when somebody uses the C API directly (sys.set_coroutine_origin_tracking_depth has a check for Python users).
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
Objects/genobject.c
Outdated
PyObject *origin = PyList_New(depth); | ||
/* Immediately pass ownership to coro, so on error paths we don't have | ||
to worry about it separately. */ | ||
((PyCoroObject *)coro)->cr_origin = origin; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realized that exposing list object directly through cr_origin
means that users can mutate it. How about we cycle through the frames two times: 1st time to determine the real depth we can capture; 2nd time to populate a cr_origin
tuple (not list). Traversing a relatively short chain of frames should be pretty fast, and this is a debug mode after all. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another option would be to keep things as is, but to change cr_origin
to a getter, and return a copy of the list every time. But I like the tuple idea more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I originally left out the Py_VISIT
call and then realized that mutation was possible, so figured I'd put it in just in case :-).
I don't think it matters too much whether we expose a tuple, copy the list, or just keep the current thing where someone can mutate it if they really want to. (The warnings code is robust against such shenanigans: worst case it'll print an error b/c cr_origin
was corrupt, and then the real warning without a traceback.) Given this I have a mild preference not to do the tuple thing, because it leads to the most complicated C code. Doing an extra loop is totally doable of course, but it's also plenty complicated to hide some kind of off-by-one bug or something.
@@ -1512,6 +1542,7 @@ static PyMethodDef sys_methods[] = { | |||
{"call_tracing", sys_call_tracing, METH_VARARGS, call_tracing_doc}, | |||
{"_debugmallocstats", sys_debugmallocstats, METH_NOARGS, | |||
debugmallocstats_doc}, | |||
SYS_SET_COROUTINE_ORIGIN_TRACKING_DEPTH_METHODDEF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add sys.get_coroutine_origin_tracking_depth
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eh, I guess. I was thinking this is sufficiently arcane functionality that I could be lazy and get away with the set
-returns-the-old-value pattern (cf. signal.set_wakeup_fd
), but you're probably right that set
+get
is a nicer API.
… On Sat, Jan 20, 2018 at 8:51 PM, Yury Selivanov ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In Doc/library/inspect.rst
<#5250 (comment)>:
> @@ -215,6 +215,10 @@ attributes:
+-----------+-------------------+---------------------------+
| | cr_code | code |
+-----------+-------------------+---------------------------+
+| | cr_origin | where coroutine was |
+| | | created, if coroutine |
+| | | origin tracking is enabled|
++-----------+-------------------+---------------------------+
Does this work: http://docutils.sourceforge.net/docs/user/rst/quickref.
html#hyperlink-targets ?
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#5250 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AAlOaO4NcvwZwIJbftzlLOSgyObtU2imks5tMsJXgaJpZM4RlXZz>
.
--
Nathaniel J. Smith -- https://vorpus.org <http://vorpus.org>
|
Okay, I think that last push addresses everything except:
|
I still don't like the idea of returning the original mutable list :( Let's do some extra work and make this list a tuple of tuples. It's better than have a getter for cr_origin+copy. And let's have a 'get' method, otherwise it's hard to capture the current sys settings for debug purposes or whatnot. |
I'm not a huge fan b/c tuples shouldn't be used for homogenous variable-length sequences, but... fine |
I have made the requested changes; please review again |
Thanks for making the requested changes! @1st1: please review the changes made to this pull request. |
Yeah, I know, but we don't have a frozenlist. |
After this PR, asyncio no longer uses sys.set_coroutine_wrapper.
There is one non-trivial semantic change: before in asycnio debug
mode, unawaited coroutine warnings were logged using
logger.error(...). Now they generate a regular warnings-module
warning.
This is best reviewed one commit at a time.
Note that there's also a fix for a very subtle bug in
BaseEventLoop.set_debug; see the 4th commit. (It's the one where the
commit message is 4x longer than the diff.)
Not done here, could be added here or in followups:
https://bugs.python.org/issue32591