Performance: decrease runtime overhead for constructing HasTraits (up to 20x faster) #777
Conversation
Force-pushed from 63f30eb to ef4d638 (compare)
The ipywidget tests fail, since it adds […]
Force-pushed from 7262a70 to a3862e0 (compare)
I think it is fair to make this change, assuming jupyter-widgets/ipywidgets#3590 gets merged and released and we go green.
I have another commit that gives a good performance win in situations where a lot of defaults are 'static' (like a fixed default value for strings and ints). This happens a lot for ipywidgets. Should I push to this branch, or should I wait to get this in first? (I could open a new PR and later rebase against main.)
I'd say push here.
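To illustrate the 'static defaults' idea discussed above, here is a minimal sketch (the class names `TraitType`, `Int`, `Unicode` mirror traitlets' API, but this is a simplified stand-in, not the PR's actual implementation): for immutable default values, the per-trait defaults can be computed once per class and then cheaply copied into each new instance, instead of invoking default-generation machinery on every construction.

```python
# Hypothetical sketch of caching "static" (immutable) trait defaults.
# Not traitlets' real code -- a simplified illustration of the idea.
class TraitType:
    default_value = None

    def default(self):
        # Immutable defaults (ints, strings) are safe to share across instances.
        return self.default_value


class Int(TraitType):
    default_value = 0


class Unicode(TraitType):
    def __init__(self, default_value=""):
        self.default_value = default_value


# Precompute the static defaults once per class, not once per instance.
traits = {"count": Int(), "name": Unicode("anonymous")}
static_defaults = {name: t.default() for name, t in traits.items()}


def make_instance():
    # Each new "instance" starts from a cheap dict copy of the cached defaults.
    return dict(static_defaults)


obj = make_instance()
```

Dynamic defaults (lists, dicts, or values produced by `@default` methods) cannot be shared this way and would still need per-instance construction.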
I've been benchmarking this with pytest-benchmark using this script:

```python
from traitlets import HasTraits, Int, Unicode


class SingleTrait(HasTraits):
    foo = Int()


class ManyTrait(HasTraits):
    foo0 = Unicode("foo0").tag(sync=True)
    foo1 = Unicode("foo1").tag(sync=True)
    foo2 = Unicode("foo2").tag(sync=True)
    foo3 = Unicode("foo3").tag(sync=True)
    foo4 = Unicode("foo4").tag(sync=True)
    foo5 = Unicode("foo5").tag(sync=True)
    foo6 = Unicode("foo6").tag(sync=True)
    foo7 = Unicode("foo7").tag(sync=True)
    foo8 = Unicode("foo8").tag(sync=True)
    foo9 = Unicode("foo9").tag(sync=True)


def create(N, cls):
    for i in range(N):
        cls()


def create_and_access(N, cls):
    keys = cls.class_trait_names(sync=True)
    for i in range(N):
        obj = cls()
        for key in keys:
            getattr(obj, key)


def test_single_trait(benchmark):
    benchmark(create, cls=SingleTrait, N=1_000)


def test_many_trait(benchmark):
    benchmark(create, cls=ManyTrait, N=1_000)


def test_single_trait_and_access(benchmark):
    benchmark(create_and_access, cls=SingleTrait, N=1_000)


def test_many_trait_and_access(benchmark):
    benchmark(create_and_access, cls=ManyTrait, N=1_000)
```

Results (benchmark tables were attached for each revision):

- Main: a35dd0b
- Before the last commit: 166cef4
- ae8c6cd
Looking at the median, we see a >10x improvement; the last commit really helps, with a factor of […]
I noticed that ae8c6cd slowed down the other cases, because we added an unneeded membership check on a dict, so I fixed that. Now ae8c6cd only gives a speedup compared to 166cef4:
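The membership-check overhead mentioned above can be sketched as follows (an assumed pattern, not the exact traitlets code): a `key in d` test followed by `d[key]` hashes and probes the dict twice, while a single `.get()` with a sentinel does the same work in one pass.

```python
# Illustration of removing a redundant dict membership check.
# The sentinel lets us distinguish "missing" from a stored None value.
_SENTINEL = object()


def lookup_twice(d, key, default):
    if key in d:       # first hash + probe
        return d[key]  # second hash + probe
    return default


def lookup_once(d, key, default):
    value = d.get(key, _SENTINEL)  # single hash + probe
    return default if value is _SENTINEL else value
```

On a hot path that runs once per trait per instance, eliminating the duplicate lookup adds up quickly.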
Nice work!
Found another performance issue: we are calling […] when creating 1000 HasTraits with 30 traits that do not need an […].

Performance benchmark:

Before: […]
After: […]

This saves another ~10% for HasTraits with 10 traits. I've also tested a more extreme case with 100 traits, which gives a 25% performance increase, showing that this approach scales much better. Looking at creating a 100-trait object in viztracer, we see:

Before: […]
After: […]

So instead of […]
After also working on the HasTraits constructor, where we remove a lot of overhead from the […], we see another ~25% performance increase.
Force-pushed from f491db4 to 521876d (compare)
At this point I don't see any remaining overhead in constructing HasTraits that is obvious to remove without breaking things; I think this is as fast as I can push it. A possible future optimization might be removing magic methods, because that would save us checking with hasattr at runtime. Also, the […]
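The hasattr-at-runtime cost mentioned above can be avoided with a class-level cache, sketched here as an assumption about the technique rather than the PR's code (`_has_validate` and `_validate` are hypothetical names): check for the optional hook once when the class is defined, and have instances read a plain class attribute instead of calling `hasattr` on every construction.

```python
# Sketch: cache a hasattr() check at class-definition time via
# __init_subclass__, so instance construction only reads an attribute.
class Base:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # One-time check per class; instances reuse the cached result.
        cls._has_validate = callable(getattr(cls, "_validate", None))

    def __init__(self):
        if self._has_validate:  # cheap attribute read, no hasattr call
            self._validate()


class WithHook(Base):
    def _validate(self):
        self.validated = True


class WithoutHook(Base):
    pass


a = WithHook()    # runs _validate via the cached flag
b = WithoutHook() # skips the hook without any hasattr call
```

The trade-off is that hooks added to an instance (rather than the class) after definition would not be picked up, which is one reason such caching has to be applied carefully.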
Ha, I love seeing the PR title update. Thanks again for digging into this!
Force-pushed from 2b87079 to 589c1c1 (compare)
Per Maarten's ideas, I further dug into performance here. Some quick observations:

Overhead from the Bunch class

First, it's worth noting that the […]

This has the following results:

[…]

So, using the […]

Overhead comes from the __getattr__ function in Bunch

This overhead is clearly from the __getattr__ function, which is overridden to call __getitem__ on the underlying dict. You can get some minor performance improvements by setting […]. The other problem here is that user code with event observers likely expects KeyErrors to be turned into AttributeErrors, as currently happens via __getattr__. This performance PR shouldn't change the external interface users mess about with -- but explored below are ways to help minimize the impact of Bunch performance on traitlets internally.

Using the […]
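For context, a minimal stand-in for the Bunch pattern discussed above (a dict that also allows attribute access) looks roughly like this; this is a sketch of the general technique, and the real traitlets class may differ in detail:

```python
# Minimal dict-with-attribute-access class, in the style of traitlets' Bunch.
# Every attribute read that misses a real attribute goes through __getattr__,
# which adds an extra dispatch on top of the plain dict lookup.
class Bunch(dict):
    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError as e:
            # Convert the dict's KeyError into the AttributeError that
            # attribute-access callers expect.
            raise AttributeError(key) from e

    def __setattr__(self, key, value):
        self[key] = value


change = Bunch(name="value", old=0, new=1)
change.name     # routed through __getattr__ -> __getitem__
change["name"]  # direct __getitem__, skipping the extra dispatch
```

This makes the overhead location concrete: `change.name` pays for the failed instance/class attribute lookup plus the `__getattr__` call before the dict is ever consulted, while `change["name"]` goes straight to the dict.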
I'll open a PR today against this PR that makes these changes (about 6 characters changed in total); for traitlets that have their values set frequently, this gives a 20% performance benefit for effectively no change. Based on the above logs, there aren't any more super obvious optimization targets when it comes to set events + change observers. Are there any other common workflows with traitlets that would be worth profiling to see if there are easy wins?
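A quick way to see the kind of win described above is a micro-benchmark comparing attribute access against item access on a Bunch-style object (assuming the "6 characters" refers to switching internal reads like `change.name` to `change["name"]`; that interpretation is a guess, and the timing ratio varies by interpreter):

```python
import timeit

# Bunch-style dict with attribute access, as discussed in this thread.
class Bunch(dict):
    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError as e:
            raise AttributeError(key) from e


change = Bunch(name="x")

# Attribute access pays for the failed attribute lookup plus the
# __getattr__ call; item access goes straight to the dict.
attr_time = timeit.timeit(lambda: change.name, number=100_000)
item_time = timeit.timeit(lambda: change["name"], number=100_000)
print(f"attribute: {attr_time:.4f}s  item: {item_time:.4f}s")
```

On hot internal code paths this difference compounds, while leaving the public attribute-access interface untouched for user code.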
Not sure if this was already used, but it might be worth running perflint to see if there are any other micro-improvements to be made (balancing for readability, of course).
Force-pushed from 589c1c1 to 4324bbb (compare)
Good news: after the ipywidgets 8.0.3 release, those tests pass with this branch.
Force-pushed from 4324bbb to 1c85d41 (compare)
@blink1073 do you understand the jupyter_server issue?

Server 2.0.1 has a fix for that; I kicked the build to see if we pick it up.

Nope, it must be taking a while to propagate.
Nice work, thank you! 🥳
We are using Sphinx 4.5.0 with myst-nb 0.17.1 and are now getting […]
Thanks for letting us know. How can I trigger this error, so I can test it locally?
Regression, noticed in ipython#777 (comment)
Regression, noticed in #777 (comment)
This makes creating ipywidgets about 2x faster (ignoring comm construction).
Related to, but supersedes #639