Deduplicate bytecode dependencies used by both creation and deployed object #15178
Comments
As pointed out by @ekpyron, we do have some kind of deduplication at assembly level already: solidity/libevmasm/Assembly.cpp Lines 1053 to 1071 in 5da0f47
Before fixing this we should check why it doesn't kick in here. Does the generated bytecode end up being different?
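For readers unfamiliar with the idea, here is a minimal sketch of content-keyed deduplication, written in Python purely for illustration. It is not the code in `Assembly.cpp`, and the name `append_deduplicated` is hypothetical. It only shows the general technique: identical blobs are stored once and every reference points at the same offset.

```python
from typing import Dict, List, Tuple


def append_deduplicated(blobs: List[bytes]) -> Tuple[bytes, List[int]]:
    """Append blobs to one data section, storing identical blobs only once.

    Returns the data section and, for each input blob, the offset at which
    its content can be found. This is a toy model, not the compiler's logic.
    """
    data = bytearray()
    offsets: List[int] = []
    seen: Dict[bytes, int] = {}  # blob content -> offset of its first copy

    for blob in blobs:
        if blob not in seen:
            # First time we see this content: store it and remember where.
            seen[blob] = len(data)
            data += blob
        # Every reference, new or repeated, resolves to the stored copy.
        offsets.append(seen[blob])

    return bytes(data), offsets


if __name__ == "__main__":
    section, refs = append_deduplicated([b"\x01\x02", b"\x03", b"\x01\x02"])
    print(section.hex(), refs)
```

Note that a scheme like this only merges blobs that are byte-for-byte identical at the level where it runs, which is why the question of whether the generated bytecode differs matters.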
It may be that |
This issue has been marked as stale due to inactivity for the last 90 days.
The performance aspect of this has been solved by the introduction of optimized IR caching (#15267 / #15179). I checked the repro on 0.8.27 and now both objects actually come from cache. What remains is the bytecode duplication. Unfortunately, it looks like this is something that won't be possible to address on EOF. Looking at |
Abstract
When a contract deploys another contract (via `new`) or accesses its bytecode (via `.runtimeCode` or `.creationCode`), the compiler embeds that bytecode in the accessing contract. Depending on whether the contract is accessed at creation time or at runtime, its bytecode ends up as a subassembly of the creation or runtime assembly, respectively. However, when it is accessed at both times, it ends up being included in both places.
This happens in both the legacy and the IR pipeline.
Details
This behavior can be clearly seen in the IR codegen, which makes no attempt at deduplication:
solidity/libsolidity/codegen/ir/IRGenerator.cpp Line 184 in 8a97fa7
solidity/libsolidity/codegen/ir/IRGenerator.cpp Line 206 in 8a97fa7
Motivation
This duplication not only increases bytecode size but also requires the compiler to optimize the same code twice, which is especially problematic for the IR pipeline.
How to reproduce
solc test.sol --bin --optimize | fold --width 115
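As an alternative to inspecting the folded output by eye, the duplication can be checked with a small script. The sketch below is an illustration only: the helper name `count_runs` is hypothetical, and it assumes you pass it just the hex string of one contract's binary (without the header lines `solc` prints around it). It counts maximal runs of a repeated byte, so a long string literal embedded twice shows up as two runs.

```python
import re


def count_runs(bytecode_hex: str, byte_value: int = 0x44, min_length: int = 32) -> int:
    """Count maximal runs of `byte_value` that are at least `min_length` bytes long.

    Whitespace (for example line breaks added by `fold`) is ignored.
    """
    # Decode to bytes first so matches are aligned on byte boundaries.
    code = bytes.fromhex("".join(bytecode_hex.split()))
    pattern = re.compile(re.escape(bytes([byte_value])) + b"{%d,}" % min_length)
    return len(pattern.findall(code))


if __name__ == "__main__":
    # Synthetic input: two 40-byte runs of 0x44 separated by other bytes.
    sample = "60" + "44" * 40 + "00" + "44" * 40 + "00"
    print(count_runs(sample))
```

The default of `0x44` matches the byte discussed in this issue; the minimum run length of 32 is an arbitrary threshold chosen to skip short accidental repeats.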
The bytecode of `C` is clearly included inside the bytecode of `D` twice. The long string included in the source can be seen twice as two long sequences of `0x44`.
The effect is similar with `--via-ir`, though the string is not as easily visible. Still, it can be easily confirmed by removing `D`'s constructor: it cuts the size of `D`'s bytecode almost in half.
Possible solutions
Backwards Compatibility
I can't see any backwards compatibility issues here.