
Performance enhancements #557

Merged 2 commits into eliben:main on Apr 23, 2024
Conversation

@sevaa (Contributor) commented Apr 22, 2024

I made the following enhancements, in rough order of impact:

  • Rewrote the LEB128 parsers to be less idiomatic but more streamlined (see the sketch after this list)
  • Instantiated the handful of scalar parsers that are used in all the one-off struct_parse calls, and replaced all points of usage
  • Inlined some code on the hot path
  • Removed the unneeded stream_preserve for streams in auxiliary sections
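
For context, the kind of hand-rolled ULEB128 loop the first bullet refers to looks roughly like this (an illustrative sketch, not the exact code in this PR):

```python
def read_uleb128(stream):
    """Decode one unsigned LEB128 number from a file-like stream."""
    result = 0
    shift = 0
    while True:
        b = stream.read(1)[0]          # one byte as an int; IndexError at EOF
        result |= (b & 0x7F) << shift  # the low 7 bits carry the payload
        if b & 0x80 == 0:              # a clear high bit marks the last byte
            return result
        shift += 7
```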

Review comments on elftools/common/construct_utils.py and elftools/dwarf/die.py (resolved)
@eliben merged commit 75f0f98 into eliben:main on Apr 23, 2024. 4 checks passed.
@sevaa (Contributor, Author) commented Apr 26, 2024

@eliben

On the topic of performance, there is a disconnect between ELF proper parsing and DWARF parsing. ELF parsing, at least in the typical use case, happens against a file stream. DWARF parsing is all in memory (barring exotic scenarios where users monkeypatch or subclass ELFFile), and yet we still treat DWARF sections as streams rather than the in-memory byte arrays they really are. Instead of calling read() (and constructing bytes objects) all the time, the parser could work with byte values and slices. The workhorse function for parsing primitive datatypes, struct.unpack_from(), takes a buffer and an offset.
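
To illustrate with a hypothetical snippet (not actual pyelftools code): the stream style allocates a fresh bytes object on every primitive read, while the buffer style unpacks in place:

```python
import struct

# Stream style: every read() constructs a new bytes object before unpacking.
def read_u32_stream(stream):
    return struct.unpack('<I', stream.read(4))[0]

# Buffer style: struct.unpack_from() takes the buffer and an offset, no copy.
def read_u32_buffer(buf, offset):
    return struct.unpack_from('<I', buf, offset)[0]
```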

But that would pretty much mean abandoning construct for the DWARF part. As far as I understand, construct was meant to work the way protobuf does, against wire and file streams.

@eliben (Owner) commented Apr 29, 2024

Yes, I'd rather keep this capability. A more interesting direction would be to enable more incremental DWARF parsing, without slurping whole sections into memory.

@sevaa (Contributor, Author) commented Apr 29, 2024

Reliance on construct is not a capability per se; it's more of an implementation detail. The DWARF parser mostly spits out native Python structures: lists, namedtuples and such. Off the top of my head, the only Construct objects that we get in the public API are CU headers. And the interface of Construct, where x["a"] and x.a are equivalent, is hardly magical.
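
That equivalence is a few lines of plain Python (a hypothetical toy class, just to make the point, not a proposal for the actual API):

```python
class AttrDict(dict):
    """A dict whose keys can also be read as attributes, like Construct's containers."""
    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

x = AttrDict(a=1)
assert x["a"] == x.a == 1
```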

OBTW, construct technically has a parse-from-buffer method. Only it works by constructing a BytesIO around the buffer and then parsing the stream :) Overhead-wise, not a net win.

Anyway, were I to implement buffer-style parsing, I wouldn't be getting rid of construct altogether; compound datatypes can stay. I'd implement a buffer+position object that walks like a stream but doesn't quite quack like one, and I'd teach the primitive type parsers (there is just a handful) to recognize it. The compound parsers (Struct, Array) can stay as they are.
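
A rough sketch of what such a buffer+position object could look like (names and details are illustrative, not a design commitment):

```python
class BufferReader:
    """Walks like a stream (tell/seek/read) over an in-memory buffer.
    Primitive parsers that recognize it could unpack in place via .buf and .pos."""
    def __init__(self, buf, pos=0):
        self.buf = buf
        self.pos = pos

    def tell(self):
        return self.pos

    def seek(self, offset, whence=0):
        if whence == 0:    # absolute
            self.pos = offset
        elif whence == 1:  # relative to the current position
            self.pos += offset
        else:              # relative to the end
            self.pos = len(self.buf) + offset
        return self.pos

    def read(self, n=-1):
        end = len(self.buf) if n < 0 else min(self.pos + n, len(self.buf))
        data = self.buf[self.pos:end]
        self.pos = end
        return data
```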


I gave some thought to incremental parsing of DWARF. One big obstacle would be the transforms that DWARF sections undergo: two decompression hooks and relocations (also the phantom-bytes thing, but that's very much an edge case). Compressed streams are not seekable. Also, I'm assuming we are talking about support for extra-large binaries here; were we to implement a no-slurp mode, I'm afraid it would have to be accompanied by a no-cache mode. At least no CU/DIE cache; dropping the abbrev cache would be too costly.

But let's assume there are no transforms. I had three possible designs in mind:

  • Implement a "file window" stream class, one that wraps a proper file stream, maintains its own position/length, and seeks the underlying stream as needed (a rough sketch follows below)
  • Open the file several times so that there is a file stream per section (dup won't help; dupped handles share their current position)
  • Memory-map the file, construct BytesIO objects around sections, and trust the OS to do the right thing

Which one do you think sounds the least crazy?
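
For reference, the first option could look roughly like this (hypothetical sketch, untested):

```python
import io

class FileWindow:
    """Exposes the [start, start + size) slice of an underlying file stream
    as an independent, seekable, read-only stream-alike (read/seek/tell)."""
    def __init__(self, fileobj, start, size):
        self.fileobj = fileobj
        self.start = start
        self.size = size
        self.pos = 0

    def tell(self):
        return self.pos

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_SET:
            self.pos = offset
        elif whence == io.SEEK_CUR:
            self.pos += offset
        else:  # io.SEEK_END
            self.pos = self.size + offset
        return self.pos

    def read(self, n=-1):
        remaining = max(self.size - self.pos, 0)
        if n < 0 or n > remaining:
            n = remaining
        self.fileobj.seek(self.start + self.pos)
        data = self.fileobj.read(n)
        self.pos += len(data)
        return data
```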

@eliben (Owner) commented May 1, 2024

The mmap idea sounds intriguing

@sevaa (Contributor, Author) commented May 1, 2024

I've basically already done this in #481 :)
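
For flavor, the idea boils down to something like this (an illustration only; not necessarily how #481 implements it):

```python
import io
import mmap

# Hypothetical usage; the filename and the section offset/size stand in for real values.
f = open('binary.elf', 'rb')
mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

def section_stream(sh_offset, sh_size):
    # Slicing the mapping materializes only this section's bytes;
    # the rest of the file is never read into the process.
    return io.BytesIO(mapped[sh_offset:sh_offset + sh_size])
```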
