Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get backtrace function names from DWARF sections #3958

Merged

Conversation

ysbaddaden
Copy link
Contributor

@ysbaddaden ysbaddaden commented Jan 30, 2017

Parses the .debug_info section of executables to get the original function name instead of raw symbols (that sometimes can't be found on Linux) in backtraces. Function names (just like line numbers) will always be displayed, unless the program is compiled with --no-debug. For Example:

Before:

0x45a8fa: *Kls#callee1:Nil at /home/github/crystal/spec/std/data/backtrace_sample 3:10
0x432d4e: *callee3:Nil at /home/github/crystal/spec/std/data/backtrace_sample 15:3
0x42f03f: ??? at /home/github/crystal/spec/std/data/backtrace_sample 17:1
0x432c29: main at /home/julien/.cache/crystal/macro70621728.cr 12:15
0x7f2ab0c52f45: __libc_start_main at ??

After:

0x47e65a: callee1 at /home/github/crystal/spec/std/data/backtrace_sample 3:10
0x445dce: callee3 at /home/github/crystal/spec/std/data/backtrace_sample 15:3
0x43825d: __crystal_main at /home/github/crystal/spec/std/data/backtrace_sample 17:1
0x445ca9: main at /home/julien/.cache/crystal/macro85348592.cr 12:15
0x7f9e86ef4f45: __libc_start_main at ??

@0x1eef
Copy link

0x1eef commented Jan 30, 2017

nice 💯
is there a "before" example too? i guess no method names showed up before?

@RX14
Copy link
Contributor

RX14 commented Jan 30, 2017

Does this not have the class/type information like before?

@ysbaddaden
Copy link
Contributor Author

We only push the function name to LLVM. I was surprised at first, but then noticed that Ruby, for example, does exactly the same: only the function name is displayed in backtraces. Along with the file:line information, this has always been enough. I believe this avoids even more noise.

@RX14
Copy link
Contributor

RX14 commented Jan 30, 2017

I think it would be nice to include at least the type name in the backtrace. For example Foo::Bar#foo. I don't think that would get too cluttered.

@ysbaddaden
Copy link
Contributor Author

Not my opinio, but is is out of scope of this PR anyway. Here, I seek the information, I don't set it.

Copy link
Member

@bcardiff bcardiff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ysbaddaden Of course this is really great! I rebased locally on master and the class/return type is present in osx. I am not sure if on linux is present.

There is also a couple of minor comments.

Would you rebase/check them when you have time so we can merge this? Thanks!

@@ -167,10 +171,12 @@ struct CallStack

{% if flag?(:darwin) || flag?(:freebsd) || flag?(:linux) || flag?(:openbsd) %}
@@dwarf_line_numbers : Debug::DWARF::LineNumbers?
@@dwarf_function_names : Array(Tuple(LibC::SizeT, LibC::SizeT, String))?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Array({LibC::SizeT, LibC::SizeT, String})? would match the most used style.

code = DWARF.read_unsigned_leb128(@io)
attributes.clear

if abbrev = abbreviations[code - 1]? # abbreviations.find { |a| a.code == abbrev }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left over?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commented part is correct but slow, the actual is faster, but maybe wrong. This looks like working, but must be tested more.

A better note with explanation should be added.

output.should match(/#{sample} 15:3/) # callee3
output.should match(/#{sample} 17:1/) # ???
output.should match(/callee1 at #{sample} 3:10/)
output.should match(/callee3 at #{sample} 15:3/)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rebase this over master locally and I get that class name and type return information is present in OSX.

The following make things pass in osx:

    output.should match(/Kls#callee1:Nil at #{sample} 3:10/)
    output.should match(/callee3:Nil at #{sample} 15:3/)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the raw symbol, not the function name from the DWARF info; the hint is the :Nil return type, and the class type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I.e., it fails on macOS.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see.

Since in OSX the output of backtrace_sample changes before and after the use of dsymutil I though it was working on OSX.

So, Is this a WIP still?
Is the osx dwarf parsing possible?
Is there anything someone could help with to get that part done?

$ ./bin/crystal build --debug spec/std/data/backtrace_sample -o ./backtrace_sample

$ ./backtrace_sample 
0x104731675: *CallStack::unwind:Array(Pointer(Void)) at ??
0x104731611: *CallStack#initialize:Array(Pointer(Void)) at ??
0x1047315e8: *CallStack::new:CallStack at ??
0x104730765: *caller:Array(String) at ??
0x1047760e1: *Kls#callee1:Nil at ??
0x104730741: *callee3:Nil at ??
0x1047207d7: __crystal_main at ??
0x104730618: main at ??

$ dsymutil --flat ./backtrace_sample 

$ ./backtrace_sample 
0x109bfe0e1: *Kls#callee1:Nil at /.../src/spec/std/data/backtrace_sample 3:10
0x109bb8741: *callee3:Nil at /.../src/spec/std/data/backtrace_sample 15:3
0x109ba87d7: __crystal_main at /.../src/spec/std/data/backtrace_sample 17:1
0x109bb8618: main at /.../src/main.cr 12:15

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens is that the __debug_line section is correctly parsed (from the .dwarf file) then lines in the backtrace originating from exception.cr and callstack.cr files are skipped (they don't matter to the exception). Yet, the __debug_info isn't parsed correctly, and we fallback to the raw symbols (it should be callee1 and callee3 only.

We should be able to parse this section. macOS outputs DWARF v2 so maybe there is some noticeable differences from DWARF v4 as generated by Linux, FreeBSD or OpenBSD; maybe we must parse the __debug_aranges section instead?

I need access to a Mac to debug that. If someone is motivated enough to investigate, looking at the .dwarf Mach-O file, and checking what happens when I parse it. Or maybe send me both an executable and its associated dwarf file, so I can look deeper.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In osx the CallStack.decode_function_name(pc) is not used (ie returns nil).
That is because @@dwarf_function_names is empty.

In self.read_dwarf_sections the __debug_info exists but inside the Debug::DWARF::Info.new I think there is something odd as you pointed. Because the @io.tell < end_offset in Debug::DWARF::Info#each is false from the beginning.

info # => Debug::DWARF::Info(
           @abbreviations=nil,
           @address_size=8,
           @debug_abbrev_offset=0,
           @dwarf64=false,
           @io=
            #<File:/path/to/backtrace_sample.dwarf>,
           @offset=566555,
           @ref_offset=0,
           @unit_length=2657,
           @version=4)

{@io.tell, end_offset} # => {731730, 569212}  (this is before the while guard in #each)

@ysbaddaden if you can spot/suggest something I am willing to continue.

Otherwise here you can find the binary and dwarf I am using to test it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for testing and providing binaries! I'll try to read the sections and dwarf from them —I made a Mach-O and ELF reader that outputs data like readelf does. I should publish it somewhere :-)

@bcardiff bcardiff added this to the Next milestone Mar 17, 2017
@bcardiff bcardiff mentioned this pull request Mar 21, 2017
@ysbaddaden ysbaddaden force-pushed the core-backtrace-function-names branch from 4ba667e to 39c1620 Compare March 31, 2017 13:52
@asterite asterite modified the milestone: Next Apr 7, 2017
@ysbaddaden
Copy link
Contributor Author

I think I found the culprit, a sneaky seek bug. Thanks again for the binaries @bcardiff !

@bcardiff
Copy link
Member

@ysbaddaden this is great! Thanks!

Would you mind rebasing on master so the merge commit references this PR and we have shorter branches merges?

Feel free to merge it directly 👍

@Sija
Copy link
Contributor

Sija commented Apr 14, 2017

Is there anything we could do to preserve class names?
I'd say loosing that part handicaps backtrace call signatures quite a bit.

@ysbaddaden
Copy link
Contributor Author

This is out of scope of this PR (I extract the name, I don't set it). Furthermore other OO languages, such as C++, Ruby or Python, don't store/print the type name, only the function name, and that never bothered me —because file:line already provides enough.

@ysbaddaden ysbaddaden force-pushed the core-backtrace-function-names branch from 321f504 to b46a8c1 Compare April 14, 2017 12:14
@bcardiff bcardiff merged commit 7b43477 into crystal-lang:master Apr 20, 2017
@ysbaddaden ysbaddaden deleted the core-backtrace-function-names branch April 21, 2017 09:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants