Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve control flow decompilation (fixes #1133) #1176

Merged
merged 2 commits into from
Jun 13, 2018

Conversation

Chicken-Bones
Copy link
Contributor

@Chicken-Bones Chicken-Bones commented Jun 12, 2018

A solution to #1133 involving repeated inlining, and block exit merging with appropriate keyword prioritisation

Add a ControlFlowSimplification step after SplitVariables
Enable dead code removal in some unit tests
@dgrunwald
Copy link
Member

I haven't had time to look at the code yet; but I compared the decompiler output with a few test assemblies, and it's clear that this is a massive improvement. Thanks!

Total number of gotos in the roundtrip tests is down from 781 to 200.
I'll try to get this reviewed ASAP so that we can include this in the 3.2 release.

@dgrunwald dgrunwald self-assigned this Jun 12, 2018
@Chicken-Bones
Copy link
Contributor Author

Chicken-Bones commented Jun 12, 2018

That's great to hear. Sorry it took so long, I had to put it on hold for a week and a half as finals hit for uni.

I'm not sure what your release schedule is, but there's two more pieces coming, related to switch reconstruction, one of which I can have done in 2 days time.

You'll notice that part of the new ConditionDetection involves guessing Branch targets which are continue keywords, as those have different ordering priorities. I'll need to do something similar to switch detection, which means separating it from and moving it after LoopDetection. This would produce another BlockTransform phase, and I don't understand the system enough to know the performance or flow impacts yet. What would you suggest?

@dgrunwald
Copy link
Member

Oh right, switch blocks can have two exits (break+continue). Currently it's using the loop logic for finding a single exit point, so that can't work correctly.
But whatever the chosen exit point for switch is, it can always be reached via break;, so the ConditionDetection logic based on an exit point at the end of the block doesn't apply either.
But the transform ordering is tricky, since a single "continue-block" only exists after the big BlockTransform combined multiple blocks into a single block (consider a continue block involving ?. or array initializers...).
But I don't think continue; within switch within a for loop occur frequently enough that we should care about this case... (the more common continue; within switch within while/foreach loops should also be possible with the current LoopDetection-based design for switch)

Copy link
Member

@dgrunwald dgrunwald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall: I like the approach, and think it works great.
Thanks a lot for your work here! I'll merge the PR and add a commit of my own on top of it (with purely cosmetic changes I made during the review).

Most of the remaining gotos in the round trip tests are in some way related to switch. There seem to be some bugs in the logic that determines the switch body (particularly with nested switches); but there's also a lot of tricky cases where we over-aggressively detect a switch and cause problems for later pipeline stages. I have no idea on how to solve those (though we could improve things somewhat by fine-tuning the sparse-switch heuristic).
Some other gotos are caused by us not inlining generated value temporaries (keyword: STRUCT_SPLITTING_IMPROVED), which in turn makes it impossible to reconstruct the short-circuiting operators. Properly solving that will first require support for ref-like types...
I didn't see any where the goto would have been avoidable if ConditionDetection made better choices.

@@ -106,7 +106,8 @@ public void ShortCircuit([ValueSource("defaultOptions")] CSharpCompilerOptions c
public void ExceptionHandling([ValueSource("defaultOptions")] CSharpCompilerOptions cscOptions)
{
RunForLibrary(cscOptions: cscOptions, decompilerSettings: new DecompilerSettings {
NullPropagation = false
NullPropagation = false,
RemoveDeadCode = !cscOptions.HasFlag(CSharpCompilerOptions.UseRoslyn)
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RemoveDeadCode was introduced for F# (which emits much more dead code than C#).
It's not enabled by default, so it feels weird to have to use it in the C# tests.

As far as I can tell, it's only for while (true) loops compiled with legacy csc in debug mode?
I guess it's fine to special-case this for the test case... but maybe it would be better to have a transform that removes this specific kind of dead store independent of the RemoveDeadCode configuration option.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A transform would be better, I'll leave that one for you

return forIncrement;
} catch (InvalidOperationException) {
// multiple potential increment blocks. Can get this because we don't check that the while loop
// has a condition first, as we don't need to do too much of HighLevelLoopTransform's job.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetIncrementBlock should be changed to return null instead of throwing an exception.

Copy link
Contributor Author

@Chicken-Bones Chicken-Bones Jun 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally had it return null, but the code looked ugly in comparison, and wasn't necessary inside HighLevelLoopTransform because of preconditions. Avoiding exceptions as part of normal control flow is a design principle I'm aware of, so I'll favour that.

//save a copy
var trueInst = ifInst.TrueInst;

if (ifInst != block.Instructions.SecondToLastOrDefault()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving instructions from the current block into a nested block is potentially problematic, because the following block transforms apply only to the current block, any nested blocks are expected to be already completely transformed.
But I guess it's OK in this case because it's only moving instructions that were previously inlined from another block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, works out for us here.

if (exitInst is Branch branch
&& branch.TargetBlock.Parent == currentContainer
&& branch.TargetBlock.IncomingEdgeCount == 1
&& branch.TargetBlock.FinalInstruction is Nop) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After the old ConditionDetection was originally written, we strengthened the Block invariants to allow a non-Nop FinalInstruction only for the inline block types:

  • branch.TargetBlock must be directly within a container (via Branch invariant)
  • Any block in a container must have type ControlFlow (via BlockContainer invariant)
  • ControlFlow blocks can't have a FinalInstruction (via Block invariant)
    So these checks are redundant now. I'll modify the code to replace them with an assertion.

Although, this doesn't hold for most blocks being checked, since the ThenInst of a IfInstruction could theoretically be an expression represented by an inline block (a ? new C { ... } : ...).

int falseInstIndex = block.Instructions.IndexOf(ifInst) + 1;
AddExits(block, falseInstIndex, elseExits);

var commonExits = elseExits.Where(e1 => thenExits.Any(e2 => DetectExitPoints.CompatibleExitInstruction(e1, e2)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quadratic search. Could be optimized using a HashSet with a CompatibleExitInstructionComparer...
But I guess usually there won't be that many exits.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct, clearer code first, optimisations if obvious or profiling hotspot, generally not that many exits (especially since it's only looking for exits which could be moved to the block root via inversions.

@dgrunwald dgrunwald merged commit 9937302 into icsharpcode:master Jun 13, 2018
dgrunwald added a commit that referenced this pull request Jun 13, 2018
@Chicken-Bones
Copy link
Contributor Author

Thanks for this, switch detection, particularly over-aggressive UseILSwitch will be addressed in my next PR[s]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants