Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Message.hashCode Consistent with Message.equals #941

Merged
merged 2 commits into from
Dec 22, 2015

Conversation

tlantz
Copy link
Contributor

@tlantz tlantz commented Jul 3, 2015

Fixes #940

This commit makes Message.hashCode and Message.equals computed off of the same fields.

@@ -28,6 +26,8 @@
import java.io.Serializable;
import java.util.List;

import static com.google.common.base.Preconditions.checkNotNull;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why'd you move this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inadvertent (IntelliJ defaults). I will fix it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in d20bf5b

@tavianator
Copy link
Contributor

Have you measured the performance effect of this?

@tlantz
Copy link
Contributor Author

tlantz commented Jul 6, 2015

@tavianator - yes, for the case I was describing above. Inserting single Message objects is no longer significant as the number of messages grows. I can post some repro code and results.

The actual implementation of Message.hashCode will be slower but I'm not sure that's a very helpful measure. Maybe I can provide the breakeven point in number of Messages when the slow insertion overtakes the time spent in individual hashCode calls?

Are there other specific benchmark cases you think it would make sense for me to measure? Happy to post results for those as well.

Will try to put something together by the weekend (I don't have much time for it during the work week).

@tlantz
Copy link
Contributor Author

tlantz commented Jul 12, 2015

I tested out two things related to performance. For the sake of being thorough I just timed the hashCode implementation and as expected the new one is slower. On my machine (1.3GHz Intel i5) 100 million calls of the current implementation is 237 millis. With the new implementation and a source list size of 10 (a bit arbitrary, but didn't seem unreasonable after logging in a real application) 100 million calls is 5320 millis. That's a large percentage increase but still, in nanosecond territory per call. So I'd argue that in any practical application it should not matter.

The second thing I did was time a simpler version of what was happening in the production codebase where I saw the original issue: each time a new Message was created with a unique Throwable it was merged with all prior Messages in an immutable set. I ran the test over a range (log scale) of set sizes. Somewhere between 128 and 256 Messages in the set is where it breaks even.

  MessageCount OldHashCodeMillis NewHashCodeMillis
1           32                 3                 5
2           64                 9                11
3          128                15                21
4          256                77                33
5          512               176               103
6         1024              1287               130
7         2048              7995               372

The red line below is the timings with the old implementation, the blue line with the new.

image

The timing code is pretty trivial: just a run to get rid of JIT overhead and then a loop to measure the hash code time or to build sets of a certain size..

These message counts are large for any normal use case, but going back to the original issue: when stuck in an infinite cycle of exceptions during dependency resolution in a production situation, this made understanding the current state of an application difficult since the application was spending the majority of its time inserting Message objects into a hash set.

Happy to look at specific cases where this change might be considered risky. Other than the runaway exception case I mentioned, I'm having trouble finding one where Message.hashCode would actually be a hot code path.

sameb added a commit that referenced this pull request Dec 22, 2015
…stency

Make Message.hashCode Consistent with Message.equals
@sameb sameb merged commit fb01e99 into google:master Dec 22, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants