Clarify differences between count() and tally() #5349
And revert count() to 0.8.5 behaviour. Fixes #5163
@lionel- could you check my reasoning here please?
count() and tally() have always behaved differently because they use different default values (NULL vs missing): dc92c24
The historical behaviour is:
* tally() guesses the weight variable
* count() doesn't guess.
This behaviour was preserved when we switched to tidy eval. However it was inadvertently changed in 0.8.2 with #4408. Since that version (almost one year old) neither count() nor tally() try to guess the weighting column.
I think not guessing is the expected behaviour for most people. I also like that count and tally are consistent. Maybe we should sanction the 0.8.2 behaviour that the weight column is never guessed? In that case, we don't need the guess_wt() sentinel.
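To make the "never guess" behaviour concrete, here is a minimal sketch. It assumes a dplyr release in which neither function guesses the weight column (the behaviour proposed above); the data frame `df` and its columns `g` and `n` are hypothetical.

```r
library(dplyr)

# Hypothetical data: `n` holds pre-aggregated counts for each row.
df <- tibble(
  g = c("a", "a", "b"),
  n = c(1, 2, 5)
)

# No `wt` supplied: rows are counted and the existing `n` column is
# not used as a weight.
unweighted <- count(df, g)

# Explicit `wt = n`: the values of `n` are summed within each group.
weighted <- count(df, g, wt = n)

# tally() follows the same rule on already grouped data.
tallied <- df %>%
  group_by(g) %>%
  tally(wt = n)
```

With no guessing, anyone who wants weighted counts has to ask for them with `wt`, and `count()` and `tally()` stay consistent with each other.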
Ok, if we accidentally broke the autoguessing in 0.8.2, then let's remove it altogether.
@yutannihilation since you have an eye for detail, would you mind taking a look at this PR? Most importantly, is the reasoning in the NEWS clear, and does it make sense to you? Thanks!
Looks good! Just one thing to confirm, is it safe to keep using
Good point, probably worth keeping that check around just as a precaution.
LGTM, will be simpler to explain (including to self).
(#5324).
* `count()` and `tally()` no longer automatically weights by column `n` if
present (#5298). dplyr 1.0.0 introduced this behaviour because of Hadley's
faulty memory. Historically `tally()` automatically weighted and `count()`
😂
And revert count() to 0.8.5 behaviour. Fixes #5298