-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: Minimal patch to fix multiple match updates #1806
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot, Davis, much appreciated!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, LGTM.
To me it seems that with 8aeca84 in #1817 in addition to this PR, we should be fine, at least for the currently implemented tests.
We should probably think about situations where we get many-to-many relations in the join operations in our functions that are currently not covered by our tests.
Thanks! |
dm is designed to avoid many-to-many joins in "normal" operation. Only an inconsistent dm will give many-to-many join warnings. We could try and see what happens then, but it doesn't seem too important. |
@krlmlr that is kind of what i figured and was hoping for! that should align pretty nicely with the dplyr behavior then |
dm 1.0.5 - Progress bars for `dm_wrap_tbl()` and `dm_unwrap_tbl()` (#835, #1450). - Add cheat sheet as a vignette (#1653). - Suggest creating a function for your database `dm` object (#1827, #1828). - Add alternative text to author images for pkgdown website (#1804). - Compatibility with dev jsonlite (#1837). - Remove tidyverse dependency (#1798, #1834). - Minimal patch to fix multiple match updates (@DavisVaughan, #1806). - Adapt to rlang 1.1.0 changes (#1817). - Make sure `{dm}` passes "noSuggests" workflow (#1659).
We are preparing to release dplyr 1.1.1 and dm popped up in the revdep checks.
We realized that we didn't get the
multiple
argument quite right the first time around 😞 . It was too aggressive since it warned on both one-to-many and many-to-many joins. In 1.1.1 we've made two improvements:multiple
now defaults to"all"
. The options ofNULL
,"error"
, and"warning"
are deprecated.relationship
is a new argument to add known constraints onto the join procedure, such as"many-to-one"
, which replacesmultiple = "error"
.The default of
relationship
checks to see if there is amany-to-many
relationship between the keys ofx
andy
and will warn if one is present. This should be much rarer than what we checked for before, and targets the most dangerous case that we were trying to warn the user about.You can read all about
relationship
here tidyverse/dplyr#6753, along with the issues linked there.Unfortunately it does affect dm by breaking some tests. I've done the minimal amount of work to get the tests working again, but I think you may need to do a full review of
multiple = "all"
usage. I think most of the time you will just be able to remove it, as I doubt you do many many-to-many joins here. The annoying part is ensuring the tests pass on CRAN and dev dplyr, so maybe it is worth leaving the othermultiple = "all"
calls in there until dplyr 1.1.1 is out and you can depend on it.We plan to submit dplyr 1.1.1 in 2-3 weeks.
This should be compatible with both dev and CRAN dplyr. It would help us out if you could go ahead and send a patch version of your package to CRAN ahead of time! Thanks!