Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Colwise update #1853

Merged
merged 11 commits into from
May 31, 2016
Merged

Colwise update #1853

merged 11 commits into from
May 31, 2016

Conversation

lionel-
Copy link
Member

@lionel- lionel- commented May 24, 2016

Here you go @hadley

I'm not sure how to handle databases with the _if() variants. It seems only a very restricted set of predicates will be compatible and in any case they will generate a full table scan. This doesn't seem very useful, so these variants fail with an error message for now when applied to a lazy source.

Also I think the easiest way to solve the purrr problem is to restore the DataframeSubsetVisitors class as header only, and keep the dmap() functions as is (with a deprecation message pointing to dplyr). The semantics are slightly different so there is no easy way to provide a 100% compatible translation to the new dplyr colwise functions.

@lionel- lionel- force-pushed the colwise-update branch 2 times, most recently from a1fa560 to 5af25f9 Compare May 26, 2016 10:07
@@ -44,6 +44,9 @@
* Code related to starting and signalling clusters has been moved out to
[multidplyr](http://github.com/hadley/multidplyr).

* `summarise_each()` and `mutate_each()` are deprecated in favour of a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we not deprecate for now? (And instead deprecate in the next version)

Can be a character vector or an actual function
- Don't assign bare functions in calling env!
- Properly propagate env when coercing character vector
- Make it explicit that BaseEnv is used by default
@lionel-
Copy link
Member Author

lionel- commented May 27, 2016

All done

@@ -13,6 +13,7 @@
column names (#1513).

* `one_of()` tolerates unknown variables in `vars`, but warns (#1848, @jennybc).
column names (#1513).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops

@hadley
Copy link
Member

hadley commented May 27, 2016

@lionel- looks good (apart from the one small merge problem)

@jennybc I'd love your thoughts before I pull the trigger on this

@hadley hadley modified the milestone: 0.5 May 27, 2016
@codecov-io
Copy link

codecov-io commented May 28, 2016

Current coverage is 20.01%

Merging #1853 into master will increase coverage by 0.82%

@@             master      #1853   diff @@
==========================================
  Files           188        188          
  Lines          7345       7422    +77   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits           1409       1485    +76   
- Misses         5936       5937     +1   
  Partials          0          0          

Powered by Codecov. Last updated by c904113...a80e0d8

@jennybc
Copy link
Member

jennybc commented May 28, 2016

Yes it would be a pleasure to test drive. I had already noticed the lack of American versions but thankfully @lionel- has fixed that :)

@jennybc
Copy link
Member

jennybc commented May 29, 2016

These new functions will be very useful! Everything works well. I made a PR to @lionel- with some modest changes to the help. I put some other notes made while running things in a gist.

This is definitely related to tidyverse/purrr#152 and presumably closes several of the dplyr issues linked there. Now that I reread those, I'm still not sure if there's a dplyr solution to select based on a predicate, i.e. select_if()? Is purrr::keep() the best bet #497?

@lionel-
Copy link
Member Author

lionel- commented May 30, 2016

A select_if() variant would be consistent with the new colwise verbs (with all _if variants only compatible with local sources). Then should we also have a select_at() that would accept numeric/character vectors and vars() specifications? Could be useful for programming but will probably be confusing and redundant wrt both select() and select_().

@jennybc
Copy link
Member

jennybc commented May 30, 2016

select_if() seems to be part or all of what people are asking for in #1569, #1501, #1229, #497 = all the issues I alluded to above. So that would be handy.

The fact that select_at() has such potential to create confusion with select() itself gets at something I noted in the gist: Why do we have to use the usual select_helpers inside vars() with the new _at functions but not with select()?

@hadley
Copy link
Member

hadley commented May 30, 2016

Because there's three things you might want to supply to summarise_at(): the variables, the functions, and fixed arguments to the functions. There's only only thing to supply to select(): the variables.

@hadley hadley merged commit 7385813 into tidyverse:master May 31, 2016
@hadley
Copy link
Member

hadley commented May 31, 2016

Thanks!

@lock
Copy link

lock bot commented Jan 18, 2019

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

@lock lock bot locked and limited conversation to collaborators Jan 18, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants