summarise() works for empty data frames #3274

Merged: krlmlr merged 4 commits into tidyverse:master from b-#3071-summarize-zero-columns on Feb 3, 2018


@krlmlr krlmlr commented Dec 31, 2017

Fixes #3071.

@krlmlr krlmlr requested a review from hadley December 31, 2017 01:46

@hadley hadley left a comment


It’d be useful to add a test for mean() too. In general, I’m not sure if the hybrid handlers have been written with an eye to the zero length case.


krlmlr commented Jan 4, 2018

Indeed, mean(na.rm = TRUE) doesn't handle empty vectors. Will review all handlers.
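
For illustration, the zero-length case could be handled roughly as below. This is only a sketch under stated assumptions, not dplyr's actual hybrid handler: `safe_mean` is a hypothetical name, and NaN stands in for R's `NA`. It mirrors base R, where `mean(numeric(0))` is `NaN`, and treats "everything removed by `na.rm`" the same as empty input.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper (not dplyr's real handler): mean of a numeric vector,
// with NaN standing in for R's NA.
double safe_mean(const std::vector<double>& x, bool na_rm) {
    double sum = 0.0;
    std::size_t n = 0;  // number of values actually included in the mean
    for (double v : x) {
        if (std::isnan(v)) {
            if (na_rm) continue;  // drop missing values
            // Without na_rm, a missing value makes the result missing.
            return std::numeric_limits<double>::quiet_NaN();
        }
        sum += v;
        ++n;
    }
    // Zero-length case: empty input, or every value was removed.
    // Return NaN explicitly instead of relying on a 0 / 0 division.
    if (n == 0) return std::numeric_limits<double>::quiet_NaN();
    return sum / static_cast<double>(n);
}
```

The explicit `n == 0` branch is the part a test for the empty data frame case would exercise.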


krlmlr commented Jan 6, 2018

I'm seeing an average 10% slowdown with this PR on the flights data. Note the different scales for has_na: TRUE and na_rm: FALSE, which are due to #3288.

I've started a writeup at http://rpubs.com/krlmlr/dplyr-spec-template. Would that be suitable as an internal dplyr vignette?

[Plot: benchmark results (image not included)]


krlmlr commented Jan 6, 2018

Update: A second measurement with times = 100 suggests performance is about equal:

## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.987559   0.011399  86.635  < 2e-16 ***
## typereal    -0.002408   0.009307  -0.259  0.79883    
## has_naTRUE   0.005208   0.009307   0.560  0.58271    
## funvar       0.010561   0.011399   0.927  0.36643    
## funsum       0.034494   0.011399   3.026  0.00726 ** 
## na_rmTRUE   -0.035563   0.009307  -3.821  0.00125 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


@krlmlr krlmlr force-pushed the b-#3071-summarize-zero-columns branch from 0d22c56 to 54b8191 Compare February 2, 2018 07:01
because this is brittle and often fails on Windows
@krlmlr krlmlr merged commit 4d6fe4c into tidyverse:master Feb 3, 2018
@krlmlr krlmlr deleted the b-#3071-summarize-zero-columns branch February 3, 2018 19:54

lock bot commented Aug 2, 2018

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

@lock lock bot locked and limited conversation to collaborators Aug 2, 2018