-
Notifications
You must be signed in to change notification settings - Fork 66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement list_sizes()
#539
Conversation
Alternatively or additionally, it would be nice to have a C api for |
Could replace map_int(x, vec_size) by vec_sizes() from r-lib/vctrs#539
Not sure if it matters, but it looks like this would also work on a data frame and return the size of each column (all the same size)? |
I think it'd be better to export
That doesn't seem appropriate. In general this function seems ill-suited for vctrs at the moment because it's about |
Sure. I can live with |
Let's revisit once we have figured out |
From a discussion with @DavisVaughan, we could have |
@romainfrancois I'll take over this PR if that's alright with you |
@DavisVaughan sure. |
Pretty nice improvement library(purrr)
library(vctrs)
x <- rep_len(list(1:5 + 0L), 100)
bench::mark(
map_int(x, vec_size),
list_sizes(x)
)
#> # A tibble: 2 x 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 map_int(x, vec_size) 63.34µs 77.62µs 12802. 68.6KB 91.8
#> 2 list_sizes(x) 2.81µs 3.08µs 287798. 4.3KB 0
list_sizes(mtcars)
#> Error: `x` must be a list. Created on 2020-03-18 by the reprex package (v0.3.0) |
} | ||
|
||
// [[ register() ]] | ||
SEXP vctrs_list_sizes(SEXP x) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: Do we need a registered alias since they have the same signature?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tend to try and always keep a separate function prefixed with vctrs_
that gets exported, whether it is strictly required or not
vec_sizes()
is tovec_size()
whatlengths()
is tolength()
:The 😈 plan is to use this in a vctrs based implementation of things like
dplyr::summarise()
anddplyr::mutate()
e.g. evaluate the expression in each group and then assert that the sizes are good (all equal to 1 if summaries, equal to the group sizes on a mutate).The PR is wip, if we want that I'll add unit tests, etc ...