-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal: Variadics #2240
base: trunk
Are you sure you want to change the base?
Proposal: Variadics #2240
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made another pass without diving into the type checking bits, since those seem like something that could be handled in a separate proposal once the other issues are handled.
Co-authored-by: josh11b <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good! I really appreciate the examples you have included, they are quite helpful (and so I've asked for even more!).
Co-authored-by: josh11b <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good! I think this is ready for leads review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed all but the appendix (largely trusting Josh's review there) and generally pretty happy across the board. Left a bunch of comments, but mostly minor wording / presentation improvements.
proposals/p2240.md
Outdated
template <typename T> | ||
requires std::totally_ordered<T> && std::copyable<T> | ||
T Min(const T& t) { | ||
return t; | ||
} | ||
|
||
template <typename T, typename... Params> | ||
requires std::totally_ordered<T> && std::copyable<T> && | ||
(std::same_as<T, Params> && ...) | ||
T Min(T first, Params... rest) { | ||
T min_rest = Min(rest...); | ||
if (min_rest < first) { | ||
return min_rest; | ||
} else { | ||
return first; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I think it might be worth rewriting the C++ examples to use if constexpr
to provide the base case without overloading as that seems a bit briefer and more representative, here and below. WDYT?
Here is what I think this one would look like:
template <typename T> | |
requires std::totally_ordered<T> && std::copyable<T> | |
T Min(const T& t) { | |
return t; | |
} | |
template <typename T, typename... Params> | |
requires std::totally_ordered<T> && std::copyable<T> && | |
(std::same_as<T, Params> && ...) | |
T Min(T first, Params... rest) { | |
T min_rest = Min(rest...); | |
if (min_rest < first) { | |
return min_rest; | |
} else { | |
return first; | |
} | |
} | |
template <typename T, typename... Params> | |
requires std::totally_ordered<T> && std::copyable<T> && | |
(std::same_as<T, Params> && ...) | |
T Min(T first, Params... rest) { | |
// Base case. | |
if constexpr (sizeof...(rest) == 0) { | |
return first; | |
} else { | |
T min_rest = Min(rest...); | |
if (min_rest < first) { | |
return min_rest; | |
} else { | |
return first; | |
} | |
} | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would put the // Base case.
comment inside the first branch of the if
.
I feel very mixed about this suggestion since the overloading approach was the way to do this in the day, but I guess C++ has moved and the example is documented as using C++20.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can tell this technique only works if the minimum arity is at least 1, so that the signature can do the destructuring while still supporting all usages. That makes it a great fit for Min
, but awkward to apply to StrCat
. I've done it (see below), but I had to rely on the fact that there's a separate implementation function (which is basically a coincidence), and the result still doesn't seem like a net improvement to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this if constexpr
trick is an improvement for StrCat
, where it creates two edge cases instead of one. @zygoloid had an alternate implementation that uses fold expressions to avoid quadratic compile time and the need for a helper function, but I don't know if that serves the narrative purpose you are looking for. Is this supposed to be a common way of writing this function, a way that shows common pitfalls, or a way that is as similar as possible to the Carbon approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had been aiming for these examples to be representative of how functions like this would typically be written. I think that means we shouldn't use the comma-fold trick, and probably not if constexpr
either, although that's more debatable. However, it sounds like maybe Chandler is asking for the examples to represent the ideal versions of these functions; in that case, it might make sense to use the comma-fold trick in StrCat
. In either case, I agree with you that StrCat
should not use if constexpr
, but I'll wait for @chandlerc to clarify what he'd like to see.
proposals/p2240.md
Outdated
This approach has some major advantages: the keywords are more consistent with | ||
`each` (and `expand` to some extent), substantially less visually noisy than | ||
`...`, and they may also be more self-explanatory. However, it does have some | ||
substantial drawbacks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like we should mention & credit the Swift syntax explicitly here, which is similar but not exactly the same as this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. How's this?
|
||
<!-- tocstop --> | ||
|
||
## Basics |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be useful to have a few sub-sections of this section to help the reader keep track of where things are... At the least, the point where this switches from pack expansion expressions / statements into pack patterns is a bit easy to miss I find.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, how's this?
A pack literal can be _expanded_, which moves its parent AST node inside the | ||
pack literal, so long as the parent node is not `...`. For example, | ||
`... Optional(⟬each X, Y⟭)` is equivalent to | ||
`... ⟬Optional(each X), Optional(Y)⟭`. Similarly, an arity coercion can be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find "parent AST node" a strange concept to introduce here. This is a language design, and not really a compiler design, so it seems a bit circular to talk about AST nodes.
Maybe "moves its parent delimited syntactic construct inside the pack literal"? "deliminited syntactic construct" seems to be a reasonable description to cover the cases you care about here maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. I would read "delimited syntactic construct" to mean that there need to be delimiters, such as the parentheses in this example, but that's not the intent; the example could just as well be that ... 2 * ⟬each X, Y⟭
is equivalent to ... ⟬2 * each X, 2 * Y⟭
. "Syntactic construct" might work, but it seems awfully vague.
Would "syntax tree node" work better for you?
It's possible that any way of expressing this is going to seem a little strange, because the language rule I'm describing here is a little strange. Even the "formal" version of this rule in the appendix (starting around line 704) winds up being pretty handwavy, because it can't be expressed as a reduction rule; it's really a rule for generating reduction rules.
proposals/p2240.md
Outdated
template <typename T> | ||
requires std::totally_ordered<T> && std::copyable<T> | ||
T Min(const T& t) { | ||
return t; | ||
} | ||
|
||
template <typename T, typename... Params> | ||
requires std::totally_ordered<T> && std::copyable<T> && | ||
(std::same_as<T, Params> && ...) | ||
T Min(T first, Params... rest) { | ||
T min_rest = Min(rest...); | ||
if (min_rest < first) { | ||
return min_rest; | ||
} else { | ||
return first; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would put the // Base case.
comment inside the first branch of the if
.
I feel very mixed about this suggestion since the overloading approach was the way to do this in the day, but I guess C++ has moved and the example is documented as using C++20.
Co-authored-by: Chandler Carruth <[email protected]> Co-authored-by: josh11b <[email protected]>
Co-authored-by: Carbon Infra Bot <[email protected]>
Co-authored-by: josh11b <[email protected]>
### Overview | ||
|
||
A "pack expansion" is a syntactic unit beginning with `...`, which is a kind of | ||
compile-time loop over sequences called "packs". Packs are initialized and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For what it's worth: I don't find it is helping my understanding to refer to this as a compile-time loop, or to describe the operation of pack expansion as iterative. The outcome would be the same if we processed pack elements in any order or in parallel, and the implied sequencing here is adding unnecessary complexity for me, even though I'd expect a typical implementation to be iterative.
Buuut... maybe this is the right way to describe this for other audiences than me :) And just so long as people think of this as "there is a loop in the compiler", not "there is a loop in the generated SemIR", I don't think it's giving the wrong impression. So, I'm not asking for a change here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the tradeoffs of expressing this in procedural vs. functional terms are really complicated. Not only are some people more comfortable with one or the other, but different contexts work better with one or the other (e.g. pack expansion statements really lend themselves to a loop description, but the type system is much easier to express functionally).
I think the main reason I took the current approach is that the procedural description requires less up-front explanation, so it's a better fit for these introductory sections, but then I shift more to the functional description as we get more into the weeds, because it's much more suited to formal reasoning.
One thing I could potentially change here: I could say that pack expansions are run-time loops, but they typically must be unrolled during type checking (because of heterogeneity and because of the tuple usage), which causes them to behave like compile-time loops as well. I think that's more consistent with how I talk about them in the rest of the doc (e.g. it would resolve the issue you flagged with ...and
and ...or
below). What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think changing to talking about runtime loops would make me less happy, because of the cases you mentioned where additional handwaving is required, though perhaps that is more consistent with the exposition here. We can only really give a loop at runtime in the case where the pack is "sufficiently" homogeneous (in similar cases to when we could avoid monomorphization), and that seems to be enough of an edge case that it's probably misleading to center our description around it, even if we do want loop-like lowering for the sufficiently homogeneous cases.
docs/design/variadics.md
Outdated
- `‖each X‖` refers to the deduced arity of the pack expansion that contains | ||
the declaration of `each X`. | ||
- `⟪E; N⟫` evaluates to `N` repetitions of `E`. This is called a _arity | ||
coercion_, because it coerces the expression `E` to have arity `N`. `E` must | ||
not contain any pack expansions, each-names, or pack literals (see below). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Unicode symbols in use here are working badly for me both in github's code review flow and in its rendered markdown -- both the ‖
symbol and the ⟪
are rendering in two different ways in different contexts, and so it looks like there are four different sets of symbols in use here, not two. I don't know if this is a browser bug or not, but it's definitely making this harder to read and review.
Screenshot: note the much smaller ‖
in the paragraph starting "Combining the two", and that the definition of ⟪
appears to be a single chevron whereas the bottom line of the screenshot is clearly a double chevron.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I see that too. It also breaks monospacing.
This could be a browser bug, or a bug in the font itself, or a combination of the two. The deeper I dig into this, the more baffling it gets. The size of the characters depends not only on spatial context, but temporal context: In a text box, I can get them to change size by editing the surrounding text, in ways that persist even when I undo those edits. Fortunately those changes don't persist past saving the contents of the text box.
I haven't been able to find any options for fixing this directly, given that we don't control the font or the CSS. It looks like we could avoid the problem with other glyphs, but our options aren't great, because most of the math glyphs seem to have this problem to some extent. The best option I've found so far is:
⸨f32, Optional(each T), ❰i32; ╎each y╎❱⸩
⟅⟆
also seem pretty well-behaved, and we might still be able to work with ‖
and ⟬
if we don't care about exact monospacing. Conversely, ╎
seems to behave itself in monospace rendering, but it sometimes acts hinky in the proportional-width editing text box, so I don't entirely trust it.
But I don't know if we want to let our notation be dictated by the vagaries of font rendering bugs; for all we know, whatever we choose could be broken the next time GitHub changes their CSS. That suggests we should either fall back on some ASCII notation (my previous approach was to use {}<>||
and hope the theoretical ambiguities didn't come up too much in practice), or just live with these rendering glitches (and maybe file some bug reports).
What do you suggest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me try this:
‖each X‖
refers to the deduced arity of the pack expansion that contains
the declaration ofeach X
.⟪E; N⟫
evaluates toN
repetitions ofE
. This is called a arity
coercion, because it coerces the expressionE
to have arityN
.E
must
not contain any pack expansions, each-names, or pack literals (see below).Combining the two, the type of
... each y
is... ⟪i32; ‖each y‖⟫
. Thus, the
type ofz
is(f32, ... Optional(each T), ... ⟪i32; ‖each y‖⟫)
.Now, consider a modified version of that example:
fn F[... each T:! type]((... each x: Optional(each T)), (... each y: i32)) { let (.. each z: auto) = (0 as f32, ... each x, ... each y); }
each z
is a pack, but it has the same elements as the tuplez
in our earlier
example, so we represent its type in the same way, as a sequence of segments:
⟬f32, Optional(each T), ⟪i32; ‖each y‖⟫⟭
. The⟬⟭
delimiters make this a
pack literal rather than a tuple literal. Notice one subtle difference: the
segments of a pack literal do not contain...
. In effect, every segment of a
pack literal acts as a separate loop body. As with the tuple literal syntax, the
pack literal pseudo-syntax can also be used in patterns.The shape of a pack literal is a tuple of the arities of its segments, so the
shape of⟬f32, Optional(each T), ⟪i32; ‖each y‖⟫⟭
is
(1, ‖each T‖, ‖each y‖)
. Other expressions and patterns also have shapes. In
particular, the shape of an arity coercion⟪E; A⟫
is(A)
, the shape of
each X
is‖each X‖
, and the shape of an expression that does not contain
That seems to be working for me. What I did was to insert a U+200C (zero width non-joiner) on both sides of each of the fancy characters. Unfortunately that seems to give the worse rendering of the double-chevron. I wonder if using guillemets (U+AB, U+BB: « ») instead of the mathematical double-angle brackets would work better:
⟬f32, Optional(each T), «i32; ‖each y‖»⟭
. The⟬⟭
delimiters make this a
That seems OK. Maybe there's a better character to use here, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I avoided guillemets and similar characters because their small vertical size interferes with visual scanning, but maybe that's OK for shape coercions since they're generally pretty small. In any event, it looks like most of the glyphs I rejected for that reason also screw up the rendering of ‖
. ‹›
also seem to work, but I like the visual consistency of all the variadic-related glyphs being doubled (ish).
I've converted the double-angles to guillemets. WDYT?
proposals/p2240.md
Outdated
```cpp | ||
template <ConvertibleToString... Ts> | ||
std::string StrCat(const Ts&... params) { | ||
std::string result; | ||
result.reserve((params.Length() + ... + 0)); | ||
StrCatImpl(&result, params...); | ||
return result; | ||
if constexpr (sizeof...(params) == 0) { | ||
return ""; | ||
} else { | ||
std::string result; | ||
result.reserve((params.Length() + ... + 0)); | ||
StrCatImpl(&result, params...); | ||
return result; | ||
} | ||
} | ||
|
||
void StrCatImpl(std::string* out) { return; } | ||
|
||
template <ConvertibleToString T, ConvertibleToString... Ts> | ||
void StrCatImpl(std::string* out, const T& first, const Ts&... rest) { | ||
out->append(first.ToString()); | ||
StrCatImpl(out, rest...); | ||
if constexpr (sizeof...(rest) > 0) { | ||
StrCatImpl(out, rest...); | ||
} | ||
} | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@josh11b asked me whether I'd write / prefer the form with if constexpr
(with two termination conditions) or the form with the extra function overload. I said I'd actually write this:
```cpp | |
template <ConvertibleToString... Ts> | |
std::string StrCat(const Ts&... params) { | |
std::string result; | |
result.reserve((params.Length() + ... + 0)); | |
StrCatImpl(&result, params...); | |
return result; | |
if constexpr (sizeof...(params) == 0) { | |
return ""; | |
} else { | |
std::string result; | |
result.reserve((params.Length() + ... + 0)); | |
StrCatImpl(&result, params...); | |
return result; | |
} | |
} | |
void StrCatImpl(std::string* out) { return; } | |
template <ConvertibleToString T, ConvertibleToString... Ts> | |
void StrCatImpl(std::string* out, const T& first, const Ts&... rest) { | |
out->append(first.ToString()); | |
StrCatImpl(out, rest...); | |
if constexpr (sizeof...(rest) > 0) { | |
StrCatImpl(out, rest...); | |
} | |
} | |
``` | |
```cpp | |
template <ConvertibleToString... Ts> | |
std::string StrCat(const Ts&... params) { | |
std::string result; | |
result.reserve((params.Length() + ... + 0)); | |
(out->append(params.ToString()), ...); | |
return result; | |
} | |
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Me too, but I don't think that's representative of what a typical C++ programmer would do in this situation, so this gets back to the question of whether we want the C++ examples to reflect typical usage or expert usage (see also this comment thread).
Co-authored-by: Richard Smith <[email protected]>
Proposes a set of core features for declaring and implementing generic variadic
functions.
A "pack expansion" is a syntactic unit beginning with
...
, which is a kind ofcompile-time loop over sequences called "packs". Packs are initialized and
referred to using "pack bindings", which are marked with the
each
keyword atthe point of declaration and the point of use.
The syntax and behavior of a pack expansion depends on its context, and in some
cases by a keyword following the
...
:...
iteratively evaluates its operand expression, and treats the values as
successive elements of the tuple.
...and
and...or
iteratively evaluate a boolean expression, combiningthe values using
and
andor
, and ending the loop early if the underlyingoperator short-circuits.
...
iteratively executes a statement....
iteratively matches the elements of the scrutinee tuple. In conjunction with
pack bindings, this enables functions to take an arbitrary number of
arguments.