Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Variadics #2240

Open
wants to merge 75 commits into
base: trunk
Choose a base branch
from
Open

Conversation

geoffromer
Copy link
Contributor

@geoffromer geoffromer commented Sep 30, 2022

Proposes a set of core features for declaring and implementing generic variadic
functions.

A "pack expansion" is a syntactic unit beginning with ..., which is a kind of
compile-time loop over sequences called "packs". Packs are initialized and
referred to using "pack bindings", which are marked with the each keyword at
the point of declaration and the point of use.

The syntax and behavior of a pack expansion depends on its context, and in some
cases by a keyword following the ...:

  • In a tuple literal expression (such as a function call argument list), ...
    iteratively evaluates its operand expression, and treats the values as
    successive elements of the tuple.
  • ...and and ...or iteratively evaluate a boolean expression, combining
    the values using and and or, and ending the loop early if the underlying
    operator short-circuits.
  • In a statement context, ... iteratively executes a statement.
  • In a tuple literal pattern (such as a function parameter list), ...
    iteratively matches the elements of the scrutinee tuple. In conjunction with
    pack bindings, this enables functions to take an arbitrary number of
    arguments.

proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
Copy link
Contributor

@josh11b josh11b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made another pass without diving into the type checking bits, since those seem like something that could be handled in a separate proposal once the other issues are handled.

proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
Copy link
Contributor

@josh11b josh11b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! I really appreciate the examples you have included, they are quite helpful (and so I've asked for even more!).

docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
Copy link
Contributor

@josh11b josh11b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! I think this is ready for leads review.

Copy link
Contributor

@chandlerc chandlerc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all but the appendix (largely trusting Josh's review there) and generally pretty happy across the board. Left a bunch of comments, but mostly minor wording / presentation improvements.

Comment on lines 283 to 299
template <typename T>
requires std::totally_ordered<T> && std::copyable<T>
T Min(const T& t) {
return t;
}

template <typename T, typename... Params>
requires std::totally_ordered<T> && std::copyable<T> &&
(std::same_as<T, Params> && ...)
T Min(T first, Params... rest) {
T min_rest = Min(rest...);
if (min_rest < first) {
return min_rest;
} else {
return first;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I think it might be worth rewriting the C++ examples to use if constexpr to provide the base case without overloading as that seems a bit briefer and more representative, here and below. WDYT?

Here is what I think this one would look like:

Suggested change
template <typename T>
requires std::totally_ordered<T> && std::copyable<T>
T Min(const T& t) {
return t;
}
template <typename T, typename... Params>
requires std::totally_ordered<T> && std::copyable<T> &&
(std::same_as<T, Params> && ...)
T Min(T first, Params... rest) {
T min_rest = Min(rest...);
if (min_rest < first) {
return min_rest;
} else {
return first;
}
}
template <typename T, typename... Params>
requires std::totally_ordered<T> && std::copyable<T> &&
(std::same_as<T, Params> && ...)
T Min(T first, Params... rest) {
// Base case.
if constexpr (sizeof...(rest) == 0) {
return first;
} else {
T min_rest = Min(rest...);
if (min_rest < first) {
return min_rest;
} else {
return first;
}
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would put the // Base case. comment inside the first branch of the if.

I feel very mixed about this suggestion since the overloading approach was the way to do this in the day, but I guess C++ has moved and the example is documented as using C++20.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell this technique only works if the minimum arity is at least 1, so that the signature can do the destructuring while still supporting all usages. That makes it a great fit for Min, but awkward to apply to StrCat. I've done it (see below), but I had to rely on the fact that there's a separate implementation function (which is basically a coincidence), and the result still doesn't seem like a net improvement to me.

Copy link
Contributor

@josh11b josh11b Oct 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this if constexpr trick is an improvement for StrCat, where it creates two edge cases instead of one. @zygoloid had an alternate implementation that uses fold expressions to avoid quadratic compile time and the need for a helper function, but I don't know if that serves the narrative purpose you are looking for. Is this supposed to be a common way of writing this function, a way that shows common pitfalls, or a way that is as similar as possible to the Carbon approach?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had been aiming for these examples to be representative of how functions like this would typically be written. I think that means we shouldn't use the comma-fold trick, and probably not if constexpr either, although that's more debatable. However, it sounds like maybe Chandler is asking for the examples to represent the ideal versions of these functions; in that case, it might make sense to use the comma-fold trick in StrCat. In either case, I agree with you that StrCat should not use if constexpr, but I'll wait for @chandlerc to clarify what he'd like to see.

proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
Comment on lines 898 to 901
This approach has some major advantages: the keywords are more consistent with
`each` (and `expand` to some extent), substantially less visually noisy than
`...`, and they may also be more self-explanatory. However, it does have some
substantial drawbacks.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like we should mention & credit the Swift syntax explicitly here, which is similar but not exactly the same as this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. How's this?

docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Show resolved Hide resolved

<!-- tocstop -->

## Basics
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be useful to have a few sub-sections of this section to help the reader keep track of where things are... At the least, the point where this switches from pack expansion expressions / statements into pack patterns is a bit easy to miss I find.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, how's this?

Comment on lines +390 to +393
A pack literal can be _expanded_, which moves its parent AST node inside the
pack literal, so long as the parent node is not `...`. For example,
`... Optional(⟬each X, Y⟭)` is equivalent to
`... ⟬Optional(each X), Optional(Y)⟭`. Similarly, an arity coercion can be
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find "parent AST node" a strange concept to introduce here. This is a language design, and not really a compiler design, so it seems a bit circular to talk about AST nodes.

Maybe "moves its parent delimited syntactic construct inside the pack literal"? "deliminited syntactic construct" seems to be a reasonable description to cover the cases you care about here maybe?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I would read "delimited syntactic construct" to mean that there need to be delimiters, such as the parentheses in this example, but that's not the intent; the example could just as well be that ... 2 * ⟬each X, Y⟭ is equivalent to ... ⟬2 * each X, 2 * Y⟭. "Syntactic construct" might work, but it seems awfully vague.

Would "syntax tree node" work better for you?

It's possible that any way of expressing this is going to seem a little strange, because the language rule I'm describing here is a little strange. Even the "formal" version of this rule in the appendix (starting around line 704) winds up being pretty handwavy, because it can't be expressed as a reduction rule; it's really a rule for generating reduction rules.

proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
Comment on lines 283 to 299
template <typename T>
requires std::totally_ordered<T> && std::copyable<T>
T Min(const T& t) {
return t;
}

template <typename T, typename... Params>
requires std::totally_ordered<T> && std::copyable<T> &&
(std::same_as<T, Params> && ...)
T Min(T first, Params... rest) {
T min_rest = Min(rest...);
if (min_rest < first) {
return min_rest;
} else {
return first;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would put the // Base case. comment inside the first branch of the if.

I feel very mixed about this suggestion since the overloading approach was the way to do this in the day, but I guess C++ has moved and the example is documented as using C++20.

Co-authored-by: Chandler Carruth <[email protected]>
Co-authored-by: josh11b <[email protected]>
proposals/p2240.md Outdated Show resolved Hide resolved
proposals/p2240.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
### Overview

A "pack expansion" is a syntactic unit beginning with `...`, which is a kind of
compile-time loop over sequences called "packs". Packs are initialized and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For what it's worth: I don't find it is helping my understanding to refer to this as a compile-time loop, or to describe the operation of pack expansion as iterative. The outcome would be the same if we processed pack elements in any order or in parallel, and the implied sequencing here is adding unnecessary complexity for me, even though I'd expect a typical implementation to be iterative.

Buuut... maybe this is the right way to describe this for other audiences than me :) And just so long as people think of this as "there is a loop in the compiler", not "there is a loop in the generated SemIR", I don't think it's giving the wrong impression. So, I'm not asking for a change here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the tradeoffs of expressing this in procedural vs. functional terms are really complicated. Not only are some people more comfortable with one or the other, but different contexts work better with one or the other (e.g. pack expansion statements really lend themselves to a loop description, but the type system is much easier to express functionally).

I think the main reason I took the current approach is that the procedural description requires less up-front explanation, so it's a better fit for these introductory sections, but then I shift more to the functional description as we get more into the weeds, because it's much more suited to formal reasoning.

One thing I could potentially change here: I could say that pack expansions are run-time loops, but they typically must be unrolled during type checking (because of heterogeneity and because of the tuple usage), which causes them to behave like compile-time loops as well. I think that's more consistent with how I talk about them in the rest of the doc (e.g. it would resolve the issue you flagged with ...and and ...or below). What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think changing to talking about runtime loops would make me less happy, because of the cases you mentioned where additional handwaving is required, though perhaps that is more consistent with the exposition here. We can only really give a loop at runtime in the case where the pack is "sufficiently" homogeneous (in similar cases to when we could avoid monomorphization), and that seems to be enough of an edge case that it's probably misleading to center our description around it, even if we do want loop-like lowering for the sufficiently homogeneous cases.

docs/design/variadics.md Show resolved Hide resolved
Comment on lines 373 to 377
- `‖each X‖` refers to the deduced arity of the pack expansion that contains
the declaration of `each X`.
- `⟪E; N⟫` evaluates to `N` repetitions of `E`. This is called a _arity
coercion_, because it coerces the expression `E` to have arity `N`. `E` must
not contain any pack expansions, each-names, or pack literals (see below).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Unicode symbols in use here are working badly for me both in github's code review flow and in its rendered markdown -- both the symbol and the are rendering in two different ways in different contexts, and so it looks like there are four different sets of symbols in use here, not two. I don't know if this is a browser bug or not, but it's definitely making this harder to read and review.

Screenshot: note the much smaller in the paragraph starting "Combining the two", and that the definition of appears to be a single chevron whereas the bottom line of the screenshot is clearly a double chevron.
Screenshot 2024-09-29 at 13 22 37

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I see that too. It also breaks monospacing.

This could be a browser bug, or a bug in the font itself, or a combination of the two. The deeper I dig into this, the more baffling it gets. The size of the characters depends not only on spatial context, but temporal context: In a text box, I can get them to change size by editing the surrounding text, in ways that persist even when I undo those edits. Fortunately those changes don't persist past saving the contents of the text box.

I haven't been able to find any options for fixing this directly, given that we don't control the font or the CSS. It looks like we could avoid the problem with other glyphs, but our options aren't great, because most of the math glyphs seem to have this problem to some extent. The best option I've found so far is:

⸨f32, Optional(each T), ❰i32; ╎each y╎❱⸩

⟅⟆ also seem pretty well-behaved, and we might still be able to work with and if we don't care about exact monospacing. Conversely, seems to behave itself in monospace rendering, but it sometimes acts hinky in the proportional-width editing text box, so I don't entirely trust it.

But I don't know if we want to let our notation be dictated by the vagaries of font rendering bugs; for all we know, whatever we choose could be broken the next time GitHub changes their CSS. That suggests we should either fall back on some ASCII notation (my previous approach was to use {}<>|| and hope the theoretical ambiguities didn't come up too much in practice), or just live with these rendering glitches (and maybe file some bug reports).

What do you suggest?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me try this:

  • ‌‖‌each X‌‖‌ refers to the deduced arity of the pack expansion that contains
    the declaration of each X.
  • ‌⟪‌E; N‌⟫‌ evaluates to N repetitions of E. This is called a arity
    coercion
    , because it coerces the expression E to have arity N. E must
    not contain any pack expansions, each-names, or pack literals (see below).

Combining the two, the type of ... each y is ... ‌⟪‌i32; ‌‖‌each y‌‖‌‌⟫‌. Thus, the
type of z is (f32, ... Optional(each T), ... ‌⟪‌i32; ‌‖‌each y‌‖‌‌⟫‌).

Now, consider a modified version of that example:

fn F[... each T:! type]((... each x: Optional(each T)), (... each y: i32)) {
  let (.. each z: auto) = (0 as f32, ... each x, ... each y);
}

each z is a pack, but it has the same elements as the tuple z in our earlier
example, so we represent its type in the same way, as a sequence of segments:
‌⟬‌f32, Optional(each T), ‌⟪‌i32; ‌‖‌each y‌‖‌‌⟫‌‌⟭‌. The ‌⟬‌‌⟭‌ delimiters make this a
pack literal rather than a tuple literal. Notice one subtle difference: the
segments of a pack literal do not contain .... In effect, every segment of a
pack literal acts as a separate loop body. As with the tuple literal syntax, the
pack literal pseudo-syntax can also be used in patterns.

The shape of a pack literal is a tuple of the arities of its segments, so the
shape of ‌⟬‌f32, Optional(each T), ‌⟪‌i32; ‌‖‌each y‌‖‌‌⟫‌‌⟭‌ is
(1, ‌‖‌each T‌‖‌, ‌‖‌each y‌‖‌). Other expressions and patterns also have shapes. In
particular, the shape of an arity coercion ‌⟪‌E; A‌⟫‌ is (A), the shape of
each X is ‌‖‌each X‌‖‌, and the shape of an expression that does not contain

That seems to be working for me. What I did was to insert a U+200C (zero width non-joiner) on both sides of each of the fancy characters. Unfortunately that seems to give the worse rendering of the double-chevron. I wonder if using guillemets (U+AB, U+BB: « ») instead of the mathematical double-angle brackets would work better:

‌⟬‌f32, Optional(each T), ‌«i32; ‌‖‌each y‌‖‌‌»⟭‌. The ‌⟬‌‌⟭‌ delimiters make this a

That seems OK. Maybe there's a better character to use here, though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I avoided guillemets and similar characters because their small vertical size interferes with visual scanning, but maybe that's OK for shape coercions since they're generally pretty small. In any event, it looks like most of the glyphs I rejected for that reason also screw up the rendering of . ‹› also seem to work, but I like the visual consistency of all the variadic-related glyphs being doubled (ish).

I've converted the double-angles to guillemets. WDYT?

docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Show resolved Hide resolved
Comment on lines 355 to 375
```cpp
template <ConvertibleToString... Ts>
std::string StrCat(const Ts&... params) {
std::string result;
result.reserve((params.Length() + ... + 0));
StrCatImpl(&result, params...);
return result;
if constexpr (sizeof...(params) == 0) {
return "";
} else {
std::string result;
result.reserve((params.Length() + ... + 0));
StrCatImpl(&result, params...);
return result;
}
}

void StrCatImpl(std::string* out) { return; }

template <ConvertibleToString T, ConvertibleToString... Ts>
void StrCatImpl(std::string* out, const T& first, const Ts&... rest) {
out->append(first.ToString());
StrCatImpl(out, rest...);
if constexpr (sizeof...(rest) > 0) {
StrCatImpl(out, rest...);
}
}
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@josh11b asked me whether I'd write / prefer the form with if constexpr (with two termination conditions) or the form with the extra function overload. I said I'd actually write this:

Suggested change
```cpp
template <ConvertibleToString... Ts>
std::string StrCat(const Ts&... params) {
std::string result;
result.reserve((params.Length() + ... + 0));
StrCatImpl(&result, params...);
return result;
if constexpr (sizeof...(params) == 0) {
return "";
} else {
std::string result;
result.reserve((params.Length() + ... + 0));
StrCatImpl(&result, params...);
return result;
}
}
void StrCatImpl(std::string* out) { return; }
template <ConvertibleToString T, ConvertibleToString... Ts>
void StrCatImpl(std::string* out, const T& first, const Ts&... rest) {
out->append(first.ToString());
StrCatImpl(out, rest...);
if constexpr (sizeof...(rest) > 0) {
StrCatImpl(out, rest...);
}
}
```
```cpp
template <ConvertibleToString... Ts>
std::string StrCat(const Ts&... params) {
std::string result;
result.reserve((params.Length() + ... + 0));
(out->append(params.ToString()), ...);
return result;
}
```

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Me too, but I don't think that's representative of what a typical C++ programmer would do in this situation, so this gets back to the question of whether we want the C++ examples to reflect typical usage or expert usage (see also this comment thread).

docs/design/variadics.md Outdated Show resolved Hide resolved
docs/design/variadics.md Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
long term Issues expected to take over 90 days to resolve. proposal rfc Proposal with request-for-comment sent out proposal A proposal
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants