Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build a list of dependent constants to recompute in each instance of a generic. #4110

Merged
merged 12 commits into from
Jul 8, 2024

Conversation

zygoloid
Copy link
Contributor

@zygoloid zygoloid commented Jul 4, 2024

For each generic, build a list of instructions describing the computations we need to do when resolving an instance of the generic: this is a list of the instance-specific constants and types that the generic uses. Another way of viewing this list is as a block of Carbon SemIR code that is evaluated in order to form an instance of the generic -- this is referenced in the code as the "eval block" for the generic.

For each instruction in the generic whose type or value is a symbolic constant, replace that type or constant value with a symbolic reference that says "to find the actual type or value, look at index N in the list of values for the generic instance".

For an instruction with a symbolic constant value, we can just add that instruction to our list. For an instruction with a symbolic constant type, however, we may not have a corresponding instruction computing the type within the generic and may need to build a new instruction, but will reuse one where possible. In the case where we build a new instruction, we use the existing substitution code to build the type within the eval block.

For now, this transformation is only done in the declaration region of the generic, not in the definition region. Also, we map back from the symbolic references to the underlying constant value in a few places where we will eventually need to do a lookup into a generic instance, in order to avoid regressing the tests.

When forming a `ConstantId` for a symbolic constant, add storage to
track the generic in which the constant was formed and the index within
that generic. These fields are not yet populated.
@zygoloid zygoloid requested review from jonmeow and removed request for geoffromer July 4, 2024 01:01
Copy link
Contributor

@jonmeow jonmeow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation looks good, but I spent a lot of time examining the substitute/rebuild flow changes due to the lambdas. The flow seems fine (although may be worth considering a class with virtual functions instead of lambda parameters?), but a pass on comments seems like it could be helpful.

// Returns whether two type IDs represent the same type. This includes the
// case where they might be in different generics and thus might have
// different ConstantIds, but are still symbolically equal.
auto EqualAcrossDeclarations(TypeId a, TypeId b) const -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I'd lean towards IsEqual... or AreEqual... here, similar to IsComplete (for verb phrasing)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AreEqual... seems to work quite nicely with the value store accessors: ...types().AreEqual....

Done, and also renamed the corresponding function on the constant_values store.

@@ -5,6 +5,7 @@
#ifndef CARBON_TOOLCHAIN_CHECK_SUBST_H_
#define CARBON_TOOLCHAIN_CHECK_SUBST_H_

#include "common/map.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be missing something, but why is this added? Maybe leftover while refactoring? Perhaps ironically to me, things like llvm::function_ref maybe should have an include.

Suggested change
#include "common/map.h"
#include "llvm/ADT/STLFunctionalExtras.h"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, left over from refactoring. Just removing the include since I'm also going to look at switching from lambdas to virtual functions.

Comment on lines +114 to +115
auto inst_id = SemIR::InstId(arg);
if (!inst_id.is_valid()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll just note, this looks kind of weird versus line 66 in the delta. I don't know if it's worth switching formats since this is consistent with line 122 below, or you could switch both. Or don't do anything, it's probably not going to look as odd later in code, versus when I'm just seeing the delta.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted. I think this does look less weird in the full context of the file.

Comment on lines 166 to 167
// Pops the operands of the specified instruction off the worklist and rebuilds
// the instruction with the updated operands.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds like it would unconditionally call rebuild_fn: should the documentation be adjusted?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 28 to 29
// returns false, the instruction will be decomposed and substitution will be
// performed recursively into its operands.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the comments seem to be describing implementation constraints. From reading the implementation, I think this is different: is this a constraint on the caller, or is it a constraint on the implementation? If it's a constraint on the caller, perhaps it should be on SubstInst instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to talk more about what the return value means and less about what SubstInst will do with it.

using SubstRebuildFn = llvm::function_ref<
auto(SemIR::InstId orig_inst_id, SemIR::Inst new_inst)->SemIR::InstId>;

// Performs substitution into `inst_id` and its operands recursively.
Copy link
Contributor

@jonmeow jonmeow Jul 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the flow should have some more comments here? From the comments:

subst_fn: "A function that performs any needed substitution into an instruction"
rebuild_fn: "A function that rebuilds an instruction after substitution."
SubstInst: "Performs substitution into inst_id and its operands recursively."

  • Is subst_fn called on all the operands of inst_id unconditionally [unconditional on direct, conditional on indirect], or only recursively [conditional on both direct and indirect]?
  • On line 237, "// No need to rebuild this instruction.", why do instructions that return false on subst_fn but have no operands not get rebuilt? (or, why are they allowed to return false when they don't need substitution?)
  • I've also commented on Rebuild, because the call to rebuild_fn looks conditional in a way I don't see explained here. [but note since that's going to be in the cpp file, may still be worth a note here]

Here, maybe something like:

Performs recursive substitution on instructions, starting with inst_id. Each instruction is passed to subst_fn:

  • On true, the instruction is a leaf.
  • On false, this recurses into the instruction's operands. After all operands are processed, then when [Rebuild condition], the instruction is passed to rebuild_fn.

But it may be worth looking at further than just that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a pass over the comments.

Comment on lines 110 to 111
for (size_t i = 0;
i != context.generic_region_stack().PeekDependentInsts().size(); ++i) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for (size_t i = 0;
i != context.generic_region_stack().PeekDependentInsts().size(); ++i) {
for (auto i : llvm::seq(context.generic_region_stack().PeekDependentInsts().size())) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, yeah, while the collection could get reallocated (eg, by import ref handling), the number of instructions at this level on the stack really shouldn't change. Made this change and also added a CHECK.

Copy link
Contributor Author

@zygoloid zygoloid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The flow seems fine (although may be worth considering a class with virtual functions instead of lambda parameters?), but a pass on comments seems like it could be helpful.

Switched to a class with virtual functions as suggested. I think that is an improvement. I didn't find a really satisfying name for the class, but I think SubstInstCallbacks is at least clear as to its purpose.

Comment on lines 110 to 111
for (size_t i = 0;
i != context.generic_region_stack().PeekDependentInsts().size(); ++i) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, yeah, while the collection could get reallocated (eg, by import ref handling), the number of instructions at this level on the stack really shouldn't change. Made this change and also added a CHECK.

Comment on lines +114 to +115
auto inst_id = SemIR::InstId(arg);
if (!inst_id.is_valid()) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted. I think this does look less weird in the full context of the file.

Comment on lines 28 to 29
// returns false, the instruction will be decomposed and substitution will be
// performed recursively into its operands.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to talk more about what the return value means and less about what SubstInst will do with it.

using SubstRebuildFn = llvm::function_ref<
auto(SemIR::InstId orig_inst_id, SemIR::Inst new_inst)->SemIR::InstId>;

// Performs substitution into `inst_id` and its operands recursively.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a pass over the comments.

@zygoloid zygoloid requested a review from jonmeow July 8, 2024 22:11
Copy link
Contributor

@jonmeow jonmeow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! The edits are helpful for me.

As noted in chat, in my head I'd been thinking a full class for SubstInst logic, not just the callbacks, but this still seems a little easier to grasp.

@zygoloid zygoloid added this pull request to the merge queue Jul 8, 2024
Merged via the queue into carbon-language:trunk with commit 7322a1e Jul 8, 2024
7 checks passed
@zygoloid zygoloid deleted the toolchain-generic-2 branch July 8, 2024 22:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants