
RFC: Migration Path For EKS Developer Preview #245

Closed
Wants to merge 11 commits

Conversation

@iliapolo (Contributor) commented Aug 26, 2020


By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache-2.0 license

@iliapolo changed the title from "RFC: Migration Path For Intrusive Changes" to "RFC: Migration Path For EKS Developer Preview" on Aug 26, 2020
@iliapolo requested a review from a team on August 26, 2020 16:48
@eladb previously requested changes Aug 26, 2020
text/0244-eks-developer-preview-migration-path.md (outdated)

The user will then choose to either change imports to point to the `aws-eks-experimental` module and continue working without interruption, or add the necessary feature flag to `cdk.json` and continue with the new module, which will replace the cluster.
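For illustration, a minimal sketch of what the two options might look like in practice. The experimental module name follows the text above (the scoped package name is assumed); the feature-flag key in `cdk.json` is hypothetical, since it isn't fixed here:

```ts
// Option 1: keep existing clusters untouched by pointing imports at the
// proposed experimental module (scoped package name is an assumption).
import * as eksExperimental from '@aws-cdk/aws-eks-experimental';

// Option 2: stay on the main module and opt in to the new, cluster-replacing
// behavior through a feature flag in cdk.json. The flag key below is
// hypothetical and only illustrates the mechanism:
//
//   {
//     "context": {
//       "@aws-cdk/aws-eks:newClusterConfiguration": true
//     }
//   }
import * as eks from '@aws-cdk/aws-eks';
```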

# Future Possibilities
Contributor

One thing that's worth talking about is how we make sure that what we actually release as developer preview works. There might be a risk in accumulating many changes on the eks-dev-preview branch, releasing all of them together as "developer preview", and then realizing something is horribly broken and having to release a breaking change to fix it.

Contributor Author (@iliapolo)

Following our talk today, we decided to go with releasing an `@aws-cdk/aws-eks-next` module that will incrementally introduce those changes. This will give us the option to use the module exactly as users would and validate it. We can also expose this module to a select few, or to those who choose to use it, with the understanding that it may introduce these types of changes frequently.
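As a rough sketch of what early adoption might look like, assuming the `-next` module mirrors the existing `aws-eks` construct API (the exact surface isn't spelled out here):

```ts
import * as cdk from '@aws-cdk/core';
// Opting in is an import change; '@aws-cdk/aws-eks-next' is the module name
// mentioned above, and its API is assumed to match the stable aws-eks module.
import * as eks from '@aws-cdk/aws-eks-next';

const app = new cdk.App();
const stack = new cdk.Stack(app, 'EksNextStack');

// Assumed to behave like eks.Cluster from the stable module, with the caveat
// that breaking changes (including cluster replacement) may land frequently.
new eks.Cluster(stack, 'Cluster', {
  version: eks.KubernetesVersion.V1_17,
});
```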

Contributor

Does that decision for `aws-eks-next` effectively invalidate the rest of this PR? Is this equivalent to the "Rename new module" alternative, except that we're not (yet) declaring the new module dev preview?

Contributor Author (@iliapolo)

Pretty much

@mergify (bot) dismissed eladb's stale review August 26, 2020 18:32

Pull request has been modified.

This approach feels more fragile and doesn't provide much value since the user would still incur a breaking change.
It would just be in the form of a class name change, instead of a module name change.

## Rename new module
Contributor

Alternatively, we could put the new developer preview module in a separate repository and vend it as an independent package. When we go to GA with CDK 2, the old module could be replaced entirely. The CLI warning is probably still a good idea. This way, anyone who does not want to replace their clusters stays on 1.x until they are ready.

Contributor Author (@iliapolo)

What is the benefit of putting it in a different repo as opposed to just using a different name in the current repo? Also, are you suggesting the EKS library would be vended separately from the CDK until we release v2? Under what name? It feels wrong from a user's perspective.

We have actually already done this once with `eks.LegacyCluster`.

The benefit here is that users won't need to change dependencies. However, this change would also require a name change and
duplication of any class that interacts with `eks.Cluster`, since those classes may not work with both `eks.Cluster` and `eks.ExperimentalCluster`.

Contributor

This should have been an `ICluster`. Is it still true in that case?

Contributor Author (@iliapolo)

Unfortunately, some constructs currently cannot accept an `ICluster`. If they could, you might be right.
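A small sketch of the distinction, using hypothetical prop interfaces: props typed against the concrete `eks.Cluster` tie a construct to that one class, while props typed against `eks.ICluster` would work with either implementation and avoid the duplication mentioned above:

```ts
import * as eks from '@aws-cdk/aws-eks';

// Tied to the concrete class: a second copy of the construct would be needed
// to support a hypothetical eks.ExperimentalCluster.
interface ConcreteClusterProps {
  readonly cluster: eks.Cluster;
}

// Typed against the interface: any implementation of ICluster is accepted,
// so no duplication is needed.
interface PortableClusterProps {
  readonly cluster: eks.ICluster;
}
```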


# Motivation

As part of transitioning the `aws-eks` module to *Developer Preview*, we need to introduce a few breaking changes that require
Contributor

I'm pretty interested in the specifics of this case. Breaking changes and replacement don't have to coincide. Can you go a little into the changes that prompted this RFC? Have we already done the work to see if we can avoid this process completely?

Contributor Author (@iliapolo)

> Breaking changes and replacement don't have to coincide

You're right, and the changes we're talking about actually don't break the API. But since they require cluster replacement, we treat them with even more care than API breakage.

> Can you go a little into the changes that prompted this RFC?

Essentially, we weren't using the right mechanisms to configure the cluster's connections and its interaction with our custom resource that runs kubectl commands against it. To make this work properly, we need to change the VPC configuration of the cluster, which requires replacement.
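For context, a hedged sketch of the kind of change involved; the specific properties are illustrative, since it isn't stated here exactly which VPC settings have to move:

```ts
import * as cdk from '@aws-cdk/core';
import * as ec2 from '@aws-cdk/aws-ec2';
import * as eks from '@aws-cdk/aws-eks';

const app = new cdk.App();
const stack = new cdk.Stack(app, 'EksStack');
const vpc = new ec2.Vpc(stack, 'Vpc');

// The subnets (and security groups) the control plane attaches to are part of
// the cluster's VPC configuration; as noted above, changing that configuration
// requires replacing the cluster. The subnet selection here is illustrative.
new eks.Cluster(stack, 'Cluster', {
  version: eks.KubernetesVersion.V1_17,
  vpc,
  vpcSubnets: [{ subnetType: ec2.SubnetType.PRIVATE }],
});
```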

I'm not sure whether you wanted deeper technical details; if so, let me know and I'm happy to talk about it.

You can have a look at this issue, which kind of prompted this whole thing.

> Have we already done the work to see if we can avoid this process completely?

We have. Strictly speaking, we can probably get away without requiring cluster replacement. However, as we dove deeper into how EKS works, we realized that our current configuration is messy, fragile, and in some (albeit rare) cases just plain wrong. In addition, we already had a feature that required cluster replacement a while ago, and eventually managed to avoid it at the expense of keeping a complicated stack model and introducing a one-cluster-per-stack limitation.

At the end of the day, we decided we should do the right thing for GA and make the cluster as robust as possible, so that we minimize the chance of having to do cluster replacements post-GA.

@iliapolo (Contributor Author)

@rix0rrr @ericzbeard

After a discussion with @eladb, we've decided to go in a slightly different direction. The reasoning behind dropping the previous approach is described here.

@iliapolo (Contributor Author) commented Oct 8, 2020

Closing - we wound up not going with this migration path at all.
