Thanks Rémi, glad you found it helpful 😊.
Hi Adam,
I found your post in an open case in the AWS CDK repo from Feb 2025: https://github.com/aws/aws-cdk/issues/33395#issuecomment-2670585006
I am passing an ECR repository into a Lambda stack whose function is built from that image repository. I wanted to replace the constructor passing with the from_xxx methods - from_repository_name is what I am using, since I use fixed names where possible, which lets me construct ARNs with no imports, including from SSM parameters.
Even with --exclusively (as mentioned below), CDK fails to delete the export. I tried exportValue() on the repository in that producer stack, but it didn't work. My situation could be similar to the Lambda Layer one discussed below.
Seeking your wisdom as AWS Enterprise support is clueless.
Hey Aaron,
Can you provide some details? Can you show what you changed in your CDK code, and what happened when you deployed that change? What error did you get when using exportValue()?
I have two stacks: a producer with the ECR repository, and a consumer with the Lambda function. The existing stacks were already deployed! So, if I make a change to the Lambda stack, it will try to recreate the ECR repository and throw an "already exists" exception, since I used fixed names.
Some AWS resources, such as CloudWatch Log Groups, ECR repositories, and Secrets, have this issue, but not all - the Lambda function doesn't, for example. My goal is to not orphan the ECR repository in the producer stack by using a from_xxx method.
Producer ECR Stack (from your blog and GitHub):
self.export_value(self.ecr_repository.repository.repository_arn)
Consumer Lambda Stack:
- IRepository constructor param
- code=aws_lambda.Code.from_ecr_image(
      repository=ecr_repo,  # <---- HERE
      tag_or_digest=namespace.get_container_version(scope),
  )
I want to remove the IRepository parameter and replace it with from_repository_name or from_repository_arn to make the stacks more loosely coupled, so the "already exists" exception doesn't get thrown from the producer stack. When I deploy the stack, I get the import issue, where the ECR stack tries to remove the export before the Lambda stack has stopped importing it:
Error: Cannot delete export <lambda stack>:ExportsOutputRefs<repository stack><hash> as it is in use by <lambda stack>
Maybe the export is for the name of the repository, and not the ARN? And that's why the exportValue() trick doesn't work?
And what error do you get when you deploy only the Lambda stack with the --exclusively flag?
Okay Adam, I accidentally replied to a top-level thread and not this one. Please see below. But I ended up getting it to work: after setting exportValue() for both the ARN and the name, I was able to remove the constructor param and the dummy exports.
Thank you again!
Nice, glad you got it working!
Hey Adam,
I found a negative side effect with this particular pair of AWS constructs, the ECR repository and the Lambda function. In my code base, I usually explicitly define IAM roles and permissions. However, the ECR permissions are created as part of the Lambda from_ecr_image method, since I don't define them. Apparently, when I successfully decouple the stacks using the exports and from_repository_name, and then navigate to the Lambda in the console, it displays an error saying "Failed to restore the function <name>: the function does not have permission to access the specified image." When I navigate to the ECR console, the repository permissions were apparently removed during this process.
I wonder if this is a bug or feature gap with CDK for a scenario like this. I will attempt to explicitly create the permissions but am going to follow up with AWS support and keep you updated. Any insights are greatly appreciated.
Yes, this is expected, because you made the ECR Repository immutable by switching to the from_xyz() methods, instead of using the real object. So, you need to handle the permissions correctly yourself now.
You can't have it both ways 🙂.
Adam I appreciate you making the time.
I use --exclusively on the consumer stack. It attempts to update the producer ECR stack first (this is a dev env, so all removal policies are DESTROY, if that makes a difference), and I get the same error as indicated above.
I tried the name for exportValue(); the error had "arn" in it before the hash. When I flipped it back to the ARN, it only had the hash.
Now for the best part - I set both the name and the ARN as exports, and it worked!!!
What is your advice, then, for going to production? I am taking notes in dev, and now need to proceed with the next phase of removing the constructor imports.
For production, I would suggest you follow the procedure explained in the article, just with two exportValue() calls -- one for the ARN, and the second one for the name.
Hey Adam!
I found a new side effect and want to run it by you. I only handled the ECR swap, but for some reason there is another stack where the Lambda loses its event source in the console.
class ExampleProcessingPipeline(Stack):
    def __init__(
        self,
        scope: Construct,
        id: str,
        *,
        processing_topic: ITopic,
        processing_queue: IQueue,
        processing_lambda: IFunction,
        **kwargs: StackProps,
    ) -> None:
        super().__init__(scope, id, **kwargs)
        processing_topic.add_subscription(
            aws_sns_subscriptions.SqsSubscription(
                queue=processing_queue, raw_message_delivery=True
            )
        )
        # This loses its entry in the Lambda stack, because on 'cdk deploy' of the Lambda update, this was deleted
        processing_lambda.add_event_source(
            aws_lambda_event_sources.SqsEventSource(
                processing_queue
            )
        )
Is there a way to prevent this from being removed, when the only refactor is the ECR change on the Lambda?
I don't understand what the question is, sorry. Can you please:
1. Show the initial code.
2. Show the change you made.
3. Show the output of running cdk diff with the above code change.
High Level:
ECR Stack -> Lambda Stack -> SubscriptionStack
Exports are only used in the ECR stack, to do the refactor of the Lambda stack from constructor params to from_xxx.
For some reason, during this process, the Lambda stack deleted the event source mapping:
lambda-stack | 3/11 | 1:36:14 PM | DELETE_IN_PROGRESS | AWS::Lambda::EventSourceMapping | lambdaSqsEventSources...9FFD07D724EA2006
class Lambda(Stack):
    def __init__(
        self,
        scope: Construct,
        id: str,
        *,
        ecr_repository: IRepository,
        vpc: IVpc,
        vpc_subnets: Sequence[ISubnet],
        security_groups: list[ISecurityGroup],
        deploy_env: DeploymentEnv,
        **kwargs: StackProps,
    ) -> None:
        super().__init__(scope, id, **kwargs)
        lambda_function = ContainerizedLambdaFunction(
            scope=self,
            namespace=OffboardingFunctionNamespace.X,
            ecr_repo=ecr_repository.repository,
            vpc=vpc,
            vpc_subnets=ec2.SubnetSelection(subnets=vpc_subnets),
            security_groups=security_groups,
            env=deploy_env,
        ).lambda_function
        XApiAccess(
            scope=self,
            namespace=OffboardingFunctionNamespace.X,
            lambda_function=lambda_function,
            env=deploy_env,
        )
        CfnOutput(
            self,
            CfnOutputNames.X_ROLE_ARN,
            value=lambda_function.role.role_arn,
            export_name=CfnOutputNames.X_ROLE_ARN,
        )
class Lambda(Stack):
    def __init__(
        self,
        scope: Construct,
        id: str,
        *,
        vpc: IVpc,
        vpc_subnets: Sequence[ISubnet],
        security_groups: list[ISecurityGroup],
        deploy_env: DeploymentEnv,
        **kwargs: StackProps,
    ) -> None:
        super().__init__(scope, id, **kwargs)
        ecr_repository = Repository.from_repository_name(
            self,
            OffboardingFunctionNamespace.X.container_name,
            repository_name=OffboardingFunctionNamespace.X.container_name,
        )
        lambda_function = ContainerizedLambdaFunction(
            scope=self,
            namespace=OffboardingFunctionNamespace.X,
            ecr_repo=ecr_repository,
            vpc=vpc,
            vpc_subnets=SubnetSelection(subnets=vpc_subnets),
            security_groups=security_groups,
            env=deploy_env,
        ).lambda_function
I will need to redeploy to get the cdk diff. Not sure if you can figure it out from the above.
I seriously doubt changing the ECR repository that the Function uses resulted in the mapping being deleted. I'm 99% sure there are other changes made before that you didn't deploy that were the reason for that. But I can't know unless I see the entire app, the output of cdk diff before your code change, the exact code change that you made, and then the output of cdk diff after that change.
From your code, it's not even clear that the Function you're showing is the one whose mapping was changed (the one you showed isn't exported from LambdaStack, and the other one is called processing_lambda).
So Adam,
This is how I fixed it, but I'm wondering if there is a better way... since I don't understand how the export of the Lambda itself affects the event source mapping. I also tried retaining the original CDK helper methods by changing the parameters of the SQS event source, but unfortunately that didn't trigger a change in CloudFormation to update the Lambda stack:
processing_queue.grant_consume_messages(processing_lambda)
EventSourceMapping(
    self,
    "DeviceFreezerSQSEventSourceMapping",  # Change logical ID to force a change...?
    target=processing_lambda,
    event_source_arn=processing_queue.queue_arn,
    batch_size=10,
    enabled=True,
)
I wonder if it should be considered a best practice to proactively add exportValue(...) whenever you create a cross-stack dependency, because I repeatedly forget about this issue.
That's probably a little overkill 🙂. It's not that common to remove these cross-stack references.
That workaround seems excessive. Why not just use cdk deploy ConsumingStack --exclusively first, and then complete with cdk deploy --all? Or am I missing a problem with that?
Ah, just saw your explanation to jk451.
Yep - comment http://disq.us/p/2r8vozt contains my answer.
Hi Adam, quick question please:
I was testing the GitHub example you've provided, and noticed that the referenced import gets removed from the consuming stack; this works well when we need to delete the resource being exported.
But is there an easier way to break the deadly embrace when trying to update the resource being exported and imported? In this case, I have an RDS cluster that needs to be updated, and it is being imported by several stacks (close to 7 of them). Thank you.
Hi,
so, updating a shared resource usually doesn't change its name/ARN, which means this procedure is not needed.
Are you maybe talking about an update that causes a replacement (so, creating a new resource, and deleting the old resource)? Because if that's the case, you need to instead do the split described in this article (create a new resource in step 1, and then remove the old resource in step 2).
If you run cdk diff, it will tell you whether your change requires a replacement.
Hope this makes sense!
Thanks,
Adam
Is it correct that the main scenario this article is trying to address is that when the CDK app is getting deployed via a pipeline to many production environments? Otherwise, if I have access to `cdk deploy`, then presumably I could just run `cdk deploy [consuming stack]` first, then `cdk deploy '*'`?
Actually, no, that would still not work in all cases 🙂.
1. Just cdk deploy ConsumingStack will not work, because by default cdk deploy tracks dependencies between Stacks, so it would still try to deploy ProducingStack first, which would fail. You can add the --exclusively flag, or its shorthand, -e, to cdk deploy to change that behavior.
2. But even if you did that, this would only work if you're completely removing any link between the Stacks. But that's normally not what happens - you don't want to remove all links between the ProducingStack and ConsumingStack, you just want to change what that link is, like in the example shown in this article, where we're changing the link from being an S3 Bucket to a DynamoDB Table. In the case of changing the link, running cdk deploy --exclusively ConsumingStack would fail (as the changed CloudFormation export has not been deployed in the ProducingStack yet).
Got it, thank you for the detailed explanation! I do like the approach laid out here more anyway than hacking around with the CLI (other than for dev testing) since this can be CRed and go through the normal pipeline stages without any production account futzing :)
My pleasure! Yes, doing this through code reviews, and your normal pipeline flow, is exactly the point.
This is wonderful.
Thanks Emma, glad you liked it 😊
Do you know if the latest cdk version handles it automatically?
I've removed a dependency, and it updated the stacks without any problems.
As far as I know - no. Nothing has changed here in recent CDK versions.
Given that, I'd be curious why it worked for you! If you could share your code, we could dig into what happened.
Interesting
I haven't looked into what exactly happens. With this example, I see cdk adds and removes the dependency without any error.
Did you mean this change in your CDK code? https://github.com/Catenary...
Can you show me the output of `cdk diff` after you've deployed the initial version of your CDK app, and then made the changes to your code I linked to above?
here it is
https://gist.github.com/Cat...
Thanks. I see what happened from the diff.
You changed the code in the consumer Stack to use a Bucket from that same Stack, instead of using the Bucket from the producer Stack. Because of that, there's no dependency relationship between the Stacks anymore. So, when you ran `cdk deploy`, CDK probably chose to deploy the consumer Stack first, and that worked - it removed the usage of the exports. So then, deploying the producer Stack that removes the exports themselves worked fine, because it had no consumers.
In general, that's not what happens though; usually, you still have a dependency relationship between the producing and consuming Stacks when you run into this problem, just modified because of sharing a different resource.
Thanks for the great article!
When the code base gets larger and more complicated, is there an easier way to `create dummy exports and remove references` at many places?
Also, I've been a bit surprised that there is no built-in solution to handle this cross-stack reference issue (during code change deployments), which seems to happen very often - or what am I missing?
Thanks Leo!
As for doing this in an automated way, there is an RFC planned to switch to using SSM Parameters for doing cross-stack references (https://github.com/aws/aws-... ), but it's still in the design phase.
However, I don't think this situation should happen that often... is that not the case for you? Is this something that happens constantly in your project(s)?
A good example is when updating a Lambda Layer. The Layer is in a separate stack, and used by 50 Lambda functions in 20 different stacks. When changing the Layer, you have to remove all dependencies first.
In this case, the CDK should only create dependencies between stacks, but should not block updates of the Layer because of its use in other stacks at all.
Maybe the best way to update the Layer currently is to create a new Layer with the update, and reference the new Layer in all the Lambdas. Then deploy it. After the deploy, remove the old Layer and deploy again.
Thanks for the great article!
Thanks for the comment Herbert!
Maybe the best way to update the Layer currently is to create a new Layer with the update, and reference the new Layer in all the Lambdas. Then deploy it. After the deploy, remove the old Layer and deploy again.
Yes, I think that's exactly how you should approach a change with so much impact.
Hi Adam Ruka,
We are facing the same issue that Marcus has mentioned about Layers. Shouldn't creating a new Layer (or a new version of it), and updating the related consumers of that Layer, happen inside the CDK? By the way, we are not even using CfnOutput and imports in cross-stack deployments; the Layers are created within the same CDK app, with multiple stacks, by passing the Layer objects into the related stacks.
EDIT: I found the related CDK RFC thread, "Parameter Store for cross stack references". So, never mind - I guess the only way right now is handling this with manual intervention.
I think the correct way is still to do the change in 2 steps ("Update and reference the new layer in all lambdas. Then deploy it. After deploy remove the old the layer and deploy again"), because this is a very impactful change, so you want to make it easy to roll back by preserving the old layer until you're certain the new one works correctly. Actually, I would probably change it in just one Lambda function first, and deploy just that, to be 100% certain.
Yeah, we've been making many changes recently while launching a new service, and those changes are breaking changes to some extent.
So I guess that's the primary reason why we see this so often.
Thanks for the pointer on the SSM parameters, will check that out!
Clear and interesting post Adam, keep it up!
Thanks 😊
Amazing post! You saved me from this weird case of cdk…
I needed to delete an aws_ec2.BastionHostLinux. The first method, exportValue(), didn't work for me, as I was not able to obtain the ARN, or whatever the specific value to export was.
I created the dummy CfnOutput instead.
I'm pasting the code in case it helps anyone.
Cheerz Adam ! :)
cfn = CfnOutput(
    self,
    id="dummy",
    value="xx",
    export_name="infra-base-staging:ExportsOutputFnGetAttsgbastion106385A9GroupId9B776F37",
)
cfn.value = {"Fn::GetAtt": ["sgbastion106385A9", "GroupId"]}
cfn.override_logical_id("ExportsOutputFnGetAttsgbastion106385A9GroupId9B776F37")