Terraform State Lock: ConditionalCheckFailedException
Problem Statement
Terraform state lock errors occur when Terraform cannot acquire exclusive access to your infrastructure state file. The most common error message is:
Error: Error acquiring the state lock: ConditionalCheckFailedException: The conditional request failed
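This protection mechanism prevents multiple Terraform processes from modifying your infrastructure state simultaneously, which could lead to corruption and inconsistent infrastructure. The ConditionalCheckFailedException itself comes from DynamoDB: Terraform writes the lock item on the condition that no lock item already exists, and DynamoDB rejects the write when one does.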
Common Causes
1. Interrupted Processes
The most frequent cause is a Terraform operation that terminates unexpectedly before it can release its lock:
- Network connectivity loss during terraform apply or terraform plan
- Process termination (Ctrl+C, system crash, out-of-memory errors)
- Pipeline job cancellation or timeout
2. Permission Issues
Insufficient permissions on your backend storage can leave Terraform unable to release a lock it created:
- Missing s3:DeleteObject permission for AWS S3 backends
- Missing dynamodb:DeleteItem permission for DynamoDB lock tables
- Insufficient Google Cloud Storage permissions
3. Concurrent Execution
Genuine concurrent Terraform operations:
- Multiple team members running Terraform simultaneously
- Overlapping CI/CD pipeline executions
4. Configuration Problems
- Incorrect AWS profile or credentials
- Corrupted lock entries in backend storage
Solutions
1. Force Unlock the State
The primary solution is to use Terraform's built-in force-unlock command:
terraform force-unlock <LOCK_ID>
Where <LOCK_ID> is the UUID from your error message (e.g., 9db590f1-b6fe-c5f2-2678-8804f089deba).
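As a rough sketch, the lock ID can be read from captured Terraform output instead of being copied by hand. The helper name extract_lock_id and the log file apply.log are hypothetical, and the sketch assumes the error text contains a "Lock Info:" block whose "ID:" line ends with the lock ID.

```shell
# Print the lock ID found in Terraform lock-error text read from stdin.
# Assumption: the ID is the last field of the first "ID:" line that
# follows a "Lock Info:" line. Prints nothing if no such line exists.
extract_lock_id() {
  awk '/Lock Info:/ { in_info = 1; next }
       in_info && /ID:/ { print $NF; exit }'
}

# Usage (apply.log is a hypothetical file holding the failed run's output):
#   lock_id=$(extract_lock_id < apply.log)
#   terraform force-unlock "$lock_id"
```

Reading the ID from the "Lock Info:" block rather than matching any UUID in the output avoids picking up unrelated resource identifiers.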
WARNING
Only use force-unlock when you're certain no other Terraform processes are actively working with the state. Forcing unlock during an active operation can corrupt your state.
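One way to act on this warning in a script is a guard that refuses to unlock while a terraform process is visible on the same host. The function name guarded_unlock is hypothetical, and the check is deliberately limited: it cannot see runs on other machines or CI workers, so it complements a manual check rather than replacing it.

```shell
# Refuse to force-unlock while a terraform process is running on this host.
# Limitation: only local processes are detected.
guarded_unlock() {
  lock_id=$1
  # pgrep -x matches the exact process name "terraform"
  if pgrep -x terraform >/dev/null 2>&1; then
    echo "terraform is still running locally; not unlocking" >&2
    return 1
  fi
  terraform force-unlock "$lock_id"
}

# Usage:
#   guarded_unlock 9db590f1-b6fe-c5f2-2678-8804f089deba
```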
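2. Forced Unlock Without Confirmation
force-unlock normally asks for interactive confirmation, which fails in non-interactive environments such as CI/CD jobs. The -force flag skips that prompt; it does not unlock any more aggressively than the standard command: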
terraform force-unlock -force <LOCK_ID>
3. Manual Lock Removal by Backend
Different backends require different manual intervention:
AWS S3 + DynamoDB:
- Delete the lock entry from your DynamoDB table
- Ensure proper IAM permissions (s3:DeleteObject, dynamodb:DeleteItem)
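The DynamoDB deletion can be sketched with the AWS CLI. This assumes the S3 backend's usual convention that the lock item's LockID attribute is "<bucket>/<state key>"; the function names and all argument values are placeholders. An item whose LockID ends in -md5 holds the state digest rather than a lock, so the sketch leaves it alone.

```shell
# Build the DynamoDB key of the lock item for a given state object.
# Assumption: LockID = "<bucket>/<state key>".
lock_item_key() {
  printf '{"LockID":{"S":"%s/%s"}}\n' "$1" "$2"
}

# Delete the stranded lock item (table, bucket, and key are placeholders).
delete_lock_item() {
  table=$1
  bucket=$2
  state_key=$3
  aws dynamodb delete-item \
    --table-name "$table" \
    --key "$(lock_item_key "$bucket" "$state_key")"
}

# Usage:
#   delete_lock_item your-lock-table your-bucket path/to/state
```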
Google Cloud Storage:
- Navigate to your storage bucket
- Locate and delete the .tflock file, either in Google Cloud Console or with a command-line tool such as gsutil
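For the command-line route, the lock object's location can be derived from the backend settings. The sketch below assumes the GCS backend stores the lock as <prefix>/<workspace>.tflock, with default as the workspace name when none is selected; gcs_lock_url is a hypothetical helper and the bucket and prefix are placeholders.

```shell
# Build the gs:// URL of the lock object for a workspace.
# Assumption: the lock lives at <prefix>/<workspace>.tflock.
gcs_lock_url() {
  bucket=$1
  prefix=${2%/}            # tolerate a trailing slash on the prefix
  workspace=${3:-default}
  if [ -n "$prefix" ]; then
    printf 'gs://%s/%s/%s.tflock\n' "$bucket" "$prefix" "$workspace"
  else
    printf 'gs://%s/%s.tflock\n' "$bucket" "$workspace"
  fi
}

# Usage:
#   gsutil rm "$(gcs_lock_url your-bucket terraform/state)"
```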
Azure Blob Storage:
- Use the "Break lease" functionality in Azure Portal
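The same lease break can be sketched with the Azure CLI instead of the Portal. The function name and all three arguments are placeholders, and the sketch assumes the CLI session is already authenticated with access to the storage account.

```shell
# Break the lease on the state blob (all names are placeholders).
break_state_lease() {
  account=$1
  container=$2
  blob=$3
  az storage blob lease break \
    --account-name "$account" \
    --container-name "$container" \
    --blob-name "$blob"
}

# Usage:
#   break_state_lease yourstorageaccount tfstate prod.terraform.tfstate
```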
4. Disable Locking (Temporary Workaround)
As a last resort, you can disable locking:
terraform plan -lock=false
terraform apply -lock=false
DANGER
Disabling locking removes Terraform's protection against concurrent state modifications. Use this only temporarily and never in team environments.
5. Resolve Permission Issues
Ensure your execution environment has proper permissions:
AWS IAM Policy Example:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": "arn:aws:s3:::your-bucket/path/to/state"
    },
    {
      "Effect": "Allow",
      "Action": [
        "dynamodb:GetItem",
        "dynamodb:PutItem",
        "dynamodb:DeleteItem"
      ],
      "Resource": "arn:aws:dynamodb:*:*:table/your-lock-table"
    }
  ]
}
6. Terminate Stuck Processes
If Terraform processes are genuinely stuck:
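# Find Terraform processes (pgrep avoids matching the search command itself)
pgrep -fl terraform
# Ask the process to exit first (SIGTERM), giving it a chance to release its lock
kill <PROCESS_ID>
# Last resort: SIGKILL cannot be handled, so the lock will be left behind
kill -9 <PROCESS_ID>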
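A gentler termination sequence can be sketched as a small helper: request a clean exit, wait, and force-kill only if the process is still alive. stop_gracefully is a hypothetical name and the 30-second default grace period is an assumption. The sketch also assumes Terraform treats SIGTERM as a request to shut down cleanly.

```shell
# Ask a process to exit, then force-kill it only if it ignores the request.
# $1 = PID, $2 = grace period in seconds (defaults to 30, an assumption).
stop_gracefully() {
  pid=$1
  grace=${2:-30}
  kill -TERM "$pid" 2>/dev/null || return 0   # process is already gone
  while [ "$grace" -gt 0 ]; do
    kill -0 "$pid" 2>/dev/null || return 0    # process exited on its own
    sleep 1
    grace=$((grace - 1))
  done
  kill -KILL "$pid" 2>/dev/null || true       # last resort
}

# Usage:
#   stop_gracefully <PROCESS_ID> 60
```

If the helper does reach the SIGKILL step, the lock is probably still held and will need a force-unlock afterwards.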
Prevention Strategies
1. Implement Proper Timeouts
Configure reasonable timeouts in your CI/CD pipelines to avoid hanging processes.
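Where the CI system's own timeout kills the job outright, a deadline inside the job can give Terraform a chance to stop cleanly first. The sketch below relies on the timeout utility from GNU coreutils; run_with_deadline is a hypothetical name and the durations in the usage line are illustrative only.

```shell
# Run a command with a deadline. At the deadline the command receives
# SIGTERM; if it is still running after the grace period it gets SIGKILL.
# $1 = deadline, $2 = grace period, remaining arguments = command to run.
run_with_deadline() {
  deadline=$1
  grace=$2
  shift 2
  timeout -k "$grace" "$deadline" "$@"
}

# Usage (30-minute deadline, 2-minute grace period):
#   run_with_deadline 30m 2m terraform apply -input=false tfplan
```

Setting the deadline shorter than the pipeline's hard timeout leaves room for the graceful shutdown to finish.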
2. Robust Error Handling
Ensure your automation scripts properly handle failures and cleanup:
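#!/bin/bash
set -euo pipefail
terraform plan -out=tfplan
# Keep a copy of the apply output so the lock ID can be read on failure
if ! terraform apply tfplan 2>&1 | tee apply.log; then
  # A lock error prints the holder's ID under "Lock Info:"
  LOCK_ID=$(awk '/Lock Info:/ {f=1} f && /ID:/ {print $NF; exit}' apply.log)
  if [ -n "$LOCK_ID" ]; then
    # Only safe if no other run can legitimately hold this lock
    echo "Apply failed, attempting to unlock state..."
    terraform force-unlock -force "$LOCK_ID"
  fi
  exit 1
fi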
3. Regular State Maintenance
Periodically review and clean up old lock entries in your backend storage.
4. Team Coordination
Implement processes to prevent multiple team members from working on the same infrastructure simultaneously.
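When overlapping runs cannot be ruled out, Terraform's -lock-timeout option makes a run wait for the lock instead of failing immediately. The wrapper below is a minimal sketch: with_lock_wait and the LOCK_WAIT variable are hypothetical names, and the 5m default is illustrative.

```shell
# Run a Terraform subcommand, retrying lock acquisition for up to
# $LOCK_WAIT (default 5m, an illustrative value) before giving up.
with_lock_wait() {
  wait_for=${LOCK_WAIT:-5m}
  subcommand=$1
  shift
  terraform "$subcommand" -lock-timeout="$wait_for" "$@"
}

# Usage:
#   with_lock_wait plan -out=tfplan
#   LOCK_WAIT=10m with_lock_wait apply tfplan
```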
When to Use Each Solution
| Scenario | Recommended Solution |
|---|---|
| Known interrupted process | terraform force-unlock <LOCK_ID> |
| Permission errors | Fix IAM/storage permissions |
| Genuine concurrent execution | Wait for other process to complete |
| CI/CD pipeline failures | Implement automatic cleanup in scripts |
| Unknown root cause | Investigate backend storage manually |
Troubleshooting Steps
- Verify no active Terraform processes are running
- Check your backend storage for existing locks
- Validate permissions and credentials
- Attempt standard force-unlock
- Consider manual intervention if automatic methods fail
- Document the incident for future prevention
INFO
Always check the lock's creation timestamp (the Created field) in the error message. A lock older than your longest expected Terraform run is likely stranded, but confirm that no process still holds it before removing it.
State locking is Terraform's safeguard mechanism. While force-unlock provides an escape hatch, understanding and addressing the root causes will lead to more stable infrastructure management.