Deploying Atlantis on AWS ECS (Elastic Container Service) is a robust solution that allows you to take advantage of the managed container orchestration capabilities of ECS. Here’s a step-by-step guide to setting up Atlantis on ECS.
1. Prepare Your Atlantis Docker Image
Atlantis can be run from its official Docker image. You may want to customize the Docker image or Dockerfile to fit your specific configurations or include specific Terraform versions.
Deciding whether to use the official Atlantis Docker image as-is or to create a custom version depends on several factors related to your specific operational and security needs. Here are some key considerations that can help guide your decision:
Terraform Version Requirements
Atlantis supports multiple versions of Terraform. If your projects require a specific version of Terraform that isn’t included in the latest Atlantis release, or if you need to support multiple versions simultaneously, you might consider customizing the image.
Additional Tools and Plugins
If your Terraform configurations depend on specific tools or plugins (like terraform-docs, tflint, or custom providers) that aren’t included in the standard Atlantis image, you might need to extend the Docker image to include these tools.
Security Enhancements
Custom images can be tailored for enhanced security:
- Removing unnecessary tools or services included in the base image.
- Adding security tools or scripts that scan for vulnerabilities or ensure compliance with your company’s security policies.
Integration with Internal Tooling
Sometimes, internal development or deployment workflows require custom scripts or tools to be available in your CI/CD environment:
- Integrating with internal logging, monitoring, or notification systems.
- Custom entrypoint scripts that configure Atlantis based on dynamic or environment-specific parameters.
Configuration and Environment Setup
Although Atlantis can be configured largely through environment variables and command-line arguments, certain scenarios might require changes to the file system or specific configuration files that aren’t easily managed through runtime configuration:
- Pre-populating Atlantis with specific configuration files or templates.
- Modifying the underlying operating system or container settings for performance optimizations or to comply with internal requirements.
Compliance and Audit Requirements
In regulated industries, there might be compliance requirements that necessitate specific changes to the software configuration or the inclusion of audit tools within the container.
Steps to Create a Custom Atlantis Docker Image:
If you decide to create a custom image, here’s a simple example of how to extend the official Atlantis image:
# Use the official Atlantis image as a base
FROM runatlantis/atlantis:v0.17.0
# Install additional tools
RUN apk add --no-cache jq curl
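# Add custom scripts or configuration files
COPY custom-scripts/ /usr/local/bin/
# Atlantis only reads this file if it is pointed at it (e.g. --config or ATLANTIS_CONFIG)
COPY my-atlantis-config.yaml /etc/atlantis/config.yaml
# Set custom entrypoint if needed. Overriding ENTRYPOINT replaces the base image's
# entrypoint and clears its CMD, so entrypoint.sh must start Atlantis itself.
COPY entrypoint.sh /usr/local/bin/entrypoint.sh
RUN chmod +x /usr/local/bin/entrypoint.sh
ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]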
2. Create an ECS Task Definition
Create an ECS Task Definition that specifies the Atlantis Docker container. You’ll need to configure:
- CPU and memory allocation for the container.
- Environment variables such as ATLANTIS_GH_USER, ATLANTIS_GH_TOKEN, ATLANTIS_GH_WEBHOOK_SECRET, ATLANTIS_REPO_WHITELIST, etc.
- Logging to an AWS CloudWatch Log Group to keep track of Atlantis server logs.
Setting up an ECS Task Definition for the Atlantis Docker container involves several steps. Here’s a detailed guide on how to create and configure the ECS Task Definition using the AWS Management Console:
Create a New Task Definition
- Go to the Amazon ECS console: Navigate to ECS and select “Task Definitions” from the sidebar.
- Create new Task Definition: Click the “Create new Task Definition” button.
- Select launch type compatibility: Choose “Fargate” or “EC2” depending on your ECS cluster configuration. Fargate is serverless and does not require you to manage the underlying EC2 instances.
Configure Task and Container Definitions
Task Configuration
- Task Role: Assign an IAM task role that Atlantis will use to interact with AWS resources if necessary.
- Network Mode: Select “awsvpc” for Fargate (the only mode Fargate supports). For EC2, choose the mode that matches your network setup.
- Task execution IAM role: Specify an execution role that allows ECS to manage resources on your behalf, such as pulling images and storing logs in CloudWatch.
- Task size: Define the CPU and memory for the task. For example, 512 CPU units (0.5 vCPUs) and 1024 MB (1 GB) memory.
CPU and Memory Usage
Atlantis itself is not particularly CPU-intensive unless it’s processing multiple and complex Terraform plans simultaneously. For a small team of 5 to 10 people who run workloads occasionally, the following are reasonable starting points:
- CPU: Allocation of about 256 to 512 CPU units (0.25 to 0.5 vCPUs) should suffice for handling occasional concurrent operations.
- Memory: Atlantis could be expected to use around 512 MB to 1 GB of RAM. The memory usage can spike depending on the complexity and size of your Terraform plans.
Larger teams will need more, sometimes much more, depending on how many Terraform operations run concurrently and how large the plans are.
Cost Factors
To estimate the cost of running Atlantis on AWS ECS, we’ll consider a typical configuration using AWS Fargate, which simplifies management by abstracting the server layer. We’ll look at two scenarios: a team of 5-10 people, and the same setup doubled for a team of up to 20 people.
- Fargate Pricing:
- You pay for the amount of vCPU and memory resources that your containerized application requests.
- Prices vary by region. I’ll use the US East (N. Virginia) region as an example.
- Network Costs:
- Data transfer costs within the same AWS region are typically negligible.
- Costs accrue primarily from data transfer across regions or to the internet.
- AWS CloudWatch:
- Logging and monitoring costs, depending on usage.
Base Scenario (Small Team: 5-10 People):
- Fargate CPU: 0.5 vCPU
- Fargate Memory: 1 GB
- Region: US East (N. Virginia)
Cost Calculation:
- vCPU Cost: $0.04048 per vCPU hour
- Memory Cost: $0.004445 per GB hour
Daily Cost:
- CPU: 0.5 × 0.04048 × 24 = $0.48576
- Memory: 1 × 0.004445 × 24 = $0.10668
Monthly Cost (assuming 30 days):
- CPU: 0.48576 × 30 = $14.57
- Memory: 0.10668 × 30 = $3.20
- Total: 14.57 + 3.20 = $17.77 per month
Doubled Scenario (Up to 20 People):
- Fargate CPU: 1 vCPU
- Fargate Memory: 2 GB
Daily Cost:
- CPU: 1 × 0.04048 × 24 = $0.97152
- Memory: 2 × 0.004445 × 24 = $0.21336
Monthly Cost:
- CPU: 0.97152 × 30 = $29.15
- Memory: 0.21336 × 30 = $6.40
- Total: 29.15 + 6.40 = $35.55 per month
Additional Costs:
- CloudWatch Logs: Depending on the volume and retention of logs. Typically, $0.50 per GB ingested and $0.03 per GB per month for log storage.
- Network: Generally low within the same region; mainly associated with internet egress if applicable.
Cost summary:
The above estimates give a rough cost for running Atlantis on Fargate. Doubling the team, and with it the workload, doubles the compute resources requested and therefore the compute cost, from about $17.77 to $35.55 per month. Because Fargate bills for the resources the task requests, monitoring actual usage and right-sizing the task can save money. Adjust the estimates for your specific AWS region and usage patterns for more precise budgeting.
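The arithmetic above is easy to rerun for other task sizes. Here is a minimal Python sketch that reproduces it, assuming the US East (N. Virginia) rates quoted in this post and a task that runs around the clock. The function name is just an illustration, and the rates should be checked against current AWS pricing before you rely on them.

```python
# Rates are the US East (N. Virginia) Fargate prices quoted in this post.
# They are assumptions for illustration; verify current pricing for your region.
VCPU_RATE_PER_HOUR = 0.04048        # USD per vCPU-hour
MEMORY_RATE_PER_GB_HOUR = 0.004445  # USD per GB-hour


def fargate_monthly_cost(vcpu: float, memory_gb: float, days: int = 30) -> float:
    """Estimate the cost in USD of one always-on Fargate task over `days` days."""
    hours = 24 * days
    cpu_cost = vcpu * VCPU_RATE_PER_HOUR * hours
    memory_cost = memory_gb * MEMORY_RATE_PER_GB_HOUR * hours
    return cpu_cost + memory_cost


small_team = fargate_monthly_cost(vcpu=0.5, memory_gb=1)
doubled = fargate_monthly_cost(vcpu=1, memory_gb=2)
print(f"Small team (0.5 vCPU, 1 GB): ${small_team:.2f} per month")
print(f"Doubled (1 vCPU, 2 GB):      ${doubled:.2f} per month")
```

CloudWatch Logs and network egress are left out on purpose, since they depend on log volume and traffic rather than on task size.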
Container Definitions
- Add container: Click “Add container” to define the Atlantis container details.
- Container name: Give your container a name, like “atlantis”.
- Image: Enter the Atlantis Docker image URL from Docker Hub, e.g., runatlantis/atlantis.
- Memory Limits (MiB): Specify the hard limit and soft limit for memory usage.
- Port mappings: Add the port Atlantis listens on, typically 4141 unless configured otherwise.
Environment Variables
Configure the necessary environment variables:
- ATLANTIS_GH_USER: The GitHub username Atlantis will use.
- ATLANTIS_GH_TOKEN: The GitHub token for authentication.
- ATLANTIS_GH_WEBHOOK_SECRET: The secret used to validate GitHub webhooks.
- ATLANTIS_REPO_WHITELIST: Repositories Atlantis is allowed to interact with, e.g., github.com/org/repo. Newer Atlantis releases name this setting ATLANTIS_REPO_ALLOWLIST.
Logging
- Log configuration: Select “awslogs” as the log driver.
- Log options: Configure the log options:
- awslogs-group: Name of the CloudWatch Log Group.
- awslogs-region: AWS region where the logs should be stored.
- awslogs-stream-prefix: Prefix for the log streams.
Create the Task Definition
After configuring the details, click “Create” to save the task definition.
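If you’d rather keep the task definition in code than click through the console, the same settings can be written down as a plain data structure. The sketch below builds one in Python using the field names of the ECS task definition format. The role ARNs, account ID, bot username, and log group name are placeholders, not values from a real setup. A dict like this could be passed to an SDK call such as boto3’s `register_task_definition`, which is not done here so that the example stays self-contained.

```python
import json


def build_task_definition(image: str, region: str, log_group: str) -> dict:
    """Build a Fargate task definition for a single Atlantis container."""
    return {
        "family": "atlantis",
        "requiresCompatibilities": ["FARGATE"],
        "networkMode": "awsvpc",
        "cpu": "512",      # 0.5 vCPU
        "memory": "1024",  # 1 GB
        # Placeholder ARNs: substitute the roles from your own account.
        "executionRoleArn": "arn:aws:iam::123456789012:role/atlantis-execution",
        "taskRoleArn": "arn:aws:iam::123456789012:role/atlantis-task",
        "containerDefinitions": [
            {
                "name": "atlantis",
                "image": image,
                "essential": True,
                "portMappings": [{"containerPort": 4141, "protocol": "tcp"}],
                # Only non-sensitive settings go here; the token and webhook
                # secret are omitted on purpose (see the security note on
                # AWS Secrets Manager).
                "environment": [
                    {"name": "ATLANTIS_GH_USER", "value": "my-atlantis-bot"},
                    {"name": "ATLANTIS_REPO_WHITELIST", "value": "github.com/org/repo"},
                ],
                "logConfiguration": {
                    "logDriver": "awslogs",
                    "options": {
                        "awslogs-group": log_group,
                        "awslogs-region": region,
                        "awslogs-stream-prefix": "atlantis",
                    },
                },
            }
        ],
    }


task_definition = build_task_definition(
    image="runatlantis/atlantis:v0.17.0",
    region="us-east-1",
    log_group="/ecs/atlantis",
)
print(json.dumps(task_definition, indent=2))
```

Keeping the definition in a function makes it easy to review in a pull request and to reuse across environments by changing only the arguments.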
Additional Notes:
- Security: Make sure to secure your environment variables, especially tokens and secrets. Consider using AWS Secrets Manager and referencing secrets directly in the task definition.
- Volumes and Storage: If persistent storage is needed for any reason, configure volumes in the task definition.
- Scaling and Load Balancing: If you expect high load or require high availability, consider setting up service auto-scaling and an Application Load Balancer, as described in the steps that follow.
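To make the security note concrete: an ECS container definition has a `secrets` list, where each entry maps an environment variable name to the ARN of a secret, so the value never appears in the task definition itself. Here is a minimal Python sketch of moving the sensitive variables into that list. The helper name and the secret ARNs are hypothetical, and the task execution role would also need permission to read those secrets.

```python
import copy

# Hypothetical secret ARNs; replace with the ARNs of your own secrets.
SECRET_ARNS = {
    "ATLANTIS_GH_TOKEN": "arn:aws:secretsmanager:us-east-1:123456789012:secret:atlantis/gh-token",
    "ATLANTIS_GH_WEBHOOK_SECRET": "arn:aws:secretsmanager:us-east-1:123456789012:secret:atlantis/webhook-secret",
}


def use_secrets(container_definition: dict, secret_arns: dict) -> dict:
    """Return a copy of the container definition with sensitive variables
    referenced from Secrets Manager instead of stored as plain values."""
    updated = copy.deepcopy(container_definition)
    # Drop any plain-text copies of the sensitive variables.
    updated["environment"] = [
        entry for entry in updated.get("environment", [])
        if entry["name"] not in secret_arns
    ]
    # Reference each secret by ARN; ECS injects the value at task start.
    updated["secrets"] = [
        {"name": name, "valueFrom": arn} for name, arn in secret_arns.items()
    ]
    return updated


insecure = {
    "name": "atlantis",
    "environment": [
        {"name": "ATLANTIS_GH_USER", "value": "my-atlantis-bot"},
        {"name": "ATLANTIS_GH_TOKEN", "value": "do-not-store-tokens-like-this"},
    ],
}
secured = use_secrets(insecure, SECRET_ARNS)
print(secured)
```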
By following these steps, you’ll have a configured ECS task definition ready for deploying the Atlantis service in your ECS cluster. This setup leverages the AWS managed services, providing robustness and scalability for your infrastructure automation workflows.
3. Configure Networking
Set up a VPC, subnets, and a security group for ECS:
- Ensure that the security group allows HTTP/HTTPS traffic from the internet if you plan to receive webhooks from GitHub/GitLab/Bitbucket.
- Consider using an Application Load Balancer (ALB) in front of ECS to handle SSL termination and to route traffic to the Atlantis container.
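One way to read the two points above is as a pair of security groups: the load balancer accepts HTTPS from the internet, and the Atlantis task accepts traffic only from the load balancer. The Python sketch below writes those rules in the `IpPermissions` shape used by the EC2 API. The security group ID is a placeholder, and opening 443 to `0.0.0.0/0` is an assumption you may want to narrow to your VCS provider’s published webhook IP ranges.

```python
ATLANTIS_PORT = 4141  # default Atlantis port used elsewhere in this guide


def alb_ingress_rules() -> list:
    """Ingress for the load balancer's security group: HTTPS from anywhere."""
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": 443,
            "ToPort": 443,
            "IpRanges": [
                {"CidrIp": "0.0.0.0/0", "Description": "VCS webhooks and users"}
            ],
        }
    ]


def task_ingress_rules(alb_security_group_id: str) -> list:
    """Ingress for the Atlantis task's security group: only the ALB may connect."""
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": ATLANTIS_PORT,
            "ToPort": ATLANTIS_PORT,
            "UserIdGroupPairs": [{"GroupId": alb_security_group_id}],
        }
    ]


# "sg-0123456789abcdef0" is a placeholder ID for the ALB's security group.
print(alb_ingress_rules())
print(task_ingress_rules("sg-0123456789abcdef0"))
```

With this split, the container port is never reachable directly from the internet.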
4. Create an ECS Cluster
- If you don’t already have an ECS cluster, create one in your defined VPC.
5. Set up the ECS Service
Create an ECS service that uses the task definition. Configure the service to:
- Use the ALB for load balancing.
- Define how many tasks should run simultaneously (typically, one Atlantis server is sufficient).
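As a rough illustration of the service settings, here is a Python sketch of the parameters for creating the service, using the field names of the ECS `CreateService` API. The cluster name, subnet IDs, security group ID, and target group ARN are placeholders. It assumes Fargate and private subnets, hence `assignPublicIp` set to `DISABLED`, which in turn assumes the subnets have outbound internet access through something like a NAT gateway.

```python
def build_service_request(
    cluster: str,
    task_definition: str,
    target_group_arn: str,
    subnets: list,
    security_groups: list,
) -> dict:
    """Build the parameters for an ECS service running one Atlantis task."""
    return {
        "cluster": cluster,
        "serviceName": "atlantis",
        "taskDefinition": task_definition,
        "desiredCount": 1,  # one Atlantis server is typically sufficient
        "launchType": "FARGATE",
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": subnets,
                "securityGroups": security_groups,
                "assignPublicIp": "DISABLED",
            }
        },
        # Attach the task's Atlantis port to the ALB target group.
        "loadBalancers": [
            {
                "targetGroupArn": target_group_arn,
                "containerName": "atlantis",
                "containerPort": 4141,
            }
        ],
    }


# All identifiers below are placeholders.
service_request = build_service_request(
    cluster="atlantis-cluster",
    task_definition="atlantis",
    target_group_arn="arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/atlantis/0123456789abcdef",
    subnets=["subnet-0aaa", "subnet-0bbb"],
    security_groups=["sg-0123456789abcdef0"],
)
print(service_request)
```

The `containerName` must match the container name in the task definition, otherwise the load balancer has nothing to route to.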
6. Configure an Application Load Balancer (ALB)
- Create an ALB to distribute incoming HTTP and HTTPS traffic to the ECS service tasks.
- Set up HTTPS listeners if you are handling sensitive data. You will need an SSL/TLS certificate, which you can manage with AWS Certificate Manager.
- Configure the ALB health checks to point to the Atlantis health check path (/healthz).
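The health check lives on the ALB target group. A minimal Python sketch of the target group settings follows, using the parameter names of the Elastic Load Balancing `CreateTargetGroup` API. The VPC ID is a placeholder, and the target type `ip` assumes the tasks use the awsvpc network mode, as they do on Fargate.

```python
def build_target_group_request(vpc_id: str, port: int = 4141) -> dict:
    """Build target group parameters that health-check Atlantis on /healthz."""
    return {
        "Name": "atlantis",
        "Protocol": "HTTP",  # TLS is terminated at the ALB listener
        "Port": port,
        "VpcId": vpc_id,
        "TargetType": "ip",  # awsvpc tasks register by IP address
        "HealthCheckProtocol": "HTTP",
        "HealthCheckPath": "/healthz",
        "Matcher": {"HttpCode": "200"},
    }


# "vpc-0123456789abcdef0" is a placeholder VPC ID.
target_group_request = build_target_group_request("vpc-0123456789abcdef0")
print(target_group_request)
```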
7. Set up Route 53
Configure a Route 53 DNS record to point to your ALB, making Atlantis accessible through a friendly DNS name.
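For an ALB, the usual choice is an alias record. Here is a small Python sketch of the change batch for one, in the shape used by Route 53’s `ChangeResourceRecordSets` API. The domain name, ALB DNS name, and hosted zone ID are placeholders. Note that the `HostedZoneId` inside `AliasTarget` is the load balancer’s own zone ID, not the ID of your domain’s hosted zone.

```python
def alias_change_batch(record_name: str, alb_dns_name: str, alb_hosted_zone_id: str) -> dict:
    """Build a Route 53 change batch that points record_name at an ALB."""
    return {
        "Comment": "Point Atlantis hostname at the ALB",
        "Changes": [
            {
                "Action": "UPSERT",  # create the record, or update it if present
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "A",
                    "AliasTarget": {
                        "HostedZoneId": alb_hosted_zone_id,
                        "DNSName": alb_dns_name,
                        "EvaluateTargetHealth": False,
                    },
                },
            }
        ],
    }


# All values below are placeholders.
change_batch = alias_change_batch(
    record_name="atlantis.example.com.",
    alb_dns_name="atlantis-alb-1234567890.us-east-1.elb.amazonaws.com.",
    alb_hosted_zone_id="ZEXAMPLEALBZONE",
)
print(change_batch)
```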
8. Configure Atlantis
- Ensure Atlantis is configured to handle the repositories and workflows you need. Configuration can be done via command-line flags, environment variables, or a configuration file.
- Set up webhooks in your VCS (Version Control System) provider to point to the Atlantis server URL.
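It helps to know what the webhook secret actually does. GitHub signs each webhook delivery with an HMAC-SHA256 of the request body, keyed with the shared secret, and sends it in the `X-Hub-Signature-256` header as `sha256=<hex digest>`. The receiver recomputes the signature and rejects the request if it doesn’t match. The Python sketch below illustrates that check in general terms. It is not Atlantis’s own code, and the function names and sample values are made up for the example.

```python
import hashlib
import hmac


def signature_for(secret: str, payload: bytes) -> str:
    """Compute the value GitHub would send in X-Hub-Signature-256."""
    digest = hmac.new(secret.encode("utf-8"), payload, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def is_valid_signature(secret: str, payload: bytes, header_value: str) -> bool:
    """Check a received signature header against the expected one."""
    expected = signature_for(secret, payload)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, header_value)


secret = "example-webhook-secret"  # sample value only
body = b'{"action": "opened"}'
header = signature_for(secret, body)

print(is_valid_signature(secret, body, header))
print(is_valid_signature("wrong-secret", body, header))
```

This is why the same secret has to be set both in the webhook settings of your VCS provider and in ATLANTIS_GH_WEBHOOK_SECRET.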
9. IAM Roles and Policies
- Attach an IAM role to the ECS task definition that grants the Atlantis task permissions to access necessary AWS resources and make API calls to manage the infrastructure.
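A task role needs two things: a trust policy that lets ECS tasks assume it, and permission policies for whatever Terraform manages. The Python sketch below builds the trust policy, plus one example permission policy for reading and writing Terraform state in S3. The bucket name is hypothetical, and the S3 policy assumes an S3 backend. Your real permissions depend entirely on the resources your Terraform code touches.

```python
import json


def ecs_task_trust_policy() -> dict:
    """Trust policy that allows ECS tasks to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ecs-tasks.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def state_bucket_policy(bucket: str) -> dict:
    """Example permissions for Terraform state stored in an S3 bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


print(json.dumps(ecs_task_trust_policy(), indent=2))
# "my-terraform-state" is a hypothetical bucket name.
print(json.dumps(state_bucket_policy("my-terraform-state"), indent=2))
```

Starting from narrow policies like these and widening them as plans fail is usually safer than attaching a broad administrator policy to Atlantis.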
10. Security Best Practices
- Secure your webhooks and ensure your secrets (e.g., GitHub token, webhook secret) are stored securely using AWS Secrets Manager or another secrets management tool.
- Regularly update the Docker images and ECS task definitions to include security patches and updates.
This setup will provide you with a scalable and secure Atlantis environment managed under ECS, allowing you to leverage AWS’s robust infrastructure for managing your Terraform workflows.
Conclusions
We explored the feasibility and setup of deploying Atlantis, a tool for automating Terraform workflows, using various AWS services. We discussed why directly implementing Atlantis on AWS Lambda might not be ideal due to the persistent and stateful nature required by Atlantis, contrasting with the ephemeral and stateless design of Lambda functions. Instead, deploying Atlantis on AWS ECS was identified as a robust solution, leveraging the managed container orchestration capabilities of ECS to handle scalability and reliability effectively.
We went through the steps to prepare an Atlantis Docker image, whether to use the official image or customize it, and detailed the setup of an ECS Task Definition, including configurations for CPU, memory, environment variables, and logging. The discussion also covered cost estimations for a small team scenario and the implications of doubling the team size, providing a practical view on budgeting for infrastructure costs.
Overall, this post provided a comprehensive look at setting up Atlantis in an AWS environment, highlighting the considerations for operational efficiency, security, and cost management, making it a valuable guide for teams looking to automate their Terraform operations in the cloud.