Build failures and delays due to upstream service disruption

Incident Report for CircleCI

Resolved

This incident has been resolved. All job types are now operating normally with standard queue times. Linux machine and Remote Docker capacity has been fully restored following the AWS incident recovery.

Thank you for your patience.
Posted Oct 20, 2025 - 22:15 UTC

Monitoring

Queue times for all job types have returned to normal as AWS continues to recover from their incident (https://health.aws.amazon.com/health/status).

All job types: Operating at normal capacity with standard queue times
Performance has been fully restored across Linux machine, Remote Docker, Docker, ARM, IP-Ranges Docker, and Mac jobs. Windows and Android were unaffected throughout the incident.

We're continuing to monitor system performance closely to ensure stability. Thank you for your patience during this incident.
Posted Oct 20, 2025 - 21:42 UTC

Update

We're seeing meaningful improvement in Linux machine and Remote Docker job performance as AWS recovers from their incident (https://health.aws.amazon.com/health/status).

Current Status:
Linux machine and Remote Docker jobs: Average wait times have improved to approximately 10-15 minutes
Docker and Mac jobs: Operating normally, though ARM and IP-Ranges Docker jobs may still experience longer queue times
Windows and Android jobs: Unaffected

While we haven't returned to full capacity, the situation is steadily improving. We continue to work on recovery and are monitoring queue times closely.
Thank you for your patience during this incident. We'll continue to update you as conditions improve.
Posted Oct 20, 2025 - 21:08 UTC

Update

We continue to experience capacity limitations for Linux machine and Remote Docker jobs related to the AWS incident (https://health.aws.amazon.com/health/status). Docker, Mac, Windows and Android jobs are all operating normally.

We're actively monitoring the situation and will continue to provide updates as it evolves.
We appreciate your patience as we work through these infrastructure constraints.
Posted Oct 20, 2025 - 20:06 UTC

Update

We continue to experience problems running Linux machine and Remote Docker jobs due to capacity constraints at AWS following their recent incident (https://health.aws.amazon.com/health/status).

Docker and Mac job performance has recovered to normal levels, and Windows and Android jobs are still unaffected.
Posted Oct 20, 2025 - 18:31 UTC

Update

We are continuing to experience a high rate of errors when attempting to run instances in AWS (https://health.aws.amazon.com/health/status).
Due to this, we are currently unable to run Linux machine and Remote Docker jobs, and Docker is experiencing slow scaling.
Additionally, AWS network instability is preventing us from booting our macOS M4 fleet.

Customers attempting to trigger Docker jobs will see queueing with slow progress.
Windows and Android jobs are unaffected.
Posted Oct 20, 2025 - 17:01 UTC

Update

We continue to experience delays in acquiring new instances from AWS (https://health.aws.amazon.com/health/status) and are actively monitoring recovery. In addition, we're investigating an issue affecting macOS M4Pro jobs in which a critical network service is intermittently failing, causing increased queue times. Our teams are working to mitigate impact and will provide updates as we learn more.
Posted Oct 20, 2025 - 15:45 UTC

Update

We continue to see delays in getting instances from AWS (https://health.aws.amazon.com/health/status) and are actively monitoring the situation.
Posted Oct 20, 2025 - 14:40 UTC

Identified

We continue to see delays in getting instances from AWS to run jobs due to their ongoing incident (https://health.aws.amazon.com/health/status). This is causing delays in jobs starting across Docker and Linux.
Posted Oct 20, 2025 - 12:06 UTC

Update

Our upstream service provider has identified the root cause and is actively working on mitigation. We'll continue monitoring and provide updates as more information becomes available.
Posted Oct 20, 2025 - 09:10 UTC

Investigating

We are currently experiencing an issue with an upstream service provider that is causing builds to fail or experience delays in the queue. We will provide updates as more information becomes available.
Posted Oct 20, 2025 - 07:49 UTC

This incident affected: CircleCI Dependencies (AWS) and Docker Jobs, Machine Jobs, macOS Jobs, Windows Jobs, Pipelines & Workflows, Artifacts, Notifications & Status Updates.