Some commit statuses were not updated
Incident Report for CircleCI
Postmortem

Summary

On January 23, 2025, from 19:48 UTC to 20:43 UTC, customers using CircleCI GitHub OAuth and Bitbucket projects stopped receiving commit status updates. This was due to a code change deployed at 19:48 UTC that negatively impacted the service responsible for sending commit statuses to the Version Control System (VCS) provider.

What Happened

On January 23, 2025, at 19:48 UTC, we deployed a change to how our workflow orchestration service sends events. This change inadvertently modified the value of a key field used by a downstream service responsible for setting commit statuses.

At 20:03 UTC, the team responsible for the downstream service was alerted to an increase in errors when setting commit statuses. The alert auto-resolved without intervention, which delayed our response.

At 20:12 UTC, our support team notified us that customers were experiencing issues with commit status updates. This prompted an investigation. By 20:40 UTC, we had identified and reverted the faulty code change, with customer impact ceasing at 20:43 UTC.

Future Prevention and Process Improvement

We will add more comprehensive testing to cover the events sent by our orchestration service. Additionally, we will implement synthetic tests to catch failures to set commit statuses correctly.
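
As a general illustration of the kind of event testing described above, the following is a minimal sketch of a contract check on an emitted event. It is not CircleCI's actual code: the `WorkflowEvent` shape, its field names, the allowed values, and the `contract_violations` helper are all hypothetical, chosen only to show how a test can fail when a field relied on by a downstream consumer changes value.

```python
# Hypothetical contract check. Every name and value here is an assumption
# made for illustration; none of it is taken from CircleCI's services.
from dataclasses import dataclass
from typing import List

# Assumed set of values a downstream commit-status consumer understands.
ALLOWED_STATUS_VALUES = {"pending", "success", "failure", "error"}
# Assumed identifiers for the supported VCS providers.
ALLOWED_VCS_TYPES = {"github", "bitbucket"}
HEX_DIGITS = set("0123456789abcdef")


@dataclass(frozen=True)
class WorkflowEvent:
    """Assumed shape of an event emitted by an orchestration service."""
    workflow_id: str
    vcs_type: str
    commit_sha: str
    status: str  # stands in for the "key field" a consumer depends on


def contract_violations(event: WorkflowEvent) -> List[str]:
    """Return human-readable contract violations (empty list if none)."""
    problems = []
    if event.status not in ALLOWED_STATUS_VALUES:
        problems.append(f"unexpected status value: {event.status!r}")
    if event.vcs_type not in ALLOWED_VCS_TYPES:
        problems.append(f"unexpected vcs_type: {event.vcs_type!r}")
    sha_ok = len(event.commit_sha) == 40 and set(event.commit_sha) <= HEX_DIGITS
    if not sha_ok:
        problems.append("commit_sha is not a 40-character lowercase hex string")
    return problems
```

A producer-side test would build an event the same way the service does and assert that `contract_violations(event)` returns an empty list, so a change that alters a field's value fails in CI rather than in the downstream consumer.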

We are also investigating why the alert auto-resolved to ensure similar issues are actioned sooner.
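
The cause of the auto-resolution is not stated here, so the following is only a general sketch of one common safeguard, not a description of CircleCI's alerting configuration: an alert that stays open until the error rate has remained below a lower threshold for several consecutive samples. The `ErrorRateAlert` class, its thresholds, and its sample counts are hypothetical.

```python
# Hypothetical alert with hysteresis. Names and thresholds are assumptions
# for illustration only.
from dataclasses import dataclass


@dataclass
class ErrorRateAlert:
    fire_threshold: float   # start firing when error rate >= this value
    clear_threshold: float  # a sample is "healthy" only when rate <= this
    clear_after: int        # consecutive healthy samples needed to resolve
    firing: bool = False
    healthy_streak: int = 0

    def observe(self, error_rate: float) -> bool:
        """Record one sample and return whether the alert is firing."""
        if error_rate >= self.fire_threshold:
            self.firing = True
            self.healthy_streak = 0
        elif self.firing:
            if error_rate <= self.clear_threshold:
                self.healthy_streak += 1
                if self.healthy_streak >= self.clear_after:
                    # Recovery has been sustained long enough to resolve.
                    self.firing = False
                    self.healthy_streak = 0
            else:
                # Still elevated: restart the recovery count.
                self.healthy_streak = 0
        return self.firing
```

With `ErrorRateAlert(fire_threshold=0.05, clear_threshold=0.01, clear_after=3)`, a single quiet sample after a spike does not close the alert; it resolves only after three consecutive samples at or below the clear threshold.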

Although investigation and remediation began promptly once we were notified of the issue, we were slow to initiate our incident protocol. That delayed the status page update and limited the information available to give clear timing in the published update. We are revisiting our incident declaration procedures and tool configuration to clarify when an incident should be declared and to improve response time.

Posted Jan 31, 2025 - 21:39 UTC

Resolved
At 19:48 UTC, some customers' projects may have stopped receiving commit status updates. The incident was resolved at 20:43 UTC. To ensure that the checks are reported correctly, we recommend rerunning the impacted workflows from the start.
Posted Jan 23, 2025 - 20:30 UTC