Degraded performance - telephony services

Incident Report for Dixa

Postmortem

Summary 

On the 20th of October 2025 at 6:55 PM CEST, Dixa began experiencing intermittent issues with inbound and outbound telephony connectivity due to an ongoing AWS incident. On some Dixa instances, calls could not be placed, or the connection between agents and users could not be established.

Root cause 

The incident originated from a major AWS service disruption in the us-east-1 region caused by a DNS race condition within Amazon DynamoDB’s endpoint management system. The fault temporarily removed valid DNS records for several AWS services, resulting in widespread API resolution failures across multiple dependent systems.

Dixa uses Twilio’s APIs for telephony services. When DynamoDB’s DNS records were invalidated, Twilio’s regional load balancers lost access to key internal routing data, which caused inbound and outbound voice requests to fail.

Because Twilio’s APIs are globally routed through Twilio’s us-east-1 infrastructure, Dixa’s platform became unable to establish or maintain voice sessions. 

Timeline

At 06:55 pm CEST: We started observing intermittent interruptions with Telephony Inbound and Outbound service. Only some Dixa instances and phone numbers were affected. The incident was reported with the Identified status. 

At 07:52 pm CEST: An update was posted stating that some services were still impacted.

At 08:51 pm CEST: Dixa services no longer showed signs of issues; however, the third-party services were still recovering and minor interruptions could still be expected.

At 09:46 pm CEST: The impacted systems recovered. The incident status was moved to Monitoring.

At 12:56 am CEST on the 21st of October: AWS confirmed that the incident was fully resolved on their end. Dixa’s incident status was changed to Resolved.

Preventive measures 

Dixa will evaluate implementing the following measures to mitigate the risk of similar issues in the future: 

  • Enabling inbound and outbound routing via multi-regional providers in case Twilio’s us-east-1 region fails.
  • Independent external health check monitoring hosted outside of AWS and Twilio.
  • A revisit of service dependency mapping and redundancy.
  • Incident communication redundancy.
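
As an illustration of the first two measures only, the following is a minimal sketch of an external health check combined with region failover. It is not Dixa's or Twilio's actual implementation: the `Region` type, the `http_probe` and `pick_region` functions, and the health URLs are all hypothetical names chosen for this example, and it assumes each region exposes an HTTP health endpoint.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional
import urllib.error
import urllib.request


@dataclass(frozen=True)
class Region:
    name: str        # hypothetical region label, e.g. "primary"
    health_url: str  # hypothetical health endpoint for that region


def http_probe(region: Region, timeout: float = 3.0) -> bool:
    """Return True if the region's health URL answers with an HTTP 2xx status."""
    try:
        with urllib.request.urlopen(region.health_url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failures, timeouts, refused connections, HTTP errors and
        # malformed URLs are all treated as "unhealthy".
        return False


def pick_region(
    regions: Iterable[Region],
    probe: Callable[[Region], bool] = http_probe,
) -> Optional[Region]:
    """Return the first region, in priority order, whose probe succeeds."""
    for region in regions:
        if probe(region):
            return region
    return None  # no healthy region: the caller should raise an alert
```

For the check to be independent, a script like this would need to run on a host outside both AWS and Twilio, so that a failure in either provider cannot silence the monitor itself. The `probe` parameter is injectable so the selection logic can be exercised without network access.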

We sincerely apologise for the inconvenience this has caused.

Posted Oct 28, 2025 - 16:46 CET

Resolved

The issue is now fully resolved.
Posted Oct 21, 2025 - 00:56 CEST

Monitoring

The AWS and Twilio services that caused the intermittent inbound and outbound telephony issues have mostly recovered. We no longer see signs of issues on our service. We are moving the incident to Monitoring status.
Posted Oct 20, 2025 - 21:46 CEST

Update

Inbound and outbound telephony traffic looks stable at the moment. However, the main AWS incident causing this issue is still not fully resolved, so temporary and intermittent issues might still be expected.

We will post the next update within 60 minutes.
Posted Oct 20, 2025 - 20:51 CEST

Update

Some customers might still experience intermittent issues with inbound and outbound calls, because the upstream incident causing these issues is not yet fully resolved.

We will post the next update within 60 minutes.
Posted Oct 20, 2025 - 19:52 CEST

Identified

We continue to see interruptions with the telephony service. Inbound and outbound traffic on some numbers might be affected. The issue is caused by a problem with AWS services (https://health.aws.amazon.com/health/status) and with services affected by it, such as Twilio (https://status.twilio.com/). We will keep monitoring the status of these issues and adding updates to the status page.

We will publish the next update in 60 minutes.
Posted Oct 20, 2025 - 18:55 CEST
This incident affected: Telephony & SMS (Inbound, Outbound).