Summary: On March 11, 2026, Dixa experienced platform-wide degraded performance lasting approximately 4 hours (08:30 - 12:35 CET). Customers experienced slow or failed conversation loading, as well as timeouts on email sending, conversation transfers, assignments, and flow processing. No data was lost, and there were no security issues at any point.
Impact:
- Availability: Platform-wide slowness and partial inaccessibility for ~4 hours.
- Affected functionality: Conversation loading, email sending, conversation transfers, conversation assignments, and flow processing all experienced significant slowness and intermittent failures.
- Data integrity: All emails were fully processed after the fix. No data was lost, and no security issues occurred at any point.
Root Cause: The incident was caused by an atypical traffic pattern in inbound email processing that triggered repeated internal retries. Retries are a normal part of email distribution, accounting for factors such as sending delays and server availability, but in this case the retry volume grew exponentially. The sustained retry volume placed excessive load on a central platform component, causing cascading timeouts across dependent services and resulting in platform-wide degradation.
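To illustrate why unbounded retries can overwhelm a shared component, here is a minimal sketch of retry amplification. It does not describe Dixa's actual email pipeline: the function name, the fan-out factor, and the numbers are all hypothetical, chosen only to show how retry volume compounds when each failed attempt triggers more than one follow-up attempt.

```python
def retry_volume(initial_failures: int, fanout: int, rounds: int) -> list[int]:
    """Attempts per round when every failed attempt spawns `fanout` new attempts.

    Hypothetical model for illustration only; it assumes every attempt fails.
    """
    volumes = [initial_failures]
    for _ in range(rounds):
        volumes.append(volumes[-1] * fanout)
    return volumes


# With a fan-out of 1 the retry volume stays flat; with a fan-out of 2 it doubles each round.
flat = retry_volume(initial_failures=10, fanout=1, rounds=5)
doubling = retry_volume(initial_failures=10, fanout=2, rounds=5)
```

In this toy model, ten failing messages turn into 320 attempts per round after five rounds of doubling, while the same messages with a fan-out of 1 stay at ten attempts per round.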
Timeline (CET):
- Mar 11, 06:00 - First signs of email processing errors detected
- Mar 11, 08:30 - Platform degradation begins; customer impact starts
- Mar 11, 10:50 - First mitigation deployed; partial improvement
- Mar 11, 12:30 - Root cause fully identified; final fix applied
- Mar 11, 12:35 - Platform stability confirmed
Resolution: We identified and addressed the source of the abnormal email volume, which immediately reduced error rates and allowed the platform to recover.
What We Have Done Since This Incident: We have already implemented the following improvements:
- Added validation to reject invalid email addresses early in the pipeline, preventing them from entering retry loops.
- Optimised internal lookups to fetch only necessary data instead of the full conversation history, significantly reducing load during email processing.
- Added deduplication logic to prevent redundant data fetches during email processing.
- Enforced concurrency limits: platform components now shed excess traffic when saturated, allowing requests to be redistributed rather than queued indefinitely.
- Added deadline checking: expired requests are now discarded immediately instead of consuming resources on work that is no longer needed.
- Reduced internal timeout thresholds to fail fast under contention rather than blocking for extended periods.
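The concurrency-limit and deadline-checking items above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Dixa's implementation: `LimitedHandler`, `Request`, `Overloaded`, and `DeadlineExceeded` are hypothetical names, and the sketch assumes each request carries an absolute deadline measured with `time.monotonic()`.

```python
import threading
import time
from dataclasses import dataclass
from typing import Callable


class Overloaded(Exception):
    """Raised when the component is saturated and sheds the request."""


class DeadlineExceeded(Exception):
    """Raised when a request's deadline has passed before work starts."""


@dataclass
class Request:
    payload: str
    deadline: float  # absolute time.monotonic() value (assumed convention)


class LimitedHandler:
    """Hypothetical wrapper that sheds load instead of queueing it."""

    def __init__(self, work: Callable[[str], str], max_concurrent: int) -> None:
        self._work = work
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def handle(self, request: Request) -> str:
        # Deadline check: discard work that nobody is waiting for any more.
        if time.monotonic() >= request.deadline:
            raise DeadlineExceeded(request.payload)
        # Concurrency limit: refuse immediately rather than block in a queue.
        if not self._slots.acquire(blocking=False):
            raise Overloaded(request.payload)
        try:
            return self._work(request.payload)
        finally:
            self._slots.release()
```

A caller that receives `Overloaded` can send the request to another instance or retry later, which is the redistribution behaviour described above. The non-blocking `acquire` is the key design choice here: a saturated component answers quickly with a rejection instead of accumulating a backlog.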
What We're Continuing to Work On:
- Loop detection and interruption - Introduce mechanisms to detect and automatically halt email processing anomalies before they can accumulate significant load.
- Improved alerting and escalation - Ensure processing anomalies are detected and escalated with appropriate urgency.
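One possible shape for the loop detection described above is a sliding-window counter per message. The sketch below is an assumption about how such a mechanism could look, not a description of planned work: `RetryLoopDetector`, its thresholds, and the idea of keying on a message ID are all hypothetical.

```python
from collections import defaultdict, deque


class RetryLoopDetector:
    """Hypothetical detector: flags a message retried too often in a time window."""

    def __init__(self, max_attempts: int, window_seconds: float) -> None:
        self._max_attempts = max_attempts
        self._window = window_seconds
        self._attempts: dict[str, deque[float]] = defaultdict(deque)

    def record_attempt(self, message_id: str, now: float) -> bool:
        """Record one attempt; return True if processing may continue."""
        attempts = self._attempts[message_id]
        # Forget attempts that have fallen out of the sliding window.
        while attempts and now - attempts[0] >= self._window:
            attempts.popleft()
        attempts.append(now)
        # False means this message is looping and should be halted and escalated.
        return len(attempts) <= self._max_attempts
```

A pipeline could call `record_attempt` before each retry and, on a `False` result, stop retrying that message and raise an alert. Passing `now` explicitly keeps the detector easy to test with fixed timestamps.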
Closing Note: We sincerely apologize for the disruption this caused. These improvements are our highest priority. If you have any questions, please reach out to friends@dixa.com.