Between 10:35 to 15:31 CET on 11/05/2022, the agent interface presented intermittent slowness. Agents were forced to reload the interface after handling each conversation.
Our support team triggered the incident management process following the client's reports.
Our incident manager involved the relevant engineering teams and management at 10:35hs, and an active investigation began.
After further examination, it was determined at 10:58hs, and our team decided to roll back a recent deployment to the Agent interface.
At 11:23hs, the roll-back was completed, and we started to see improvements in all services.
From 11:40 hs, we continued to closely observe until customers and our monitoring systems confirmed full recovery and system stability.
At 12:11hs, a permanent fix was deployed.
We moved into the monitoring phase until 15:31hs.
To prevent this from happening in the future, we are working on improving the automatic alerts on our internal system.
Timeline (all times are in CET):
10:30 We started to get the first reports from our customers.
10:35 Incident management process triggered. Our status page was updated.
11:01 Issue was identified
11:23 Fix implemented by our engineering team. We started to see improvements in the services.
11:40 Monitoring phase. The system was operating normally.
12:11 A permanent fix was deployed.
15:31 Case has been resolved.