Duplicate events in Webhooks EAP feature
Incident Report for Dixa

On February 17th, 2021 at 11.36 am CET, our engineering team deployed a new version of the Dixa Webhook service. Approximately 1 hour later, we received alerts that a large number of webhook events were being queued for processing. During that hour, events from the past two weeks, as well as new events, had been queueing up (and were still doing so). This was caused by a misconfiguration in a service, something we were not aware of at the time.

After identifying the cause of the re-queueing and considering our options, we decided to process all events once more so as not to lose the new events mixed in with the old ones in the queue. By 4.01 pm CET all events, new and old, had been processed again.

We have performed a Root Cause Analysis of this incident and are implementing solutions to prevent similar incidents in the future.

We sincerely apologise for any inconvenience that this may have caused.

Posted Feb 18, 2021 - 13:11 CET

This incident has been resolved.
Posted Feb 17, 2021 - 16:15 CET
We're still processing the backlog and now expect full resolution around 4 pm CET.

Next update at 16h10 CET.
Posted Feb 17, 2021 - 15:41 CET
We continue to monitor the situation and expect all duplicates (and new events) to be processed within half an hour.

Please note that if your webhook endpoints failed to handle the duplicate events, the platform may have disabled your webhooks after a few failed attempts. We will reach out to you if this is the case. If you are using the Webhooks EAP, you can check this yourself under Settings -> Integrations -> Webhooks.

Next update around 15h40 CET.
Posted Feb 17, 2021 - 15:07 CET
We continue to process the backlog and are seeing the queue shrink rapidly. We expect all duplicates to be processed within the next hour, which will resolve this incident.

We'll update you again in half an hour.
Posted Feb 17, 2021 - 14:38 CET
A fix has been implemented.

Unfortunately, during a two-hour period a backlog of "duplicate" events built up. We expect this backlog to be processed automatically over the next hour or two.

We're closely monitoring the situation and will update again in half an hour.

Next update around 14h30 CET.
Posted Feb 17, 2021 - 13:56 CET
Our engineers are implementing the fix. We're already seeing improvement and expect a resolution soon. We continue to monitor the situation.

Next update around 14h CET
Posted Feb 17, 2021 - 13:30 CET
If you're using the Webhooks feature, which is currently in an early access program, you may see some events being received twice by your application, potentially including events already sent and received in the past week(s). The event_id remains the same on both deliveries, so you can use it to deduplicate the events in your application.
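
As a rough illustration of deduplicating on event_id, here is a minimal Python sketch. Only the event_id field comes from this report; the EventDeduplicator class, the handle method, the dictionary-shaped event, and the in-memory set are assumptions made for the example and are not part of the Dixa API.

```python
from typing import Any, Callable, Dict, Set


class EventDeduplicator:
    """Hypothetical helper that tracks event_id values already handled."""

    def __init__(self) -> None:
        # In-memory store for illustration only. Since duplicates may arrive
        # weeks after the original, a real receiver would likely need
        # persistent storage (for example a database table keyed by event_id).
        self._seen: Set[str] = set()

    def handle(
        self,
        event: Dict[str, Any],
        process: Callable[[Dict[str, Any]], None],
    ) -> bool:
        """Run `process` for a new event and skip events seen before.

        Returns True if the event was processed, False if it was a duplicate.
        """
        event_id = event["event_id"]
        if event_id in self._seen:
            return False
        process(event)
        # Record the id only after processing succeeds, so that a failed
        # attempt can still be retried on a later delivery.
        self._seen.add(event_id)
        return True
```

A receiver built along these lines would probably still acknowledge a duplicate delivery as successful rather than reject it, since repeated failed attempts may lead to a webhook being disabled.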

Our engineers have found the cause and we're currently working on a resolution.

We will continue to update you periodically on the progress. Sorry for the inconvenience.

Next update at 13h30.
Posted Feb 17, 2021 - 12:58 CET
We are experiencing duplicate processing of webhooks. We are investigating the matter.

Next update at 13h08.
Posted Feb 17, 2021 - 12:39 CET
This incident affected: Webhooks.