Partial outage in MT1 cluster

Incident Report for Pusher

Resolved

Between 19:22 and 19:51, approximately 1.2% of messages failed and a considerable number of our clients had to reconnect. Our monitoring system experienced partial downtime, which misled our autoscaling policy. The policy then triggered a scale-down event in the cluster, reducing available capacity.
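
For readers who run their own autoscaling, the failure mode above (a scale-down decided on incomplete monitoring data) can be guarded against in general terms. The following is a minimal illustrative sketch, not a description of Pusher's actual policy or fix. The names `MetricsSnapshot` and `desired_capacity` and the parameters `target_utilization` and `min_coverage` are all hypothetical and chosen only for this example.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class MetricsSnapshot:
    """Hypothetical view of what monitoring reported for one evaluation."""
    expected_nodes: int                # nodes that should be reporting
    reporting_nodes: int               # nodes that actually reported
    avg_utilization: Optional[float]   # 0.0-1.0 over reporting nodes, None if no data


def desired_capacity(current: int, snap: MetricsSnapshot,
                     target_utilization: float = 0.5,
                     min_coverage: float = 0.9) -> int:
    """Return the node count to scale to, refusing to shrink on partial data."""
    # No usable data at all: hold the current size.
    if snap.expected_nodes <= 0 or snap.avg_utilization is None:
        return current

    coverage = snap.reporting_nodes / snap.expected_nodes
    # Simple proportional rule (an assumption for this sketch).
    proposed = max(1, math.ceil(current * snap.avg_utilization / target_utilization))

    # Scale-ups are always allowed; scale-downs need sufficient metric coverage.
    if proposed < current and coverage < min_coverage:
        return current
    return proposed
```

With these assumptions, a cluster of 10 nodes reporting 25% utilization would shrink only if at least 90% of nodes are reporting. With half the nodes missing from monitoring, the size is held instead.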

Our engineering team has taken steps to prevent this from happening again.

This incident has been resolved.
Posted Jun 21, 2023 - 17:30 UTC