Database cluster failure
Incident Report for ClickMeter
Resolved
We have identified an issue affecting our database cluster. One of our database instances in the cloud reached an inconsistent state and was automatically excluded from the cluster by our health checks. We expect this event to degrade the performance of the Redirection and API services, and ultimately the Dashboard experience, until the cluster is restored to its original size.

No data was lost in the event, and data consistency is still guaranteed: each record was either written to all instances or not written at all. We had to temporarily interrupt the service to make sure this critical event in our database architecture introduced no inconsistency. We experienced severe issues rebooting one of the remaining database server processes, which made the interruption last much longer than we expected, for a total of 33 minutes of downtime in the Redirection and API services. After making a necessary change to the database configuration to reflect the new cluster replication status, we were finally able to restore all services.
Posted Apr 28, 2022 - 12:00 UTC