Alibaba cloud services unit’s review finds system breakdown a week ago ‘longest major-scale failure’ in Hong Kong and Macau
- The Alibaba subsidiary’s incident review found that the system failure was caused by ‘a malfunction of the data centre’s chillers’
- The incident resulted in a lengthy service outage that stretched for more than 24 hours at some customer sites
The incident, which happened at the internet data centre of an Alibaba Cloud partner, resulted in a lengthy service outage that stretched for more than 24 hours at some customer sites. It suspended withdrawals at major cryptocurrency exchange OKX and disabled the website of the Monetary Authority of Macau.
Alibaba Cloud attributed the system failure to “a malfunction of the data centre’s chillers”. It said the problem was detected at 8.56am on December 18, but engineers were unable to fix it quickly.
The engineers were alerted to “rising corridor temperatures in [the partner facility’s] data centre rooms in Zone C of the China (Hong Kong) region”, Alibaba Cloud said in the statement. “At [2.47pm], the fire sprinkler system of a room was triggered automatically due to high temperature, which increased the difficulty in troubleshooting.”
It took nearly 11 hours for the engine room’s temperature to become stable enough for the engineers to start service recovery.