Alibaba cloud services unit’s review finds system breakdown a week ago ‘longest major-scale failure’ in Hong Kong and Macau

  • The Alibaba subsidiary’s incident review found that the system failure was caused by ‘a malfunction of the data centre’s chillers’
  • The incident resulted in a lengthy service outage that stretched for more than 24 hours at some customer sites

Alibaba Cloud said the system failure on December 18, 2022, “caused a severe impact on the business of our customers”. Photo: Shutterstock
Tracy Qu in Shanghai
Alibaba Group Holding’s cloud computing services subsidiary has described the system breakdown it experienced on December 18 as “the longest major-scale failure” at its Hong Kong and Macau operations for more than a decade, following a review of the incident.
“We want to apologise to all of the customers affected by this event, and we will handle the compensation as soon as we can,” Alibaba Cloud said in a statement published on its website on Sunday. “This event has caused a severe impact on the business of our customers.”

The incident, which happened at the internet data centre of an Alibaba Cloud partner, resulted in a lengthy service outage that stretched for more than 24 hours at some customer sites. It suspended withdrawals at major cryptocurrency exchange OKX and disabled the website of the Monetary Authority of Macau.

Alibaba Cloud blamed the system failure on “a malfunction of the data centre’s chillers”. It said the problem was detected at 8.56am on December 18, but engineers were not able to fix it quickly.

Racks of servers at a data centre run by Alibaba Cloud, the cloud computing subsidiary of e-commerce giant Alibaba Group Holding. Photo: Handout

The engineers were alerted to “rising corridor temperatures in [the partner facility’s] data centre rooms in Zone C of the China (Hong Kong) region”, Alibaba Cloud said in the statement. “At [2.47pm], the fire sprinkler system of a room was triggered automatically due to high temperature, which increased the difficulty in troubleshooting.”

It took nearly 11 hours for the engine room’s temperature to become stable enough for the engineers to start service recovery.
