Cloudflare Worldwide Outage Caused by Bad Software Deployment

Bleeping Computer – by Sergiu Gatlan

Cloudflare experienced a worldwide outage today that lasted about 30 minutes. The resulting network performance issues brought down a multitude of websites and web services around the world and triggered “502 Bad Gateway” errors.

Although there was speculation that the content delivery network and DDoS mitigation provider was under attack, Cloudflare’s John Graham-Cumming says that the 502 errors seen by visitors to Cloudflare-backed sites were actually caused by a spike in CPU utilization on the provider’s network.

“This CPU spike was caused by a bad software deploy that was rolled back,” adds Graham-Cumming. “Once rolled back the service returned to normal operation and all domains using Cloudflare returned to normal traffic levels.”

“This was not an attack (as some have speculated) and we are incredibly sorry that this incident occurred,” Graham-Cumming also stated in a post on the company’s official blog.

Cloudflare was not under attack

The Cloudflare blog post confirms what the company’s CEO, Matthew Prince, said on Twitter during the network outage:

https://twitter.com/eastdakota/status/1146069084540264449

Prince also tweeted about the network issues Cloudflare experienced during the incident, reassuring customers that the team was “working on getting to the bottom of what’s going on.”

https://twitter.com/eastdakota/status/1146061591143538688

The Cloudflare team updated the incident report on today’s HTTP 502 errors one hour after the initial update and 30 minutes after the network performance issues were fixed, saying:

Major outage impacted all Cloudflare services globally. We saw a massive spike in CPU that caused primary and secondary systems to fall over. We shut down the process that was causing the CPU spike.

Seven minutes later, Cloudflare changed the status of the incident report to Resolved and added the following message: “Cloudflare has resolved the issue and services have resumed normal operation.”

Regions affected by the Cloudflare outage

Second Cloudflare outage in a week

This is the second time within a week that Cloudflare has experienced a network outage, although the BGP route leak that affected it on June 24 was caused by Verizon and Noction.

According to Cloudflare’s CEO, the BGP route leak was quite hard to fix because Cloudflare’s team was unable to contact the Verizon NOC during the outage.

https://twitter.com/eastdakota/status/1143183635731795968

With more than 16 million websites using Cloudflare’s DDoS mitigation, performance enhancement, and various other services, Cloudflare outages usually have a large impact on the Internet as a whole.

https://www.bleepingcomputer.com/news/technology/cloudflare-worldwide-outage-caused-by-bad-software-deployment/

One thought on “Cloudflare Worldwide Outage Caused by Bad Software Deployment”

  1. Trust us with all your most sensitive information because it is secured in the cloud. If you want access, well, you should have known better. We said secured, didn’t we?

    I prefer pen and paper.
