CrowdStrike blames bug that caused worldwide outage on faulty testing software

CrowdStrike has blamed faulty testing software for a buggy update that crashed 8.5 million Windows machines around the world, the company wrote in a post-incident review (PIR). "Due to a bug in the Content Validator, one of the two [updates] passed validation despite containing problematic data," the company said. It promised a series of new measures to avoid a repeat of the problem.

The massive BSOD (blue screen of death) outage hit companies worldwide, including airlines, broadcasters, the London Stock Exchange and many others. The problem forced Windows machines into a boot loop, and technicians needed local access to each machine to recover it (Apple and Linux machines weren't affected). Many companies, like Delta Air Lines, are still recovering.

To prevent DDoS and other types of attacks, CrowdStrike has a tool called the Falcon Sensor. It ships with content that functions at the kernel level (called Sensor Content) that uses a "Template Type" to define how it defends against threats. If something new comes along, it ships "Rapid Response Content" in the form of "Template Instances."

A Template Type for a new sensor was released on March 5, 2024 and performed as expected. However, on July 19, two new Template Instances were released and one (just 40KB in size) passed validation despite having "problematic data," CrowdStrike said. "When received by the sensor and loaded into the Content Interpreter, [this] resulted in an out-of-bounds memory read triggering an exception. This unexpected exception could not be gracefully handled, resulting in a Windows operating system crash (BSOD)."
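As a rough illustration of the failure mode described above, the sketch below shows how a validator and an interpreter can each guard against an out-of-bounds read. It is not CrowdStrike's code: the names `ContentError`, `validate_instance` and `read_field`, and the idea that an instance is a simple list of fields, are all assumptions made for this example.

```python
from typing import Sequence


class ContentError(Exception):
    """Raised when content does not match what the interpreter expects."""


def validate_instance(fields: Sequence[str], expected_fields: int) -> None:
    # Hypothetical validator check: reject an instance whose field count
    # differs from what the template type declares, before it ever ships.
    if len(fields) != expected_fields:
        raise ContentError(
            f"expected {expected_fields} fields, got {len(fields)}"
        )


def read_field(fields: Sequence[str], index: int) -> str:
    # Hypothetical interpreter check: refuse to read past the end of the
    # data, so bad content becomes a handled error rather than a crash.
    if not 0 <= index < len(fields):
        raise ContentError(f"field index {index} is out of bounds")
    return fields[index]


def safe_load(fields: Sequence[str], expected_fields: int) -> bool:
    # Returns True if the content was accepted, False if it was rejected.
    try:
        validate_instance(fields, expected_fields)
        return True
    except ContentError:
        return False
```

In this sketch a caller would use `safe_load(fields, expected_fields)` and simply skip any update that returns `False`. The point is that either check alone turns malformed content into a recoverable error.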

To prevent a repeat of the incident, CrowdStrike promised to take several measures. First is more thorough testing of Rapid Response Content, including local developer testing, content update and rollback testing, stress testing, stability testing and more. It's also adding validation checks and enhancing error handling.

Furthermore, the company will start using a staggered deployment strategy for Rapid Response Content to avoid a repeat of the global outage. It'll also provide customers greater control over the delivery of such content and provide release notes for updates.
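For readers unfamiliar with the term, a staggered deployment sends an update to a small share of machines first and widens the rollout only if those machines stay healthy. The sketch below is one common way to do that, not CrowdStrike's actual design: the stage percentages, the crash-rate threshold and the function names are all assumptions chosen for illustration.

```python
import hashlib

# Hypothetical rollout stages, as a percentage of the fleet.
STAGES = [1, 10, 50, 100]


def rollout_bucket(host_id: str) -> int:
    # Map each host to a stable bucket from 0 to 99 by hashing its ID,
    # so the same host always lands in the same bucket.
    digest = hashlib.sha256(host_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def should_receive(host_id: str, rollout_percent: int) -> bool:
    # A host gets the update once the rollout percentage passes its bucket.
    return rollout_bucket(host_id) < rollout_percent


def next_stage(current_percent: int, crash_rate: float,
               max_crash_rate: float = 0.001) -> int:
    # Halt the rollout (0 percent) if early recipients crash too often;
    # otherwise advance to the next, larger stage.
    if crash_rate > max_crash_rate:
        return 0
    for stage in STAGES:
        if stage > current_percent:
            return stage
    return 100
```

Because each host's bucket is fixed, any machine included at 10 percent is still included at 50 percent, and a bad update that trips the crash-rate check at the first stage never reaches the rest of the fleet.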

However, some analysts and engineers think the company should have put such measures in place from the get-go. "CrowdStrike must have been aware that these updates are interpreted by the drivers and could lead to problems," engineer Florian Roth posted on X. "They should have implemented a staggered deployment strategy for Rapid Response Content from the start."

This article originally appeared on Engadget at https://www.engadget.com/crowdstrike-blames-bug-that-caused-worldwide-outage-on-faulty-testing-software-120057494.html?src=rss
