And people wonder why I hate Crowdstrike...

Sky-Knight

I can't make up my mind whether CrowdStrike or SentinelOne has the crappier supply chain checks. It's like the Windows 98 patching days again, all day every day, with these two.


It's a one-liner to fix: del %WINDIR%\System32\drivers\CrowdStrike\C-00000291*.sys

The problem is getting to a console to perform that deletion, and heaven help you if the drive is BitLockered. My NOC is mounting Azure-hosted server disks to another VM to process the delete and get those VMs back online. VMware / Hyper-V hosts, once repaired, come back pretty quickly, because physical console access to the platform gets you to a repair command console fast.

But all the blasted endpoints that must be physically touched... it's an ugly weekend.
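If anyone wants to script that repair-VM workflow, here's a minimal sketch in Python, run from the helper VM. The E:\ drive letter, and the assumption that the broken machine's OS disk is already attached and BitLocker-unlocked, are mine; it just automates the same del shown above.

# Minimal sketch: delete the bad CrowdStrike channel file from a mounted Windows volume.
# Assumes the dead machine's OS disk is attached to this repair VM (e.g. as E:\) and,
# if BitLockered, already unlocked. Hypothetical helper, not an official remediation tool.
import sys
from pathlib import Path

def remove_bad_channel_files(mounted_root: str) -> int:
    drivers_dir = Path(mounted_root) / "Windows" / "System32" / "drivers" / "CrowdStrike"
    if not drivers_dir.is_dir():
        print(f"No CrowdStrike driver folder under {mounted_root}; nothing to do.")
        return 0
    removed = 0
    for f in drivers_dir.glob("C-00000291*.sys"):
        print(f"Deleting {f}")
        f.unlink()  # same effect as the del one-liner above
        removed += 1
    return removed

if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "E:\\"
    count = remove_bad_channel_files(root)
    print(f"Removed {count} file(s). Detach the disk and boot the original VM.")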
 
From the New York Times about all this, with ongoing updates: What Caused Such a Widespread Tech Meltdown?
The article says they will be applying a fix, but will it automatically bring PCs back online, or will we still have to apply the fix ourselves? So far none of my PCs are affected and I haven't heard from any customers, but I'd like to be prepared, just in case.

Another thing that has me curious: did this come through on a Windows Update and, if so, is there a reference to which update it is so it can be removed?
 
The article says they will be applying a fix, but will it automatically bring PCs back online, or will we still have to apply the fix ourselves? So far none of my PCs are affected and I haven't heard from any customers, but I'd like to be prepared, just in case.

Another thing that has me curious: did this come through on a Windows Update and, if so, is there a reference to which update it is so it can be removed?
From what I've seen and heard, you have to apply the fix manually to each computer, due to the boot loop. Can't connect to computers that won't boot. The fix is to remove the file mentioned above.
 
From what I've seen and heard, you have to apply the fix manually to each computer, due to the boot loop. Can't connect to computers that won't boot. The fix is to remove the file mentioned above.
That's what I was afraid of. And if someone's drive is encrypted and they don't have the key, there won't be any fixing it at all, I'm sure. :(
 
Dealing with this at work. It's a specific file in the CrowdStrike update that is the issue. For endpoints, you have to be there physically and rename the file to get the system to boot. I was able to use ESXi to remote into the servers and get them back up. Not an MS patch; specifically a CrowdStrike patch.
 
Correct, you cannot deploy an automated fix to machines that won't boot. The downed machines must be remediated by hand... this is the worst of all possible outcomes.

The "fix" to the back end has been deployed so it won't break again. But you have to remediate the downed equipment anyway.
 
Did this come through on a Windows Update

For the sake of a direct statement, though the answer has already been made: No.

This was the result of an update pushed out by CrowdStrike for its own software, and based on the nature of that software, it has permissions to touch areas that mere mortal applications do not (not unlike virtually any security suite you can name - they've got to have that capability).
 
For the sake of a direct statement, though the answer has already been made: No.

This was the result of an update pushed out by CrowdStrike for its own software, and based on the nature of that software, it has permissions to touch areas that mere mortal applications do not (not unlike virtually any security suite you can name - they've got to have that capability).
Thank you. I was reading up on this more and found out it was a CrowdStrike update and did not come through Windows Update.
 
That is correct, it's a CrowdStrike definitions update that broke it. Windows SHOULDN'T be able to be disabled by any software, much less AV software... but to do its job it has to own things to a high degree, making it dangerous as all anti-malware solutions are. Every product in this space does this every once in a while as a result.

The MS MVP on my SOC team just killed me...

Just tossed into the SOC huddle chat...

#CrowdStroke

Drop the mic, because that just nailed it!
 
It gets better!

So... this is a borked update for CrowdStrike that causes this on Windows platforms right?

The current CEO and co-founder of CrowdStrike is George Kurtz, making him responsible here.

Back in 2010 he was the CTO of McAfee, and oversaw the team managing updates for McAfee Enterprise AV... which also busted the Internet.

This man has overseen the teams that broke the Internet TWICE!

He has literally killed people with piss-poor software update controls TWICE!
 
On a personal level, I bet many folks are already wondering if they made the wrong career decision when they got into IT. As in the person who wrote the buggy code, the supervisor who signed off on it, the QA team that neglected to test it properly, the manager of these teams, and now all the poor suckers around the world scrambling to fix this!
 
As in the person who wrote the buggy code, the supervisor who signed off on it,

But (and it may not be true here, I don't know) it's possible to write something that "takes down the world if it lands on computers with select setups" that works just fine in your in-house developer testing and even, sometimes, all the way through in-house QA.

I long ago let go of the idea that any software company, however dedicated to care and quality, can possibly write anything that will not bork some subset of computers out in the wild. And that's because we know that all sorts of user-induced or adventitious corruptions exist on some subset of computers "in the wild" that simply do not exist on the vast majority of them.

This is why the whole "rollout in waves" method of doing updates started and became the de facto standard. This entire disaster would have been stopped in its tracks, probably at the first (and smallest) wave, if this company had been following standard rollout protocols, with telemetry monitoring after each wave to check that everything is working as expected.

The fact that this was delivered as a flash cut, effectively going out to everyone at the same time, is simply not tenable in this day and age, and whoever signed off on that should end up with their head on a pike!
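For what it's worth, the wave-plus-telemetry idea is simple enough to express in a few lines. The sketch below is purely illustrative: the ring names, sizes, thresholds, and the push_update / healthy_fraction helpers are invented stand-ins, not anything CrowdStrike (or anyone else) actually runs.

import time

# Waves, smallest and most expendable first. Sizes are made up for the example.
RINGS = [
    ("canary", 100),        # internal / dogfood machines
    ("early", 10_000),      # opted-in early adopters
    ("broad", 1_000_000),   # everyone else
]

HEALTH_THRESHOLD = 0.98  # halt if fewer than 98% of updated hosts look healthy
SOAK_SECONDS = 5         # stand-in for an hours-long soak between real waves

def push_update(ring_name: str, max_hosts: int) -> None:
    # Stand-in: a real pipeline would publish the content update to at most
    # max_hosts machines enrolled in this ring.
    print(f"Pushing update to ring '{ring_name}' (cap {max_hosts} hosts)")

def healthy_fraction(ring_name: str) -> float:
    # Stand-in: a real pipeline would query telemetry for the share of updated
    # hosts reporting a clean boot and a sensor heartbeat.
    return 1.0

def staged_rollout() -> None:
    for ring_name, max_hosts in RINGS:
        push_update(ring_name, max_hosts)
        time.sleep(SOAK_SECONDS)  # let telemetry accumulate before judging the wave
        ok = healthy_fraction(ring_name)
        if ok < HEALTH_THRESHOLD:
            raise RuntimeError(f"Halting rollout: only {ok:.1%} of '{ring_name}' hosts are healthy.")
        print(f"Ring '{ring_name}' passed ({ok:.1%} healthy); continuing.")

if __name__ == "__main__":
    staged_rollout()

The whole point is the gate between waves: a flash cut has no such gate, so the first bad wave is also the last.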
 
I bought a few Crowdstrike shares today. Take advantage of the chaos.

Considered this myself. Up until today, shares were looking very healthy. It's a market-leading product in a rapidly growing market. There is a reason so many Fortune 500 companies, banks, airlines, governments, etc. are using CrowdStrike - it's a damn good product, and nothing about today's incident changes that.

Undoubtedly it was a huge balls-up, but it could have happened to any security product on the market. Jumping ship won't make you any less vulnerable to a similar incident happening again.

Assuming CrowdStrike don't get crippled by lawsuits over this incident I expect them to fully recover within a year. Memories are short.
 
Assuming CrowdStrike don't get crippled by lawsuits over this incident I expect them to fully recover within a year. Memories are short.

Agreed, particularly with your final point. LastPass did not curl up and die, and if memories were not very short indeed, it should have.
 
Weird. A friend of mine works for a health authority here in Canada, and two weeks ago their CrowdStrike software updated and did the same thing company-wide: blue screen, reboot, and they had to manually fix each computer. Heads were rolling and they were trying to figure out who was at fault. Now this happens, the exact same thing: they pushed out an update without actually doing some sort of QA before the push, and this time it was global. Oops!
 
The CrowdStrike fault was caused by a definition file full of zeros being pushed into production.

This indicates:
1.) Their production pipeline doesn't have proper unit testing.
2.) They do not have an internal ring zero testing group they hit before they shove updates to the world.
3.) The Falcon sensor software doesn't validate inputs correctly (see the sketch after this list).
4.) The Falcon sensor software doesn't crash gracefully.
5.) The Falcon sensor software therefore cannot automatically recover from a supply chain incident.
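
To illustrate points 3 through 5, here is a hypothetical sketch of "validate the input and fail soft" for a definition file. The magic bytes, file layout, and function names are all invented for the example and have nothing to do with the real channel-file format.

from pathlib import Path

EXPECTED_MAGIC = b"DEFv1\x00"  # invented magic bytes, purely for the example
MIN_SIZE = 64                  # anything smaller than a plausible header is suspect

def is_plausible_definition(path: Path) -> bool:
    data = path.read_bytes()
    if len(data) < MIN_SIZE:
        return False
    if data.count(0) == len(data):     # the "file full of zeros" case
        return False
    if not data.startswith(EXPECTED_MAGIC):
        return False
    return True

def load_definitions(new_file: Path, last_good: Path):
    # Prefer the newest file, but refuse to load garbage, and never take the host down.
    for candidate in (new_file, last_good):
        try:
            if candidate.exists() and is_plausible_definition(candidate):
                return candidate.read_bytes()  # hand off to the real parser from here
        except OSError:
            continue                           # unreadable file: try the next fallback
    return None                                # run degraded rather than crash at boot

Reject the garbage, fall back to the last known-good file, and the worst case is a sensor running on stale definitions instead of a machine that won't boot.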

Most of these lessons were learned in 1999. The CEO and Co-Founder of CrowdStrike made the EXACT SAME MISTAKES with the McAfee Enterprise issue back in 2010.

If you trust this product, I'm sorry, but you're a complete idiot. They just demonstrated they are not to be trusted, and given that this specific CEO should personally know better, I feel he should face criminal charges for the lives he took and the injuries caused by all of the flights and medical procedures cancelled due to his "mistake." Furthermore, the company should be subject to legal liability for all of the repairs required to bring systems back online.

If you're buying stock in this mess, I hope you lose your shirts over it. Sadly, I think we all know he'll walk away with a slap on the wrist if anything, and be allowed to fail again in the future.
 