this post was submitted on 19 Jul 2024
766 points (94.5% liked)


This isn't a gloat post. In fact, I was completely oblivious to this massive outage until I tried to check my bank balance and it wouldn't log in.

Apparently Visa Paywave, banks, some TV networks, EFTPOS, etc. have gone down. Flights have had to be cancelled as some airlines' systems have also gone down. Gas stations and public transport systems are inoperable, and numerous Windows systems and Microsoft services are affected. (At least according to one of my local MSMs.)

Seems insane to me that one company's messed-up update could cause so much global disruption and take down so many systems :/ This is exactly why centralisation of services, and large corporations gobbling up smaller companies and becoming behemoth services, is so dangerous.

[–] Wereduck@lemmy.blahaj.zone 8 points 1 month ago (4 children)

I get where you are coming from, but this event is pretty much entirely the fault of CrowdStrike and the countless organizations that trusted them. It definitely shows how massive outages are more likely when things are overly centralized and proprietary, and managed by big, shitty, profit-driven organizations. Since CrowdStrike operates in kernel space, it doesn't matter which operating system it's on; it can break it if it does something stupid. In fact they managed to break some Red Hat machines not too long ago, and some Debian machines not long before that. The impact just wasn't as far-reaching as this recent utter fuckup, because fewer critical machines were affected, so we didn't hear about those smaller fuckups in the news.

[–] electricprism@lemmy.ml -2 points 1 month ago (3 children)

Yes, thank you, exactly. The centralized model has its benefits but it also can act as a single point of failure.

If I were going to analyze this from an engineering perspective, I would focus on this: when these inevitable events occur due to human error, do we have adequate tools to roll back updates? Do we snapshot OS drives before updates? Are there adequate Safe Mode or fallback tools to diagnose which files are offending, so the user can remove them?

In my view the Windows user isn't dignified to have the skills or intelligence needed to work around a "setback" issue like the one yesterday.

It doesn't help that NTFS is missing modern capabilities, or that there isn't an easy-to-use diff tool for the layman to understand which files were added to the filesystem that may be causing the breakage.
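To make the "which files were added" idea concrete, here is a rough Python sketch. It is only an illustration under stated assumptions, not a tool that ships with any OS: the function names `snapshot` and `added_files` are made up for this example, and it compares file paths only, not contents or timestamps.

```python
import os


def snapshot(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found


def added_files(before, after):
    """Return the paths present in the later snapshot but not the earlier one."""
    return sorted(after - before)
```

The idea would be to call `snapshot()` on a directory before an update and again afterwards, then pass both sets to `added_files()` to get a list of candidates to inspect or remove.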

To be fair though, even with those potholes filled, the entire design paradigm of Windows as a proprietary platform is part of the problem. Software is not broken up into package modules that can be assembled into a functioning system; it is encumbered with the "anti-piracy" bogeyman, where the software treats the user as an enemy and is designed to break.

Linux isn't like that. I've cloned many distro drives and swapped them into new machines, and with 1 or 2 tweaks they JustWork.

I see many people on the net defending Microsoft as blameless for technical reasons.

My criticism was that Microsoft just sucks, as you interpreted correctly and summarized eloquently. Thank you.

Where I think the entire conversation should move is --

What are the design flaws that allowed this to happen?

"More Rust & less C", I see some people suggest, as this was allegedly a null pointer issue.

And is Windows broken by design? In my opinion, yes.

(Okay, and what to do about it before the next billion dollars is lost. I would think critical infrastructure should have a model similar to NixOS in immutability but that's just my opinion.)

[–] areyouevenreal@lemm.ee 1 points 1 month ago* (last edited 1 month ago) (2 children)

Windows does have a fallback mode called safe mode and that's exactly what's being used to fix this utter mess.

Package management isn't going to save you from this, as it didn't save the Linux systems affected last time. It didn't stop Arch Linux from failing to boot after a GRUB update either.

Windows also has drive cloning tools, that isn't unique to Linux.

NixOS isn't immutable. It's not an A/B root system and / isn't read-only. Rather, it's what's known as reproducible. I am not convinced NixOS would make this any easier either, given how simple the fix was. Funnily enough, though, tools called Ansible and Puppet exist for configuring systems in repeatable ways, and they apply to other Linux systems, Windows systems, and even macOS.

There are like one or two valid points in this whole comment and the rest is pretty much falsehoods and misconceptions.

Edit: Forgot to mention tools exist to make Windows immutable as well. So that is an option.

[–] lemmyreader@lemmy.ml 0 points 1 month ago (1 children)

Windows does have a fallback mode called safe mode and that’s exactly what’s being used to fix this utter mess.

The other fix was reboot your Windows computer at least 15 times.

Package management isn’t going to save you from this as it didn’t save the Linux systems affected last time. It didn’t stop Arch Linux from failing to boot after a Grub update either.

Not everyone was affected though :

How come not everyone was impacted?

Prior to the most recent version, grub only registered the fwsetup if detected support. If your machine detected support, you would have had the fwsetup command registered and the failure wouldn’t occur.

[–] areyouevenreal@lemm.ee 1 points 1 month ago

The other fix was reboot your Windows computer at least 15 times.

How is that an argument against anything I have said?

Not everyone was affected though

Only machines running CrowdStrike were affected, not all Windows machines. So in neither case were all systems affected. In this case, though, Microsoft doesn't bear any responsibility, as they didn't distribute the software. In the case of Arch and EndeavourOS, they had a responsibility to check packages before they shipped them to users. In that case the OS maker was more at fault.