Would you RAID 1 an SSD?

I recently migrated a small business (3-4 desks) from a very old Seagate 2-drive RAID 1 (55,000+ hours) to a single SSD (Samsung 850 EVO). I'm having some second thoughts about not setting them back up in RAID 1 with a second SSD, but I figured that kind of overkill isn't needed given the reliability of the SSD. I do have them doing nightly backups to both a WD Black spinner and the cloud. Still........second thoughts. (No domain. Peer-to-peer sharing from a standalone "server" Dell desktop.)

Thoughts?
 
I am a big fan of redundancy, especially for businesses and other critical users. Always try to negate the single point of failure. SSDs are less prone to failure compared to their electro-mechanical counterparts, but they will eventually break.

Mirroring drives makes sense. The only drawback is slightly longer write times, since the data is written twice. Since you have the data backed up to an HDD and the cloud, your client's data is safe, but they can still potentially lose a day's work when the inevitable happens.

Let the client know the facts and you've done your job.
 
At first glance I would say RAID 1 isn't a big concern in this scenario. Nice to have, but not essential.

Reasoning:
No domain, no databases, no LOB apps etc... you can rebuild that thing from a backup in no time.


But ultimately it comes down to the client's expectations. How much do they lose if their server goes down for 4 hours, or a full day?
 
I do RAID 1 (or 10) SSDs every time in servers.

It's more about business continuity than reliability .... not to mention my own peace of mind. I'd much rather receive an alert from a server informing me that I need to replace a drive (at my convenience) than a panic call from a customer asking me to get over there pronto.
 
The reason to do RAID has nothing to do with being SSD or HDD. It is about setting up redundancy to avoid downtime. If the argument is valid for spinning disks, then it is just as valid for SSD units. RAID of some sort is pretty much assumed in a server.
 
Striped RAIDs, however, can burn out SSDs. Modern controllers have features to help mitigate this, but I still only use RAID 1 and RAID 10 with SSDs.

Oh, and RAID only with server-grade SSDs. The weakest drive I'm willing to tolerate is a Samsung Pro, and even that's not a server-grade disk!

The real problem is that SSD faults are even more clustered than HDD faults. So if you've got a server full of drives you bought at the same time, you can bet the NAND is going to fault on them all at about the same time. The risk of cascade failure is greater on an SSD array.

Datto all the things... server goes down and the Datto VMs it back online.
 
For what it is worth, on my own PCs, I do RAID1 of SSDs, for continuity reasons mostly.

The web servers I rent also have dual SSDs in RAID 1; that's fairly cheap nowadays.
 
Striped RAIDs, however, can burn out SSDs.

Any reference or source for that? Because I've been running LSI with CacheCade for a long time, and the CacheCade is essentially a RAID0 of SSDs (I would typically have four), and never a problem. That was actually an LSI-approved setup.
 
Did... someone just ask me to Google something in a tech forum?

Also, define "long time"; if it's not 8-9 years, that phrase doesn't qualify.

I will however remind you that Dell sells Intel server-grade SSDs. They come in three types: read intensive, write intensive, and mixed use. Read intensive drives are expected to live for 3 years if the entire disk's capacity is written once per day. Mixed use is 3 times per day, and write intensive is 5x per day. It's no accident that they sell these drives, and ONLY these drives, to server customers.

Stripes amplify read/write operations. The more of those you have, the shorter the life you get out of an SSD. That's the way SSDs work; why is this surprising? That doesn't mean the things are going to just burn up in a ball of fire. I've got six 400GB mixed-use SSDs here in my server in a RAID 10. Each disk would need 3 x 400GB of data slammed into it every day to cook in 3 years; my I/O loads are nowhere near that, so I'll probably get 20 years out of these things at the rate I'm consuming the disks. I'd give you a better idea of how long they'd actually last, but the things are 7 months old and still reading 100% remaining write endurance.
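If you want the back-of-the-envelope version, here's that math as a quick Python sketch. It only uses the figures from this post (400GB, 3 drive writes per day, 3-year warranty, six disks in RAID 10); the daily host write load is a made-up number purely for illustration, not a measurement.

```python
# Rough endurance math for the array described above: six 400 GB mixed-use
# SSDs rated (per the figures in this post) at 3 drive writes per day over
# a 3-year warranty. The daily host load below is a made-up illustration.

capacity_gb = 400        # per-disk capacity
dwpd = 3                 # mixed-use rating: 3 full drive writes per day
warranty_years = 3
disks = 6                # RAID 10: three mirrored pairs striped together

# Total rated write endurance per disk over the warranty period (TB written)
rated_tbw = capacity_gb * dwpd * warranty_years * 365 / 1000
print(f"Rated endurance per disk: ~{rated_tbw:.0f} TB written")

# The stripe spreads host writes across the three pairs, and each pair
# mirrors them, so each disk sees roughly (host writes / number of pairs).
host_gb_per_day = 1000                        # hypothetical daily write load
per_disk_gb_per_day = host_gb_per_day / (disks / 2)

years_to_rated_limit = rated_tbw * 1000 / (per_disk_gb_per_day * 365)
print(f"At {host_gb_per_day} GB/day of host writes, the rated endurance "
      f"lasts ~{years_to_rated_limit:.0f} years per disk")
```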

RAID 0 is a bit of a special case because you're not using a parity stripe, but you are consuming multiple disks to read or write a single file. The performance has got to be staggering, but it is burning multiple SSDs at once to gain that performance.

Just use an SSD aware controller that can read the health of the disks and set the controller to alert you when a disk gets weak. Also, don't be surprised when all the disks in the array fail at about the same time, that just means your controller worked properly.

Also, you're just using SSDs as a cache, which is a long-standing and well-supported feature. I've worked with people who have had to replace those cache disks every six months because they read/write that much through that controller! It all depends on load.
 
I don't know, I've never actually run a RAID with SSDs. When I have an issue with an SSD it's always something like the system freezing or a mysterious blue screen. I wonder, would a RAID 1 config stop that or simply multiply the chances of it happening?
 
Stripes amplify read/write operations.

The controller (whether hardware or software) does not write the entire stripe if you only request to write a single sector. If you request to write a single sector, and there happens to be no caching (no batching of adjacent writes), it will write a single sector (plus whatever parity updates are required if you have a parity RAID). The same applies to reads: if you ask for one sector, the controller will fetch that one sector (assuming no read errors), unless otherwise directed by read-ahead policy. However, the same read-ahead policy applies to single-disk operations, so nothing really interesting there.
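Here is a minimal sketch of that mapping, assuming a plain RAID 0 layout with a 64 KiB chunk size (both numbers are assumptions for illustration): a request for one sector resolves to exactly one member disk.

```python
SECTOR = 512           # bytes per logical sector
CHUNK = 64 * 1024      # stripe unit ("chunk") size; 64 KiB assumed
DISKS = 2              # RAID 0 members

def locate(lba: int) -> tuple[int, int]:
    """Map a logical sector to (member disk, sector offset on that member)."""
    byte_offset = lba * SECTOR
    chunk_index = byte_offset // CHUNK          # which chunk of the array
    disk = chunk_index % DISKS                  # chunks rotate across members
    row = chunk_index // DISKS                  # how far down each member we are
    offset_in_chunk = byte_offset % CHUNK
    return disk, (row * CHUNK + offset_in_chunk) // SECTOR

# A single-sector write touches exactly one member disk:
print(locate(0))     # (0, 0)  -> only disk 0 is written
print(locate(128))   # (1, 0)  -> 64 KiB in, so only disk 1 is written
```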

In a RAID0, read amplification is nil and write amplification is nil.
In a RAID10, reads per disk are decreased by a factor of two, and write amplification is nil.
In a RAID5, read amplification is nil. Write amplification is in the best case nil (full-row write) and in the worst case 2x write amplification plus 2x additional reads per write (does the SSD even care about that?). How this plays out in actual use depends on the write profile (sequential writes or random writes). In practice there is nothing really staggering here, unless caching fails to mitigate the effect of random writes (in which case you get 2x write amplification).
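To put numbers on that RAID5 worst case, here is a sketch of the usual read-modify-write bookkeeping, assuming a small write that touches a single data chunk and parity updated as new_parity = old_parity XOR old_data XOR new_data:

```python
def raid5_small_write() -> tuple[int, int]:
    """Device I/O for a sub-stripe RAID 5 write via read-modify-write."""
    reads = 2    # read old data chunk + read old parity chunk
    writes = 2   # write new data chunk + write new parity chunk
    return reads, writes

def raid5_full_stripe_write(data_disks: int) -> tuple[int, int]:
    """Device I/O for a full-row write: parity computed from the new data alone."""
    reads = 0
    writes = data_disks + 1   # every data chunk plus the row's parity chunk
    return reads, writes

print(raid5_small_write())          # (2, 2): one host write becomes 2 reads + 2 writes
print(raid5_full_stripe_write(3))   # (0, 4): 4-disk array, no extra reads at all
```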

If someone has to replace caches every six months or so, they are really pushing large amounts of data, or alternatively they need to change their supplier.
 
Is part of this discussion hinging on the minimum amount of data that can be written to an SSD in a single operation, even if only 1 byte has changed?

What's the size of the smallest block that can be written to SSDs these days?
 
Is part of this discussion hinging on the minimum amount of data that can be written to an SSD in a single operation, even if only 1 byte has changed?

The host controller sends 512-byte sectors to SSDs. This applies to a standard SATA controller, a RAID controller, or RAID software, all the same. The SSD then makes a great effort to minimize write (and erase) overhead, but it is the same for any type of host controller, because this process (called wear levelling) is fully hidden inside the SSD and not visible to the outside world (except for TRIM, but that does not seem relevant here).
 
In a RAID0, read amplification is nil and write amplification is nil.
In a RAID10, reads per disk are decreased by a factor of two, and write amplification is nil.
In a RAID5, read amplification is nil. Write amplification is in the best case nil (full-row write) and in the worst case 2x write amplification plus 2x additional reads per write (does the SSD even care about that?). How this plays out in actual use depends on the write profile (sequential writes or random writes). In practice there is nothing really staggering here, unless caching fails to mitigate the effect of random writes (in which case you get 2x write amplification).

If someone has to replace caches every six months or so, they are really pushing large amounts of data, or alternatively they need to change their supplier.

I'm sorry, but that's demonstrably false. In RAID 0, any file larger than the stripe size will be read from all disks that contain it. This read operation imposes slight wear on all the SSDs asked to read. This will cause RAID 0 SSDs to degrade slightly faster than non-RAID SSDs of the same type.

RAID10 has the same issue, but because it's not using a parity stripe the two disks in each mirrored pair will wear evenly.

In RAID 5, every disk that contains the file, including the parity bits, is read every time a given file is requested. You can call them blocks because they contain bits of files, but all of this juggling is done by the controller. RAID 6 is even worse because it adds yet another parity stripe to keep tabs on.

With SSD technology, reads aren't free, but almost; writes are very expensive.

But honestly, SSD technology has absorbed so much of this into the drive that we don't really have to think about it. All we really need to concern ourselves with is the write endurance of the disk against the load that's going to be put on the array. SSD reliability has reached a point where I'm honestly wondering if RAID for any purpose other than performance has any value. Any RAID controller today is going to spread the work over all the disks to ensure even wear, and the disk is going to do its own thing to maintain itself internally. So the reality is, you're looking at a bunch of devices with no moving parts, and therefore no unpredictable failure characteristics, all designed to work as a team and therefore wearing evenly. Just like the brakes on your car: you don't replace just one pad, you do all four on the axle.

I do wonder, when we actually have disk faults with SSDs, about the risk of a cascade fault through the entire array. If all the disks fail, the array is obviously dead. If the disks are failing predictably, with software alerts cluing us in... do we even need RAID anymore? Sure, an SSD can go bad mid-run, but it's extremely rare. It's very similar to having a memory stick go bad outside of the first month of operation. RAID 10 could be made to work and offset all the risks, but to do so the second half of the mirror would need to be made of different disks than the first. I know of zero people who do this when deploying servers; we just stuff in the drives and set the thing up.

TL;DR: SSDs may have invalidated RAID for any purpose other than obtaining a single larger logical volume to store stuff. But I'm not crazy enough to risk my customers on that, so I stick to old tried-and-true methods until such time as a decade or two's worth of actual use proves we can safely remove the RAID.
 
In RAID 0, any file larger than the stripe size will be read from all disks that contain it. This read operation imposes slight wear on all the SSDs asked to read. This will cause RAID 0 SSDs to degrade slightly faster than non-RAID SSDs of the same type.

Not really. Let's compare a single disk and a two-disk RAID0 with a 64K stripe size, and write a 128K file to each.

A single disk will get 128K of writes.
RAID0 members will get 64K each, for a 128K total. That's the same wear, just evenly distributed across the disks. There is no amplification of any kind; the same wear simply gets distributed across all RAID0 members.
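The same example in sketch form (two-member RAID 0, 64 KiB chunks assumed); the per-member totals just add back up to the single-disk figure:

```python
CHUNK = 64 * 1024   # 64 KiB stripe unit, as in the example above

def bytes_per_member(total_bytes: int, members: int) -> list[int]:
    """Tally how much of a sequential write lands on each RAID 0 member."""
    written = [0] * members
    full_chunks, remainder = divmod(total_bytes, CHUNK)
    for i in range(full_chunks):
        written[i % members] += CHUNK            # chunks rotate across members
    if remainder:
        written[full_chunks % members] += remainder
    return written

print(bytes_per_member(128 * 1024, 1))   # [131072]        single disk
print(bytes_per_member(128 * 1024, 2))   # [65536, 65536]  same total, split in two
```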

In RAID 5, every disk that contains the file, including the parity bits, is read every time a given file is requested.

No, not really. Only the disks which contain the requested parts of the file will be accessed. Again, it will be the same total wear, just distributed across several member disks. If you ask for a single sector, only one disk is accessed. Parity is never read in normal operation; it is only read if a data read fails, or if resilvering is in progress.
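A sketch of why parity stays out of the read path when the array is healthy, assuming a simple rotating-parity layout (real implementations offer several layout variants, so treat the exact rotation as illustrative):

```python
DISKS = 4                  # RAID 5 members
DATA_PER_ROW = DISKS - 1   # each row holds DISKS-1 data chunks plus one parity chunk

def raid5_locate(data_chunk: int) -> tuple[int, int]:
    """Map a logical data chunk to (disk holding it, disk holding that row's parity)."""
    row = data_chunk // DATA_PER_ROW
    parity_disk = (DISKS - 1) - (row % DISKS)               # parity rotates one disk per row
    slot = data_chunk % DATA_PER_ROW
    data_disk = slot if slot < parity_disk else slot + 1    # data slots skip the parity disk
    return data_disk, parity_disk

# A healthy read of chunk 5 touches only its data disk; the parity disk stays idle.
data_disk, parity_disk = raid5_locate(5)
print(f"chunk 5 -> data on disk {data_disk}, parity for that row on disk {parity_disk}")
```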

If you want that from the source, the only one I can think of is the MD driver, which has open-source implementations of all this. The rest, I'm afraid, requires NDAs or actually working for one of the companies, but they are pretty much all the same, at least up to RAID5.
 
Hmm... no, you're right, the load is spread over the disks. So the only thing we're really doing in RAID is making sure that all the disks wear out at about the same time. Nothing is being amplified here unless the process of committing the data to the NAND via a stripe misaligns with the block commit the SSD has to do. If that alignment is off you end up with write amplification at the disk, but as far as I'm aware modern SSDs and controllers deal with this.

You said most controllers send 512k blocks to the disk for writing. If you've got a 64k stripe, the disk can potentially commit 8 stripe changes at once, a nice even division. But if you tried a stripe size that didn't divide evenly into the block commit, I think that's when things get ugly... I never go down that road because I let the controller default that stuff.

I mean, you do end up with more total writes using a parity-based stripe, but that's not accounting for much.
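A hedged sketch of that alignment worry: the SSD's internal program (page) size isn't something the drive exposes, so the 16 KiB figure below is purely an assumption. The point is just that a chunk size the pages don't divide evenly gets padded out to whole pages.

```python
import math

def page_padding_factor(chunk_bytes: int, page_bytes: int) -> float:
    """How much larger the NAND program is than the chunk write that triggered it."""
    pages = math.ceil(chunk_bytes / page_bytes)
    return pages * page_bytes / chunk_bytes

print(page_padding_factor(64 * 1024, 16 * 1024))   # 1.0   -> 64 KiB chunk fits 16 KiB pages exactly
print(page_padding_factor(24 * 1024, 16 * 1024))   # ~1.33 -> an awkward chunk size pads a third page
```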
 