Specs to Consider When Buying a RAID [u]

Posted on by Larry

I need to buy a new RAID for our company server. The one we have is both too slow and too full. I started thinking about this during this year’s NAB as I was talking to a variety of storage vendors about their latest products.

NOTE: A RAID (Redundant Array of Inexpensive Drives) is a collection of hard disks that are stored in a single box, connected via a single cable and, when attached to the computer, act as though they were one very large, very fast hard drive. RAIDs are used when you need more speed or storage capacity than a single hard drive can provide.

Our server is a fairly new Mac Mini with a Thunderbolt port, so I’m looking for a Thunderbolt RAID 5. (Here’s an article that describes what the different RAID levels mean.)

Our office network includes about twelve computers, three of which are wireless. Three wired computers do audio and video editing, while the rest handle standard office and web work.

The network is wired for gigabit Ethernet, as fiber or 10-gig Ethernet is way outside my budget. What this means is that the maximum data transfer rate between the server and wired computers is limited by the gigabit Ethernet Protocol, which is about 110 MB/second after overhead. Wireless devices will be much slower, depending upon which wireless protocol they support.
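As a rough sketch of where that 110 MB/second figure comes from, here is the arithmetic in Python. The function name and the 0.88 efficiency factor are assumptions for illustration; the factor was picked to match the estimate above and is not a measured value.

```python
def payload_mb_per_second(line_rate_megabits=1000, efficiency=0.88):
    """Estimate usable MB/second on an Ethernet link.

    efficiency is an ASSUMED fraction of bandwidth left after
    protocol overhead; real networks will vary.
    """
    raw_mb_per_second = line_rate_megabits / 8  # 8 bits per byte
    return raw_mb_per_second * efficiency


# Gigabit Ethernet with the assumed overhead factor
print(round(payload_mb_per_second()))
```

With no overhead at all, gigabit Ethernet tops out at 125 MB/second, so even a perfect network cannot keep up with a fast direct-attached RAID.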

The genesis of this article came when I started to think about what gear to buy.

NOTE: Your network switch will often determine how efficient your network is. Low-cost switches often can’t support transferring data at full speed to all connected computers at the same time. For this reason, I upgraded to a Cisco SMB switch. It costs more, but avoids bottlenecks.

FACTORS TO CONSIDER

In thinking about this, I realized that there are five main factors to consider when buying a RAID:

  1. Storage capacity
  2. Connection protocol
  3. Data transfer rate
  4. IOPS
  5. RAID controller

The importance of each of these factors changes depending upon your needs. So, let me explain what they are and when you need to consider them for your own system.

STORAGE CAPACITY

The storage capacity of a RAID (or any hard disk) is measured in either Gigabytes or Terabytes. A Gigabyte is 1,024 Megabytes, while a Terabyte is 1,024 Gigabytes.

NOTE: In an effort to make storage numbers more accessible, many technology marketing departments describe a gigabyte as 1,000 MB, or a terabyte as 1,000 GB. However, hard disks don’t read marketing literature. After formatting, RAIDs will always store less data than printed on the packaging.
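To see how big the gap is, here is a minimal sketch that converts a capacity printed on the box (counted in units of 1,000) into the 1,024-based units used above. The names are made up for illustration, and the sketch ignores whatever extra space formatting and RAID parity consume.

```python
BYTES_PER_MARKETED_TB = 1000 ** 4  # how the packaging counts a terabyte
BYTES_PER_BINARY_TB = 1024 ** 4    # how this article counts a terabyte


def binary_terabytes(marketed_tb):
    """Convert a marketed capacity in TB to 1,024-based terabytes."""
    return marketed_tb * BYTES_PER_MARKETED_TB / BYTES_PER_BINARY_TB


# A drive sold as "4 TB" holds noticeably less by the 1,024-based count
print(f"{binary_terabytes(4):.2f}")
```

The shortfall is roughly 9% at the terabyte level, which is why a new drive always looks smaller than its label.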

We are all familiar with picking a hard disk based on storage capacity. If we are storing small files, a smaller capacity is fine. If we are storing large files, more space will be necessary.

As you are deciding what size RAID to buy, keep in mind the adage: “It is impossible to buy a hard disk that is either too big or too fast.” Always buy a bit more than you need.

CONNECTION PROTOCOL

In the past, we had two choices on how to connect a hard drive to our computer:

Those protocols were good for the time, but neither was very fast. In fact, the protocol was slower than the internal speed of a hard disk. (FireWire 800, for example, transfers data at about 85 MB/sec.)

Now, we have several additional new choices:

  1. USB 3
  2. Thunderbolt
  3. Thunderbolt 2
  4. Mini-SAS
  5. eSATA

For Mac users, the last two options, mini-SAS and eSATA, require PCIe cards or conversion boxes. While the protocols themselves are excellent, if you are starting fresh, use Thunderbolt because it is easier to connect and as fast as, or faster than, mini-SAS or eSATA.

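I’ve been told that USB is optimized for smaller files – think office files – while Thunderbolt is optimized for larger files – think media files. When it comes to speed, these new protocols are EXTREMELY fast:

  1. USB 3 has a theoretical speed of 640 MB/second
  2. Thunderbolt 1 has a theoretical speed of 1.1 GB/second
  3. Thunderbolt 2 has a theoretical speed of 2.2 GB/second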

However, as you’ll see in the next section, theoretical speed and actual speed are radically different.

DATA TRANSFER RATE

The data transfer rate is the speed that data travels between the computer and the RAID (or hard disk). For RAIDs and other storage, we measure this in MB/second; higher numbers indicate faster performance.

What you need to understand about standard hard drives – also called “spinning media” – is that a single hard drive can only transfer data at about 120 MB/second. (SSDs, or Flash drives, are much faster and we’ll talk about them in a minute.)

This means that if you need speeds faster than 120 MB/second, you need to group multiple hard drives to work together. This is what a RAID does – it groups a bunch of standard hard disks together so they can transfer data faster.

Just to help you think about this, here are some data transfer speeds of different codecs (actual data transfer rates will vary with image size and frame rate):

For multicam editing, multiply the speed of the codec you are using by the number of cameras you are editing.
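Here is a small sketch of that multiplication, plus a rough count of how many spinning drives it implies. The 20 MB/second codec rate in the example is purely hypothetical, and the drive count assumes speed scales perfectly with each added drive, which is optimistic: it ignores parity, controller overhead and drives slowing down as they fill.

```python
import math

SINGLE_DRIVE_MB_PER_SECOND = 120  # single spinning drive, per the figure above


def multicam_rate(codec_mb_per_second, cameras):
    """Bandwidth needed to play every camera angle at once."""
    return codec_mb_per_second * cameras


def min_drives_needed(required_mb_per_second,
                      per_drive=SINGLE_DRIVE_MB_PER_SECOND):
    """Fewest drives whose combined speed covers the requirement,
    ASSUMING ideal linear scaling across drives."""
    return math.ceil(required_mb_per_second / per_drive)


# Hypothetical example: a 20 MB/second codec shot with 9 cameras
needed = multicam_rate(20, 9)
print(needed, min_drives_needed(needed))
```

Treat the result as a floor, not a shopping list; real RAIDs need headroom beyond the bare minimum.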

Data transfer rate is the most important spec we need to consider for direct-attached RAIDs and drives. However, if we are attaching a RAID to a server, the data transfer rate is determined by the network protocol, in my case gigabit Ethernet, not the speed of the RAID. A fast data transfer rate is important, but not critical.
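Put another way, the slowest link in the chain sets the speed. A minimal sketch of that idea follows; the function name is made up, and the even split between active clients is a simplifying assumption, since a real switch and server will not share bandwidth this neatly.

```python
def rate_per_client(raid_mb_per_second, network_mb_per_second,
                    active_clients=1):
    """Speed each client sees: the slower of RAID and network,
    ASSUMED to be divided evenly among active clients."""
    bottleneck = min(raid_mb_per_second, network_mb_per_second)
    return bottleneck / active_clients


# A 700 MB/second RAID behind a 110 MB/second gigabit link
print(rate_per_client(700, 110))
print(rate_per_client(700, 110, active_clients=2))
```

Doubling the speed of the RAID in this example changes nothing for the clients, because the network is still the limit.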

A SIDE NOTE ON SSD DRIVES

SSD stands for “Solid State Drive.” It takes a bunch of RAM and makes it look like a hard disk to the computer. SSDs provide all the speed of RAM with the permanent memory of spinning media.

The good news is that SSDs are very, VERY fast. The bad news is that they are very expensive and don’t store as much as spinning media.

Depending upon which controller the SSD uses and the type of NAND-based flash memory, SSD drives can attain speeds of more than 1.0 GB per second when playing back a single file sequentially. This is fast enough to fully fill a Thunderbolt 1 connection. However, SSD speeds slow down dramatically when performing random reads and writes, which is what a server requires.

In general, SSDs are an excellent choice for boot drives and, if you can afford them, for media drives that are direct attached. However, for small businesses, a standard hard drive is a better option for server storage because most of that speed is lost when transferring data over the network.

Here’s an excellent article that compares SSD drives with standard hard drives.

IOPS

While the data transfer rate is critical for direct-attached RAIDs, when we are attaching a RAID to a server, a different measurement becomes more important: IOPS (pronounced: “eye opps”). This is the number of Input / Output oPerations per Second the RAID can perform.

With a server, multiple users are accessing different files on the same RAID at the same time. IOPS measures how quickly the RAID can respond to all these different requests.

Since the overall data transfer rate is determined by the network – which is FAR slower than the native data transfer rate of the RAID – we need to concentrate on which RAID can process the greatest number of requests in the least amount of time within the budget that we have to work with.

NOTE: As you might expect, high-performance storage with high IOPS to meet the needs of hundreds of users costs in the tens-of-thousands of dollar range, and fills entire equipment racks with drives. While providing vast performance and storage, they are beyond the budget of most smaller shops, like mine. As with all things, we need to balance performance against budget.

Calculating IOPS involves some tricky math and varies depending upon the RAID level you are using. (Do a Google search for “Calculate IOPS” and you’ll see what I mean.) However, when you are buying a RAID for a server, check its IOPS rating. The higher the IOPS rating, the better the RAID will perform when multiple users are accessing the RAID at the same time.
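To give a flavor of that math, here is a sketch based on a commonly quoted rule of thumb, not on any vendor's method. It estimates the IOPS of one spinning drive from its seek time and rotation speed, then applies a per-RAID-level "write penalty." The penalty table, the function names and the sample numbers are all assumptions for illustration; a published IOPS rating from the manufacturer is a better guide.

```python
# Commonly quoted write penalties per RAID level (rule of thumb only)
WRITE_PENALTY = {0: 1, 1: 2, 5: 4, 6: 6, 10: 2}


def drive_iops(rpm, avg_seek_ms):
    """Rough IOPS for one spinning drive: one operation per
    average seek plus half a rotation."""
    rotational_latency_ms = 0.5 * 60_000 / rpm
    return 1000 / (avg_seek_ms + rotational_latency_ms)


def array_iops(drives, iops_per_drive, read_fraction, raid_level):
    """Rule-of-thumb usable IOPS for an array: reads run at full
    speed, writes are divided by the RAID level's write penalty."""
    raw = drives * iops_per_drive
    write_fraction = 1 - read_fraction
    return raw * read_fraction + raw * write_fraction / WRITE_PENALTY[raid_level]


# Hypothetical example: four 7200 RPM drives, 8.5 ms seek, 70% reads, RAID 5
per_drive = drive_iops(7200, 8.5)
print(round(per_drive), round(array_iops(4, per_drive, 0.7, 5)))
```

The point to take away is that the more writing your users do, the more a parity-based level like RAID 5 eats into the IOPS the drives can deliver.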

RAID CONTROLLER

There are two types of RAID controllers: hardware-based and software-based.

When performance is important, look for a hardware RAID controller. When flexibility is more important, a software RAID controller may be a better choice.

SUMMARY

The media storage industry is in the process of transitioning from older protocols to Thunderbolt 1 and 2. I saw this in all the announcements that were made at NAB in April of this year. Supporting Thunderbolt, or USB 3, means that storage can be attached to any Mac without needing a PCIe card or converter box.

And, for users with a deep investment in PCIe cards, expansion options were offered from a wide variety of vendors, including ATTO, Sonnet, mLogic and Akitio.

However, most of these new products won’t be shipping for a while. So this gives me time to do my research and figure out which drives make the most sense for what I need.

Ultimately, I will be buying two RAIDs: one for the server and one for high-performance direct-attached editing. Given what I’ve learned in researching this article, they won’t be the same product, because they don’t do the same job.

I’ll let you know what I decide. In the meantime, I’m always interested in your thoughts.

UPDATE – April 21, 2014

After I published this article, I realized that I forgot to include a link to other articles I’ve written on storage. There is a wealth of information here that can save you a ton of headaches: Storage Basics – Collected Articles



21 Responses to Specs to Consider When Buying a RAID [u]

  1. Hi Larry,

    Have you done a recent review on Drobo drives? I would like to hear your detailed assessment of them.

    P.S. I don’t work for them, I have worked for clients that use them.

    Phil

  2. Tim Jones says:

    First note – we dropped “Inexpensive” for Independent long ago, so Redundant Array of Independent Disks. Inexpensive only applied during the time when there were two types of drives – drives that were aimed at small systems (inexpensive) and drives that were aimed at midrange to mainframe systems (very expensive).

    Regarding performance, things can look very different from the trenches. Everything that you say about performance in the article is true, but not necessarily correct when sitting in the trenches.

    For example, you mention the theoretical top speeds for the interfaces:

    USB 3 has a theoretical speed of 640 MB/second
    Thunderbolt 1 has a theoretical speed of 1.1 GB/second
    Thunderbolt 2 has a theoretical speed of 2.2 GB/second

    However, in reality users can reliably expect to see performance of:

    USB 3 has a real speed of ~100 MB/second for a SINGLE drive, ~245 MB/sec for an array
    Thunderbolt 1 has a real speed of ~700MB/second when using an array
    Thunderbolt 2 has a real speed of ~1.6 GB/second when using an array

    When referring to IOPS, unless a configuration is being defined to support 100s of users on a very fast connection layer (think Fibre Channel or Infiniband), the level of IOPS created by even 10 or 15 editors accessing shared data is minuscule by comparison to American Express’ data center where IOPS are the real limiting factor of their operations.

    Also, it’s important that users pay close attention to the case of the “B” in MB/Mb and the like. Big “B” means Bytes while little “b” means bits. For safe, easy comparisons, if the number is in little “b” bits, divide by 10 (even though there are 8 bits per byte) for a more real-world Bytes-related number.

    As for hardware versus software-based RAID control, the difference is truly only noticed in very large scale implementations. For most arrays of 8 to 16 disks, either will do a good job for a normal user. The problems that users will run into are more often related to filesystem and interconnect type. Don’t plan on sharing a software based RAID array with more than 2 users. OTOH, a more expensive hardware RAID solution will provide for more robust management and later expansion.

    None of these differences will hamper most editing operations, but it’s important that users know what to expect while their storage is in operation.

    • Larry Jordan says:

      Tim:

      Thanks for writing. I have seen SO many versions of what RAID stands for (Independent/Inexpensive and Drives/Disks/Devices) that I knew whatever I wrote would raise flags somewhere. I’m happy to use Independent in the future.

      You are also VERY correct when you say there is a big difference between theoretical and practical speeds. I could not agree more! The big thing many people don’t understand is that regardless of how you connect a single drive, it will never fill the full bandwidth of USB 3 or any version of Thunderbolt. In general, on a new Mac Pro, I’m seeing faster speeds from Thunderbolt 1 than you are reporting.

      I appreciate your comments on IOPS, this is something I’m still getting my head around.

      However, I disagree with you on speeds from software vs hardware RAID controllers. My experience has been that even with smaller systems of 4 – 8 drives, hardware RAID controllers are more than twice as fast as software controllers.

      By the way, I forgot to list other articles I’ve written about storage that go into more details. I’ll add an update with a link to that article.

      Larry

      • Tim Jones says:

        Understood. The RAID Advisory Board adopted “Redundant Array of Independent Disks” as the official term back in 1994. My suspicion is they are the most correct source :).

        “However, I disagree with you on speeds from software vs hardware RAID controllers. My experience has been that even with smaller systems of 4 – 8 drives, hardware RAID controllers are more than twice as fast as software controllers.”

        We’ve now attacked this in our lab from the perspective of 4 different vendors’ SW-based RAID controllers and the only one that falls short is the Drobo and we relate that more to the 4800/5400RPM drives than their RAID algorithms. Promise is also getting dinged on performance of their RAID when in reality it’s their use of MUCH slower hard drives that causes the performance drop versus other solutions. What we have uncovered is that, assuming all OTHER elements are equal (7200RPM 64MB Cache drives, dedicated SAS I/O channel, more than 3 spindles), we don’t see the fall over in performance using a SW-based RAID algorithm until after 14 – 16 drives.

        • Larry Jordan says:

          Tim:

          1994, hunh? Hmmm… I should catch up on my reading.

          Thanks for the update on software vs hardware RAID controllers. My experience was principally with Drobo, which tends to be desperately slow.

          This is all good stuff – thanks!

          Larry

  3. Robin Harris says:

    Larry,

    Good intro to a complex subject. I own RAIDs from WD, Promise and Drobo and produce videos. A couple of modest amendments:
    – A gigabyte – GB – is officially 1,000 MB. SI, IEC and IEEE have all specified that mega, giga, tera etc. are base 10 metrics. If you want base 2 you need to specify it with prefixes like kibi. Macs started using the correct metric in Snow Leopard. The difference is important because as you go to larger capacities the percentage difference between base 10 and base 2 grows.
    – SSDs use flash memory – like on a USB thumb drive – not DRAM, which is why SSD capacity costs so much less than DRAM. For media users, flash drives are excellent for large files if you can afford them. For servers a flash drive makes a fine boot drive, but because it is costly a RAID array makes more sense for large files.

    One CRITICAL point: RAID arrays and SSDs can and do FAIL. They MUST be backed up just like any other single storage device or you could lose everything. Too many people don’t understand that and suffer enormous heartburn because of it.

    Robin

    • Tim Jones says:

      “One CRITICAL point: RAID arrays and SSDs can and do FAIL. They MUST be backed up just like any other single storage device or you could lose everything. Too many people don’t understand that and suffer enormous heartburn because of it.”

      Amen, brother Robin! Also, the false sense of security of a RAID array doesn’t cover the “oops” factor of “Oops, I didn’t mean to delete that file.” or “Oops, I didn’t mean to overwrite that file.” Only a solid backup can help there.

  4. Frank T says:

    Hi Larry,

    I have a Thunderbolt 2 RAID from Promise on order. I plan on using RAID 5. Will this preclude me from having a back-up drive of my files due to the redundancy on RAID 5? I currently have 2 smaller G-RAIDs, and I copy my files using Goodsynch.

    Thanks,

    Frank

    • Tim Jones says:

      @Frank – RAID-5 only costs you the capacity of one disk and the redundancy is related to a parity/ECC stripe operation. This has no impact on the way that you utilize the resulting volume. What are you trying to accomplish with the new RAID unit?

      • Frank T says:

        Tim, I am buying for speed and extra capacity, but I want to protect my files from a drive failure. I back up my FCPX projects by Duplicating Projects by Snapshot. Am I looking at RAID 5 incorrectly? Thanks

    • Robin Harris says:

      Frank, you can back up – make that MUST back up – from your RAID to another storage device – hard drive, network storage, or another RAID – if the latter has enough capacity.

      The RAID firmware or software handles the redundancy under the covers: all your PC sees is a storage device like any other. Thus the fact that you have a RAID is irrelevant – except for the larger capacity – to backing up.

      Hope this helps. BTW, I’ve been very pleased with my Thunderbolt 1 Promise array.

      Robin

      • Larry Jordan says:

        Frank:

        I agree with everyone. Backups, which make a copy of your files, are essential.

        What a RAID-5 does is guard against losing data when a drive inside the RAID fails. However, there is only one copy of your files on the RAID, which is why a backup is critical.

        Larry

        • Frank T says:

          Tim, Robin, Larry, thanks for your help. A two-part follow-up, if I may: Can I make my 2 smaller G-Raids (8GB and 6GB) into one virtual RAID drive as my back-up, or does it make sense to partition my soon-to-arrive 24gb Pegasus into two partitions for backup purposes? Thanks, Frank

  5. Tim Jones says:

    Not wanting to step on Larry’s toes here, but for more in-depth discussion of RAID and drive speeds, check out my blog on the Cow from last month:

    http://blogs.creativecow.net/Tim-Jones/archive/2014/02

  6. Charles Love says:

    Hi, Mary here. I’m still a newbie at all this, but considering purchase of a RAID drive. Our needs have more to do with flexibility and packability, since we travel frequently between two states. Right now we have a fusion iMac and a Thunderbolt 2TB ( plug-in ) LaCie drive for our media files, which we back up periodically to a 3Tb hard drive (firewire 2) The latter 3T is, sadly, partitioned (half devoted to still images).

    I’d like to devote the 3TB drive exclusively to photo media backup. Does it make sense to replace the 3T LaCie with a mirrored RAID on the grounds that it would take the place of this HD and be constantly backing up? (Though I suppose one could argue we still need the back-up–perhaps to the old 2TB LaCie.) Or does the process of writing simultaneously to 2 drives really slow down FCP X?

    If so, what would be the best configuration? And what’s this about controllers? What is ideal in my situation, where compactness and flexibility is needed. I have no idea what’s involved in setting up nor why required. Sorry!

