Storage is like your heart. You take it for granted until something goes wrong. At which point, nothing else matters.
I’ve been thinking a lot about storage recently. Two days ago, I presented a webinar on multicam editing in Final Cut Pro X. Nothing demands more from your storage system than multicam editing. Then, yesterday, I attended the Creative Storage Conference in Los Angeles. Finally, today, our podcast – Digital Production Buzz – spent the entire show talking about backups and archiving.
It is impossible, in a blog, to summarize the entire state of storage today. (I’m not sure I could even do it in a book – the subject is so vast and changes daily.) So, think of this as a snapshot of where we are today, along with trends that media professionals need to pay attention to in the near future.
WHAT WE NEED TO CARE ABOUT
In general, we will spend far more for storage than we will spend for a computer. Because storage is so essential, make sure to pick storage for its performance, not just its ability to meet your budget. Storage that is too slow or too small is just wasted money.
When it comes to storage, there are two types of data that are stored to a hard disk: structured and unstructured. Structured data means databases, where the files are organized in precise ways. Everything else is unstructured data. Within the unstructured group are large files, like media, and smaller files, like word processing documents.
When it comes to handling large, unstructured media files, three specs are most important: capacity, bandwidth and seek time.
If you are doing standard media editing, the first two – capacity and bandwidth – are the most important. When we start doing multicam editing, seek times become relevant. The more cameras you edit at the same time, the faster the seek times need to be to keep up with playback.
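To see why multicam editing pushes storage so hard, here is a rough Python sketch of the bandwidth arithmetic. It assumes every camera angle is read from storage at the same time and at the same data rate; the 150 Mbit/s per-stream figure is a made-up placeholder, so substitute the actual data rate of your codec.

```python
def required_bandwidth_mb_per_s(camera_angles: int, mbits_per_stream: float) -> float:
    """Sustained read rate, in megabytes per second, needed to play every angle at once."""
    return camera_angles * mbits_per_stream / 8  # 8 bits per byte


# Hypothetical example: each angle is a 150 Mbit/s stream (an assumed figure).
for angles in (1, 4, 9):
    rate = required_bandwidth_mb_per_s(angles, 150.0)
    print(f"{angles} angle(s): {rate} MB/s")
```

The bandwidth requirement grows in a straight line with the number of angles. This sketch ignores seek time, which becomes the second bottleneck on spinning media as the angle count climbs.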
There are four broad categories of storage technology that we can use:
There are also RAIDs, which are collections of either spinning media or SSD drives, and hybrid systems, which are a combination of two or more of these technologies.
Though some were projecting that spinning media was hitting a wall in capacity, traditional hard disks continue to hold staggeringly large amounts of data and are projected to grow for the next several years.
However, that doesn’t mean that drives last forever. They don’t. And drives from different manufacturers die at different times. BackBlaze provides Cloud-based backup services (more on that in a minute). Recently, they published a blog post detailing failure rates for various hard drives in capacities from 3 TB to 8 TB, based on the 82,000 drives they use in their server farms. The results are fascinating and worth reading.
NOTE: You can also download their statistics to analyze further, if you want.
In general, for video editing, I recommend at least a two-drive RAID, configured as RAID 0 for best performance. Ideally, if budgets permit, a 4-drive RAID 5 (or a RAID 5 with more drives) will deliver high-capacity, high-bandwidth performance along with data redundancy in case one drive dies.
NOTE: Here’s an article explaining the differences in RAID levels.
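The trade-off between RAID 0 and RAID 5 comes down to simple arithmetic. The Python sketch below covers only those two levels and assumes every drive in the set is the same size; the function names are mine, not part of any RAID tool.

```python
def usable_capacity_tb(level: int, drives: int, drive_tb: float) -> float:
    """Usable capacity of a RAID set built from identical drives."""
    if level == 0:
        if drives < 2:
            raise ValueError("RAID 0 needs at least two drives")
        return drives * drive_tb  # striping only: all of the space is usable
    if level == 5:
        if drives < 3:
            raise ValueError("RAID 5 needs at least three drives")
        return (drives - 1) * drive_tb  # one drive's worth of space holds parity
    raise ValueError("this sketch only covers RAID 0 and RAID 5")


def survives_one_drive_failure(level: int) -> bool:
    """RAID 0 has no redundancy; RAID 5 tolerates the loss of a single drive."""
    return level == 5


print(usable_capacity_tb(0, 2, 4.0))  # two 4 TB drives, RAID 0
print(usable_capacity_tb(5, 4, 4.0))  # four 4 TB drives, RAID 5
```

With 4 TB drives, the two-drive RAID 0 gives 8 TB with no protection, while the four-drive RAID 5 gives 12 TB and keeps working if one drive dies.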
Tim Standing is the Chief Technology Officer for SoftRAID. This illustration is from his presentation yesterday illustrating that standard spinning media fails either when first turned on, or after several years of use. This is why RAIDs are so popular.
Losing data is a terrible feeling.
SSD prices continue to fall, though they have not yet equaled the price of spinning media. However, if you are doing multicam work, I strongly recommend you use an SSD RAID to store your multicam source files. SSDs are roughly three times faster than a spinning media system and they have no seek times – which makes them ideal for the simultaneous playback needs of multicam editing. They don’t hold as much, though, so think about storage capacity carefully before purchasing an SSD RAID.
PREVENTING DATA LOSS
SoftRAID recently announced a new application – currently in beta – called “SmartALEC”.
“SMART Alec works in the background, constantly monitoring & checking your disks, and warning you if any disks are faulty, or about to fail. With SMART Alec’s advanced warning, you’ll have plenty of time to replace a bad disk and keep your data safe.” (SmartALEC website)
The good news is that this is developed by SoftRAID, a company that’s been deeply involved with creating storage software for years. The even better news is that a version of SMART Alec will be available free from the Apple App store after launch in the next few months.
Sign up for the beta version here – I already have.
One of the buzzwords at the Conference was “object-oriented storage.” This began at the enterprise level (meaning it cost BIG bucks!) but is slowly migrating into The Cloud and local storage.
Traditionally, when we write data to a hard disk, we are storing a file into sectors on the hard drive. This works fine, but doesn’t scale easily. If we have hundreds of thousands or millions of files, file-based storage breaks down. And finding the right file out of millions becomes really challenging.
Lest you think that only studios need to worry about this many files, I was stunned to learn recently, that I’m storing over 1.7 million assets across my three computer systems. Almost two million files on about 90 TB of storage!
Wow…. I had no idea.
Object-oriented storage was invented to solve the problem of constantly needing more space, while still being able to track millions of files. The best analogy about how it works is when you drive your car up to a valet to park. He hands you a ticket, then parks your car. You don’t know and don’t care where your car is parked, as long as you get it back safely when you hand the valet your ticket at the end of your meeting.
Behind the scenes, there could be one parking garage or ten. They could even be building new parking garages and you wouldn’t care. Whenever you needed your car, you handed over your ticket and, like magic, your car would be delivered.
That’s the idea behind object-oriented storage. We stop caring where our files are stored. Instead, whenever we need them, we hand the operating system a “ticket” and our file shows up (or is saved).
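The valet analogy can be sketched in a few lines of Python. This is a toy that lives in memory, not how Caringo or any other vendor implements it; the class name ValetStore and its methods are invented for illustration.

```python
import uuid


class ValetStore:
    """Toy object store: callers keep a ticket and never learn where the data lives."""

    def __init__(self, garage_count: int = 2) -> None:
        self._garages = [dict() for _ in range(garage_count)]  # the "parking garages"
        self._location = {}  # ticket -> index of the garage holding the object

    def put(self, data: bytes) -> str:
        """Store the data wherever there is the most room and hand back a ticket."""
        ticket = uuid.uuid4().hex
        index = min(range(len(self._garages)), key=lambda i: len(self._garages[i]))
        self._garages[index][ticket] = data
        self._location[ticket] = index
        return ticket

    def get(self, ticket: str) -> bytes:
        """Redeem a ticket; the caller never sees which garage was used."""
        return self._garages[self._location[ticket]][ticket]

    def add_garage(self) -> None:
        """Grow the store; existing tickets keep working."""
        self._garages.append({})


store = ValetStore()
ticket = store.put(b"interview_cam_a")
store.add_garage()  # capacity grows behind the scenes
print(store.get(ticket))
```

The caller only ever holds the ticket, which is why the store can add capacity without anyone updating a file path.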
What I learned yesterday in talking with the team at Caringo is that object-oriented storage is infinitely scalable, with built-in support for access via the web. They were demonstrating it at their booth at the Conference. But, for the moment, it isn’t cheap.
Short-term, it is too expensive to implement on the desktop, though Symply – gosymply.com – is working hard to do exactly that. I am also seeing access to object-oriented storage migrate to the desktop with the announced-but-not-yet-shipping Cloud File Gateway from XenData.
TRACKING FILES, MACHINE LEARNING AND METADATA
We are generating too much stuff for us to keep track of everything in our heads. As the number of files we need to track explodes into the millions, the only way to manage them is via metadata and asset management systems. (And, as one speaker said at the conference yesterday, metadata does NOT mean making the file name longer.)
Philip Hodgetts reminded me tonight on The Buzz that the reason asset management software exists is to help us associate metadata with the media it describes. The problem is that I do NOT want to personally enter metadata for a million clips. Instead, Philip said, what we will see this year, and even more next year, is “machine learning” (also called “artificial intelligence”) being used to create metadata by recognizing the content in an image or the speech in an audio file.
For example, companies like Digital Heaven – SpeedScriber – and Digital Anarchy – Transcriptive – have released software that automatically converts audio files to text. Now, producers can search media clips by typing text into a search box. This information can then be linked to the original media file so that we can search for those clips that contain a particular piece of text or person in the image.
Sam Bogotch, CEO of Axle Video, told me yesterday at the Conference that when Axle was released in 2012 they were delighted that it tracked up to 30,000 assets. In contrast, their newest version, released at NAB this year, now supports over 2 million! In fact, one of their most popular new features is Axle AI.
“Based on axle Video’s radically simple browser user interface and a visual analysis and search engine licensed from Visual Atoms, a UK developer of deep learning software, axle ai helps video postproduction teams bypass the laborious task of entering detailed metadata about every scene. Instead, users simply select a frame from a video, or grab an image from any onscreen source including web pages. axle ai is then able to rapidly identify a ranked list of media clips, as well as the exact segments in those clips, whose contents most closely match the image.” (Axle press release announcing the product)
For the last two years, I’ve been working with the team at Axle Video to implement a working media asset management system. I’ve had all kinds of problems, but remain impressed with their willingness to give me a hand and improve their software. I take personal responsibility for getting them to support 2 million assets. That’s because I crashed their system trying to catalog the 1.7 million assets on my system.
If you are thinking of implementing a digital asset management solution (DAM), here are a few tips, based on my experience:
I’m still trying to figure out the best way to manage the assets I’ve got, because what I’m doing now isn’t working.
BACKUPS AND ARCHIVING – LOCAL
The most important thing to remember is that all hardware dies; generally, at the worst possible time. If you don’t have your data backed up, it’s your own darn fault.
RAIDs protect against data loss in the event a hard drive dies. But even a RAID won’t help you if you delete the wrong file by accident. That’s why you need backups.
LTO tape is the current leader for long-term backups and archiving. This technology, called LTO Ultrium, was developed by a consortium of three companies: HPE, IBM and Quantum. The current version of LTO technology is LTO-7, which holds 6 TB of data on one tape.
While the LTO drives are expensive – an LTO-7 drive costs from $3,000 – $5,000 – a tape cartridge (~$100) is much cheaper than constantly purchasing new hard drives. Essentially, once you have the drive, you create a virtually unlimited storage capacity simply by adding another tape at a cost of about 2 cents per gigabyte.
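The “2 cents per gigabyte” figure is easy to check. The Python sketch below uses the prices quoted above, decimal units (1 TB = 1,000 GB) and assumes every tape is filled to its 6 TB capacity. The 90 TB archive is the size of my own library mentioned earlier, and the $3,000 drive price is the low end of the quoted range.

```python
import math


def media_cost_cents_per_gb(cartridge_price_usd: float, cartridge_tb: float) -> float:
    """Cost of the tape itself, in cents per gigabyte (1 TB = 1,000 GB)."""
    return cartridge_price_usd * 100 / (cartridge_tb * 1000)


def total_cost_usd(archive_tb: float, drive_price_usd: float,
                   cartridge_price_usd: float, cartridge_tb: float) -> float:
    """One drive plus enough cartridges to hold the whole archive."""
    tapes = math.ceil(archive_tb / cartridge_tb)
    return drive_price_usd + tapes * cartridge_price_usd


print(round(media_cost_cents_per_gb(100, 6), 2))  # cents per GB for a $100, 6 TB tape
print(total_cost_usd(90, 3000, 100, 6))           # 90 TB archive with a $3,000 drive
```

A $100 cartridge works out to roughly 1.7 cents per gigabyte, which rounds to the 2 cents quoted. The drive is a one-time cost that matters less the more tapes you fill.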
NOTE: Two companies that sell LTO drives for the Mac are mLogic and TOLIS Group.
Last year, LTO added a new feature called LTFS (Linear Tape File System). This allows an LTO tape to mount to the desktop where you can drag and drop files to it. While this is convenient, Macintosh, Windows and the various flavors of Linux have problems with it. For instance, whenever you open a folder in the Mac Finder, the Finder writes a small, invisible file into that folder detailing where icons and other screen elements are located. With a hard disk, writing a small file is no problem. With tape, the tape drive needs to shuttle to the end of all recorded material, record the file, then shuttle back to the location you are viewing. This process is called “shoe-shining” because of the whooshing sound the tape makes rushing back and forth as you open different folders on the tape.
As long as you don’t try to view the contents of a tape on your desktop, you’ll be fine. But if you treat a tape drive like a hard disk, you’ll get very frustrated, very quickly.
There are a number of vendors that provide software for LTO drives, for example, Imagine Products. However, only TOLIS Group and Retrospect avoid using LTFS.
The other problem I have with LTO systems – aside from the cost of the drives – is that every two years new drive technology is released. This is clearly detailed on the LTO roadmap, but it also means that if I choose to store files on LTO, every ten years or so I need to purchase the latest LTO drive and migrate (copy) all my data from the old format to the new format. At 6-8 hours per tape, this will take a while.
This means that we need to budget for long-term archiving hardware and staff time in order to actively manage our library of archived media.
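To put a number on “a while,” here is a rough Python estimate of the staff time a migration takes. It assumes tapes are copied one at a time, that each one is full, and that the 6-8 hours per tape quoted above holds; the 90 TB archive is my own library size from earlier in this article.

```python
import math


def migration_hours(archive_tb: float, tape_tb: float,
                    hours_per_tape: tuple = (6, 8)) -> tuple:
    """Return (tape count, best-case hours, worst-case hours) for copying an archive."""
    tapes = math.ceil(archive_tb / tape_tb)
    low, high = hours_per_tape
    return tapes, tapes * low, tapes * high


tapes, best, worst = migration_hours(90, 6)
print(f"{tapes} tapes, {best} to {worst} hours of copying")
```

For 90 TB on 6 TB tapes, that is 15 tapes and 90 to 120 hours of copying, before counting the time spent loading, labeling and verifying tapes.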
THE HIDDEN TRAP OF TECHNOLOGICAL OBSOLESCENCE
As if trying to figure out what’s the “best” hardware to use for longer-term storage wasn’t bad enough, we also have to deal with the fact that programs, file formats, even codecs go out of style and are no longer supported.
The most recent example is Apple’s decision to end support for QuickTime on Windows. This is devastating to everyone creating media – regardless of platform – because one of the most ubiquitous media containers on the planet just died.
But, you only need to look at all the files stored on your hard disk that you can no longer open to realize that this is an on-going, critical problem. What’s the sense of storing your amazing project for 50 years if you can’t open it when you restore it?
The folks at Digital Bedrock have developed a system to track the file formats and codecs used by files that they archive. Then, when one of these is announced as “end-of-life,” Digital Bedrock will notify you so that you can respond while there’s time to work out a Plan B. (I’ll have more on them in a minute.)
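A very simplified version of this idea can be sketched in Python. This is not Digital Bedrock’s system. It only looks at file extensions, which say nothing about the codec inside a container, and the end-of-life list is a hypothetical one you would have to maintain yourself.

```python
from collections import Counter
from pathlib import PurePath


def formats_at_risk(file_names, end_of_life_extensions):
    """Count archived files whose extension appears on an end-of-life list."""
    counts = Counter(PurePath(name).suffix.lower() for name in file_names)
    return {ext: n for ext, n in counts.items() if ext in end_of_life_extensions}


# Hypothetical archive listing and a hypothetical end-of-life list.
archive = ["promo_final.MOV", "interview.mov", "broll.mp4", "notes.txt"]
print(formats_at_risk(archive, {".mov"}))
```

Even a crude tally like this tells you how many files you would need to convert if a format were retired.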
BACKUPS AND ARCHIVING – CLOUD
As I was chatting with the folks in the BackBlaze booth yesterday, they pitched me on the idea of backing up and archiving my data to The Cloud. The benefits are that it is infinitely scalable, I never need to buy more storage for backups, yet I still have access to all my files.
The price is based upon the amount of data stored on their servers and whether you are using them for backup or archiving. Their Business Backup service costs $50 per year and includes automatically scheduled backups, along with a variety of other services.
Or, if you need longer term storage, their B2 Cloud Storage costs $5 per month per terabyte, which is about 25% of the cost of Amazon storage.
NOTE: The four biggest public cloud storage vendors are Amazon, Microsoft, Google and IBM.
The big problem with Cloud storage is the speed of your Internet connection. For any of these storage services to make sense, you need a minimum upload speed of 10 Mbps. For many, that speed is easily achievable. However, for others, like me, it isn’t: even though I live in Los Angeles, my upload speed tops out at 1.3 Mbps. FAR too slow for any serious Cloud-based backup or archiving!
NOTE: A great way to test your Internet connection is using SpeedTest.net.
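Once you know your upload speed, a little arithmetic shows what it means in practice. The Python sketch below assumes decimal units (1 TB = 1,000,000,000,000 bytes), an uplink running flat out around the clock, and no protocol overhead, so real-world uploads will take longer.

```python
def upload_days(terabytes: float, upload_mbps: float) -> float:
    """Days needed to upload the given amount of data at a sustained upload speed."""
    bits = terabytes * 1e12 * 8           # bytes to bits
    seconds = bits / (upload_mbps * 1e6)  # megabits per second to bits per second
    return seconds / 86400                # seconds in a day


print(round(upload_days(1, 10), 1))   # 1 TB at 10 Mbps
print(round(upload_days(1, 1.3), 1))  # 1 TB at 1.3 Mbps
```

At 10 Mbps, a single terabyte takes a little over nine days. At 1.3 Mbps, it takes more than ten weeks.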
AN ARCHIVING HYBRID
Digital Bedrock has developed a hybrid approach combining Cloud access with LTO storage to create something very useful to smaller shops.
When you are ready to archive footage, they will send you a 60 TB hard disk. You process your files through their file logger/asset manager, copy them to the hard disk, then ship the hard disk back to them. This solves the problem of Internet access speeds.
Digital Bedrock then transfers the files to three copies of LTO-7 tape. One set is stored on the West Coast, one on the East Coast and one is sent back to you. Even in the event of a natural disaster, there is a high likelihood that at least one set of data will survive.
Another excellent protection is that at no time do your files go online, so there is no chance for them to be corrupted, hacked or stolen.
At the same time they create the LTO-7 tapes, they load your list of files, along with metadata tracking formats and codecs, into their web-based database. This allows you to see, search and manage the files you have in archive storage, but, as this is only metadata and not the files themselves, your source files are secure.
I took a tour of their facility recently, along with interviewing their CEO, and I’m impressed with the system they’ve developed. I’m especially impressed with how well they’ve thought out the issue of security, yet made the entire process affordable for smaller shops and individuals.
With almost daily reports of hacks, malware, stolen user names and ransomware, these are dangerous times.
NOTE: If you want to get depressed, take a look at this list of companies that have been hacked recently. It numbers in the thousands…!
The best way to protect yourself is to avoid connecting critical systems to the Internet. But, these days, that’s not possible for just about all of us.
I am still leery of storing sensitive files on a remote server in The Cloud. I like the accessibility, but I don’t like placing blind trust in the quality of an unknown Cloud vendor’s IT staff. (This probably comes from watching Jurassic Park at an impressionable age.) I use the Cloud daily for training files that I distribute. But, I don’t use the Cloud for anything that hasn’t been released.
As I talk with vendors and industry experts, I get a strong impression of a media industry in both crisis and transition. Our need for storage is growing far faster than technology is able to supply products to meet the need. In fact, like the weather, if you wait a minute, things will change.
At the non-enterprise level, there isn’t a single unified solution with consistent standards, smooth interoperability, strong security and reasonable price. Instead, we are forced to create our own custom storage, backup, and archiving solutions from a rapidly evolving collection of standards, vendors and products.
There are many different solutions out there; however, no one solution will work for everyone. The problem is that we cannot risk our data by waiting. So, here’s my current thinking:
But I don’t. So I’m still trying to figure out what needs to be stored where. As always, I’m interested in your comments.
7 Responses to What a Mess! The Current State of Storage
Thanks for the fascinating piece about storage; I identify with a lot of it. But there is one element that you did not address relating to long-term storage: what’s the point? I recently retired after 30 years of running a company that produced long-form television documentaries, so our first media was 16mm film and our last was 4k Red files, with everything else in between. We stored final air tapes, original material, DVDs for non-broadcast use and so on. The introduction of HD made the older material useless anyway. When I retired, I had two duplicate sets of hard drives made of everything I wanted saved, together with master air tapes, took one set to my new home in Vermont and left the other with my longtime editor. Over the last two years, I did dip into those hard drives for some footage that someone wanted to buy, and again for a small personal project, but all the rest of it just turned into something taking up space, a lot of space. I didn’t care about the clients any more, and I used DVDs if I ever wanted to look at a film for sentimental reasons, or to give a copy to someone else. A lot of my work has been stolen and uploaded to YouTube, so it’s there if I want it. So now I look at all this stuff and think “Who cares?”. When you are talking long term storage, you really need to think long term, which means past your own sell-by date…
Thanks for your comments.
I don’t disagree — but… The KEY point is that you had the media stored so that you COULD do something with it if you wanted to. My concern is that many younger filmmakers don’t think about long-term storage until they need something and discover that they no longer have it.
Media is easy to trash; it is very, very difficult to restore.
Good Morning and thank you for yet another timely and helpful article Larry.
I work as a Full Time Staff Editor and my employer has both Server and LTO backup policies in place. At home, however, it is another story. I have a PC with a Hybrid C Drive and a WD Black 2TB Drive for my media. The PC has nothing but personal files on it as I do not work from home. I do however subscribe to the Adobe Suite, have Media Composer 8.3 and Sony Vegas 13 installed to “play” with.
I too am looking for a method to backup and archive my files. I purchased an 8TB WD external drive and duplicates of my internal drives for cloning. My last backup was in April using Acronis True Image. Thank goodness I made that backup as I recently had a major event.
When Adobe released Premiere 2017.1, there was an issue that when you moved the Media Cache File folder from its default location, as I did to my Media Drive, it would delete any and all media files from that drive without warning. This is what happened to me and many others. Every photo, movie and audio file was wiped from my drive leaving the folders intact. Since then, Adobe has released an update (2017.1.1) to correct this issue and to their credit, Adobe is not only making sure this never happens again, but they are also reaching out to their users.
So, I went to my Acronis software which I have been using for a number of years and went to restore the drive. For some reason the software would not see the backup?! After some internet searching, I found that I am able to open the archive itself and selectively restore what I want. Without the use of the software as a front end, this is very tedious and time consuming. In fairness to Acronis, I have not yet contacted their support.
Thanks to your article, I am now aware of other options and I have much to consider.
The bottom line is that everyone needs to be prepared and always expect the worst. Whether it’s a drive failure, a CPU failure, an electrical surge, an act of God or even your favorite Editing software messing with your files, backup and backup again.
Thanks again Larry!
This is a VERY scary story – which we can all learn from. I hope you are able to find and restore all your data.
Thanks for writing.
Thanks for the write-up Larry. Here’s my/our current strategy where I work per our money resources:
-Offload cards content to desktop RAID 10 storage for editorial
-Media manage and archive projects to TWO bare SATA drives using a drive dock (clones of each other)
-Sync drives with Decimus Synk (looking for new app since Decimus seems to be no more)
-Catalogue drives with NeoFinder
-Once drives are full, one last cataloging, put drives in silicone sleeves
-Keep one drive local, off-site the other
-Add a reminder to Calendar.app to spin the drive(s) up no later than a year from the final cataloguing.
So we buy drives in pairs and fill them up progressively with our shows, shuffling data between archive drives as needed. We typically buy WD green drives since these drives don’t need to be high performance for editorial; only for unarchiving older files.
Our NeoFinder Database is also on a RAID that is backed up externally as well to another location.
I just re-archived everything I had that was stored on old FW400 and USB2 drives to newer USB3 drives. While not as drastic a change in storage space from my conversion of 3/4 inch tapes to 8mbps DVDs a few years ago, I went from a shelf of drives to a single pocket drive. That was nice. Then I can duplicate that drive to another brand of USB3 (making sure that the physical drive in the box is from another manufacturer). I plan to do the same with a shelf of high performance RAIDs I have that are just sitting there on another shelf filled with client work that hasn’t been touched for years. This has given me a chance to review the contents of these drives and assess what I really need to keep. Lots of useless stuff builds up over the years. I also find the master edit movie files and duplicate those to another place so if the raw footage disappears, at least my finished work is saved.
The fact is that all media ages and not always in predictable ways. A five year old drive dies suddenly, a twenty year old drive still works fine and might for another twenty years. Tape seems dependable but a slight change in manufacturer composition can lead to a cassette filled with sticky, rubbery tape instead of a dependable source of material. I know this from archiving 3/4 inch tape and 1 inch. I’ve had better luck with old VHS tapes than 3/4. The same with film stocks, some 90 year old film stocks are in better shape now than film stocks from the 1970’s. The good thing now is that digital backups are cheaper and easier to make than ever if you already have a digital file. Having a rough backup plan is the best option. Maybe every five years transfer everything just to be sure. Who knows what we’ll get next?
Want to echo some of Robert Gardner’s comments about ‘relax’ contained in this thread. Speaking from experience as a 58 year old audio, film and video producer who transited successfully through film, tape (2″/1″/betacam/linear digital) to Adobe Premiere, and from 2″ analog multi-track audio through 10 generations of ProTools, I can tell you one thing for sure. As you get nearer the end of your own career things get clearer. The (small) body of work you REALLY want to preserve you manage to preserve (and sometimes preserve again and again as technology progresses). 🙂 On the other hand, many projects simply become irrelevant because they weren’t that significant in the first place. Finally, you begin to see that the “big project” you did back in 1982 maybe wasn’t so significant after all? 🙂 One tape I own brings a smile to my face. It is a DAT simply labeled “keep forever”. No idea what’s on it, but I keep it as a reminder that archiving is more than a label “keep forever”.
My best advice is be smart about what’s obvious and relax about the rest.