A Primer on Media Archiving

This week, I received two emails that triggered this article.

Larry G. wrote:

I have a lot of historic and unique video footage on the United Nations. Tons of it in fact . . . . (A garage full.) It includes, hundreds of VHS, SVHS, 8mm, miniDV tapes.

My question is what is the best media to store/archive all that footage?

Then, Sister Anne R. wrote:

I am working with FCP 7 to rescue the extraordinary content (but visually poor quality) we have on VHS…. This material is crucial to the ongoing life of our Benedictine community….

My question is, if I update Final Cut Pro7 now, will I run into trouble with the videos I have already completed? Would it be better if I just continued to work without updating it?

Sadly, there is no perfect – or long-term – answer when it comes to archiving. In fact, our industry seems determined to make this as challenging and difficult as possible. Companies that are eager to invent the future are notoriously reluctant to help us retain the past.

There are three separate issues here:

  1. What video codec (format) should you use to capture your assets?
  2. What hardware medium should you use to store it on?
  3. How do you keep track of the assets you’ve archived?

PICKING A VIDEO CODEC

To answer Sister Anne’s question: The video editing software you use to capture files doesn’t really matter. What does matter is the codec you use to capture these files into. So, Final Cut Pro 7 is fine for working with other video formats. For that matter, upgrading won’t hurt, because it isn’t the editor, it is the codec that matters.

DEFINITION: “Codec,” short for Compressor/DECompressor, is the mathematical algorithm that converts the light and sound of real life into numbers that can be stored by the computer. Different codecs are designed for different things, such as creating small files, high-quality images, portability between devices, and so on.

Of the three issues, picking the “best” video codec is the hardest, because codecs fall into and out of favor quickly, and many of them are proprietary to specific companies. This means that if the company that developed the codec decides it no longer wants to support that codec, then you can’t access your media.

This brings me to Archiving Rule 1: Assets cannot be captured, then ignored. Always assume that any assets you want to preserve for the future will need to be transcoded into a new codec every few years to keep them current with technology.

DEFINITION: “Transcode” means to convert from one format to another. Transcoding does not necessarily damage image quality, depending upon the codec involved. Compression is a form of transcoding which also reduces the size of the compressed file.

While I’m thinking about it, here’s Archiving Rule 2: Never rely on archived project files being accessible in order to restore a project. Media files are easier to archive than project files, because media codecs change slowly over time, while project files can break between upgrades from the same developer.

And, as long as we are getting depressed, here’s Archiving Rule 3: There is no guarantee that hardware that is popular today will work tomorrow. Just by way of examples, think about these highly popular older technologies:

While it might make sense to archive media in the native format the camera shot – DV, HDV, AVCHD – for assets destined for long-term archiving, transcoding into a more popular format probably makes more sense.

There are two-and-a-half basic types of codecs:

Lossy codecs include:

Lossless codecs include:

Visually loss-less codecs include:

CODEC RECOMMENDATIONS

Currently, for video projects on the Mac, I recommend Apple ProRes 422 HQ. Apple isn’t going anywhere quickly, they’ve licensed ProRes to multiple vendors, it is widely supported in the industry and even vendors that don’t like Apple acknowledge that they did an excellent job creating ProRes.

However, ProRes isn’t perfect:

For Windows users, or Macintosh archivists wanting a more universal solution, the situation is murkier. Here, there are two possible options:

Both codecs, like ProRes, offer 10-bit, intra-frame, visually loss-less compression, with reasonable file sizes and color quality that matches what the camera shot (i.e. 4:2:2 chroma subsampling).

However, both are proprietary, both are only visually loss-less, and both will cause image quality degradation when transcoding a RAW file into this format.

NOTE: When archiving, expect file sizes to be large. The smaller the file size the more data you are throwing away during compression, which is exactly what you DON’T want to do when creating assets to be preserved over the long-term.
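To get a feel for how large, here is a rough sketch in Python that converts a video bit rate into storage per hour. The bit rates in the table are approximate figures I am assuming for illustration (they are not taken from this article, and they count video only), so check your codec's published data rates before budgeting storage.

```python
def storage_gb(bit_rate_mbps: float, hours: float) -> float:
    """Return decimal gigabytes needed for `hours` of media at `bit_rate_mbps`."""
    seconds = hours * 3600
    total_bits = bit_rate_mbps * 1_000_000 * seconds
    return total_bits / 8 / 1_000_000_000  # bits -> bytes -> gigabytes


# ASSUMPTION: approximate video-only bit rates in megabits per second.
ASSUMED_RATES_MBPS = {
    "DV (standard definition)": 25,
    "ProRes 422 HQ (standard definition, approx.)": 63,
    "ProRes 422 HQ (1080p at 29.97 fps, approx.)": 220,
}

for name, rate in ASSUMED_RATES_MBPS.items():
    print(f"{name}: about {storage_gb(rate, 1):.1f} GB per hour")
```

The point is the ratio, not the exact numbers: the higher-quality codec needs several times the space of DV for the same hour of footage.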

TRANSCODING SUGGESTIONS

Always transcode into a video format that generates large files with high bit rates. Reducing file size generally damages image or audio quality, and it is faster and cheaper to buy more storage space than to try to recover data lost due to using the wrong codec.

Capturing standard-definition media from 8mm film, VHS or DV tapes into DV is OK. Capturing the same files directly into ProRes 422 HQ is better, because ProRes provides potentially higher quality than DV.

In terms of image quality:

  1. Capturing standard definition video via SDI connections is best.
  2. Capturing standard definition video via component connections is second.
  3. Capturing standard definition video via S-video connections is third.
  4. Capturing standard definition video via composite (RCA) connections is fourth.

For VHS tape, using a TBC (Time-Base Corrector) makes a BIG difference in video quality in terms of color and image stability. Many high-end VHS decks that you can rent include a TBC. It is worth the money.

Always transcode audio using a sample rate of at least 48,000 Hz (48 kHz) with a 16-bit depth; 24- or 32-bit depth is better.
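If you are transcoding files that have already been captured, one way to apply these suggestions is with the free command-line tool ffmpeg. This is my assumption of a workable tool, not something this article tested. The Python sketch below only builds and prints the command; it does not run it. The file names are hypothetical, and it assumes your ffmpeg build includes the prores_ks encoder.

```python
import shlex
from typing import List


def build_ffmpeg_command(source: str, destination: str) -> List[str]:
    """Build an ffmpeg command for ProRes 422 HQ video with 48 kHz, 24-bit audio."""
    return [
        "ffmpeg",
        "-i", source,
        "-c:v", "prores_ks",       # ffmpeg's ProRes encoder
        "-profile:v", "3",         # profile 3 selects ProRes 422 HQ
        "-pix_fmt", "yuv422p10le", # 10-bit, 4:2:2 chroma subsampling
        "-c:a", "pcm_s24le",       # uncompressed 24-bit audio
        "-ar", "48000",            # 48 kHz sample rate
        destination,
    ]


# Hypothetical file names, for illustration only.
command = build_ffmpeg_command("tape_001.dv", "tape_001_prores.mov")
print(shlex.join(command))
```

Printing the command first lets you review it before committing hours of transcoding time to it.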

THE ANALOGY OF THE BATHTUB

Imagine that you’ve dipped your five-gallon (or liter, for you non-US folks) bucket into the river of life and captured five gallons of beautiful images.

When you compress those five gallons of media into something small enough for the web, you are pouring the contents of that bucket into a 1-quart measuring cup. Water is spilling out all over, becoming irretrievably lost. The file is smaller and representative of the original five-gallon contents, but much that was captured was lost.

This is essentially what happens when you compress video with a lossy codec to reduce the file size for the web. Data is permanently lost and can’t be replaced at some point in the future.

Now, imagine that you are pouring that five-gallon bucket into a bathtub. The bathtub is so much bigger than the five-gallon bucket that every drop of water gets moved from the bucket to the tub.

This is essentially what happens when you transcode video from a more compressed format into a loss-less or visually loss-less format. The codec provides plenty of room to store all your data without losing anything.
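The bucket, cup and bathtub can be imitated with a few numbers. This Python sketch is a deliberate simplification: real codecs do far more than change bit depth, and the sample values are made up. It only shows that moving data into a bigger container and back loses nothing, while squeezing it into a smaller one does.

```python
def requantize(samples, from_bits, to_bits):
    """Rescale integer samples from one bit depth to another by bit shifting."""
    if to_bits >= from_bits:
        return [s << (to_bits - from_bits) for s in samples]
    return [s >> (from_bits - to_bits) for s in samples]


original = [0, 37, 128, 200, 255]  # hypothetical 8-bit pixel values

# Bucket into the bathtub: 8-bit to 10-bit, then back to 8-bit.
bathtub = requantize(requantize(original, 8, 10), 10, 8)

# Bucket into the measuring cup: 8-bit to 4-bit, then back to 8-bit.
cup = requantize(requantize(original, 8, 4), 4, 8)

print("Bathtub round trip matches original:", bathtub == original)
print("Measuring cup round trip matches original:", cup == original)
print("What came back from the cup:", cup)
```

Once the low-order detail is discarded on the way into the cup, no later conversion can bring it back.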

WHAT ABOUT…?

Not all camera formats are compressed. All RAW formats are stored in a very large, very expansive codec that is perfectly fine for archiving — provided that that RAW format will be supported into the future. This would include cameras from:

The same is true for cameras and digital recorders that record directly into a visually loss-less format like ProRes 422 HQ or ProRes 4444 XQ.

The key issue is whether that RAW format is a temporary flash-in-the-pan or something that’s going to hang around for a while. And the real difficulty is that we don’t know the answer until it is almost too late.

Which brings me back to Archiving Rule 1: Assets cannot be captured, then ignored. Always make sure that whatever technology you are thinking of adopting next does not prevent you from accessing the files you created yesterday.

ARCHIVING HARDWARE

The basic problem with picking the right hardware for archiving is that cheap solutions don’t last long and hardware that lasts long isn’t cheap. (Like I said, technology hardware companies are not making archiving very easy.)

Starting with the least reliable, here are your options:

NOTE: XD-CAM video recorded to a cartridge is, for the purposes of archiving, recorded to a Blu-ray Disc.

LTO is used virtually everywhere in the corporate world. Linear Tape-Open Technology (LTO) is a tape-based data storage solution designed in an “open” format technology that allows manufacturing by any vendor that wishes to license the technology. The market for devices is large and growing. Multiple companies provide both hardware and software options, including IBM, Quantum and HP. It is the most viable option when you need to back up terabytes, or even petabytes, of data. LTO tapes are generally considered to last as long as video tape – about 25 years.

NOTE: Here’s a link to Ultrium, the LTO organizing body.

However, the LTO spec, which is public, calls for new hardware every 18 months or so. The current version is LTO-6. The way that LTO works is that LTO-6 drives can read and write LTO-5 and LTO-6 tapes, but only read LTO-4. This means that you’ll probably need to convert your tapes every ten years or so to stay current with LTO technology.

NOTE: LTO-7 is on the LTO roadmap, but isn’t expected until 2016 at the earliest.

This means that every three generations of LTO hardware, you’ll need to move your data from the older format to the newer format. This involves both purchasing new LTO hardware and new tapes to transfer your data.
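The compatibility rule described above (a drive writes its own generation and one back, and reads two back) can be sketched in a few lines of Python. The function names are mine, and the rule is only the one stated here for drives like LTO-6; future generations may not follow it, so confirm against the specifications of any drive you plan to buy.

```python
def can_read(drive_generation: int, tape_generation: int) -> bool:
    """A drive reads its own generation and the two generations before it."""
    return drive_generation - 2 <= tape_generation <= drive_generation


def can_write(drive_generation: int, tape_generation: int) -> bool:
    """A drive writes its own generation and the one generation before it."""
    return drive_generation - 1 <= tape_generation <= drive_generation


def newest_drive_that_reads(tape_generation: int) -> int:
    """Under this rule, the last drive generation able to read a given tape."""
    return tape_generation + 2


for tape in range(3, 7):
    print(
        f"LTO-6 drive with LTO-{tape} tape: "
        f"read={can_read(6, tape)}, write={can_write(6, tape)}"
    )
```

Under this rule, an LTO-4 tape stops being readable once you move past LTO-6 drives, which is why the migration has to be planned rather than postponed.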

NOTE: When it comes to tape, there are also the Sun/Oracle T10000-series tape drives. However, the T10000C starts at $25k, which puts it beyond reach for most of us.

HARDWARE RECOMMENDATION

For companies with a budget for archiving, LTO makes the most sense. For individuals or organizations with limited budgets, well, you are between a rock and a hard place. Formats like optical media don’t store enough and hard disks don’t last long enough.

For those that can afford it, I strongly recommend LTO tape. Each LTO-6 tape stores about 2.5 TB of data, with speeds fast enough to equal a single hard drive.
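To estimate how many tapes a library needs, divide its size by the tape capacity and round up. This Python sketch uses the 2.5 TB figure above; the library sizes and the number of copies are hypothetical, and it ignores any space lost to file system overhead.

```python
import math


def tapes_needed(total_tb: float, tape_capacity_tb: float = 2.5, copies: int = 1) -> int:
    """Return how many tapes are needed to hold `copies` full sets of the archive."""
    tapes_per_copy = math.ceil(total_tb / tape_capacity_tb)
    return tapes_per_copy * copies


# Hypothetical archive sizes in terabytes, each stored as two copies.
for size_tb in (6, 10, 40):
    print(f"{size_tb} TB, two copies: {tapes_needed(size_tb, copies=2)} tapes")
```

Rounding up per copy matters, because a partly filled tape still has to be bought.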

For those that can’t afford the latest version of LTO, look into LTO-5. These older models still work, they are just slower and don’t hold as much as LTO-6. And they often cost a thousand dollars less, while still providing data access and longevity.

When purchasing LTO drives, consider manufacturers like:

Tapes recorded by drives from one manufacturer should be compatible with drives from other manufacturers.

NOTE: An older tape-based technology is DAT. This is, for all intents and purposes, a dead format. Don’t consider it.

TRACKING YOUR ASSETS

The hidden “gotcha” in archiving is “how do you track your assets once you’ve archived them?” In almost all cases, this means you need to consider purchasing asset management software; and explaining how that works is a separate article in itself.
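At the very smallest scale, tracking can start with a plain list of what went onto each tape. This Python sketch is not a substitute for asset management software; it is only an illustration of the idea, and the tape label, folder and column names are all hypothetical. It records each file's path, size and a SHA-256 checksum so a restored file can later be compared against the original.

```python
import csv
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute a file's SHA-256 checksum, reading in chunks to handle large media."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(archive_root, tape_label):
    """List every file under `archive_root` with its size and checksum."""
    root = Path(archive_root)
    rows = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rows.append({
                "tape_label": tape_label,
                "relative_path": path.relative_to(root).as_posix(),
                "size_bytes": path.stat().st_size,
                "sha256": sha256_of(path),
            })
    return rows


def write_manifest(rows, csv_path):
    """Save the manifest as a CSV file that any spreadsheet can open."""
    fields = ["tape_label", "relative_path", "size_bytes", "sha256"]
    with open(csv_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fields)
        writer.writeheader()
        writer.writerows(rows)
```

For example, calling build_manifest on the folder you are about to write to a tape, then write_manifest to save the result, gives you a searchable record to store somewhere other than the tape itself.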

Companies that design products for the smaller budget include:

SUMMARY

It would be great if there were a single, inexpensive solution that would reliably store our media for the long term. But… there just isn’t.

As with many things in life, you need to balance the size of your budget with the size of your assets and figure out the system that comes the closest to meeting your needs.

Just keep in mind, this is absolutely a situation of penny-wise, pound-foolish. Saving money on inexpensive solutions is no consolation when you wake up one morning and discover you can no longer access your media.

Also, and this is equally important, ALWAYS keep multiple copies of essential media. “Everything put together falls apart.” Protect yourself. Plan for disaster. Keep multiple copies in multiple places. Or, if you decide not to keep multiple copies in order to save time and storage space, remember that this, too, is a decision.

As always, I’m interested in your comments.



9 Responses to A Primer on Media Archiving

  1. Gloria Messer says:

    Larry, I am wondering if eons and eons ago we had advanced to the stage we are at now, and there simply is not any record of having done this before????
    What do you think.
    xxo glo

  2. Oren says:

    Larry, you mention that spinning hard drives only last 4-5 years, but what if you archive onto spinning RAIDs with redundancy, so that hard drives are replaced as they fail, AND you have an offset backup of said RAID in case the RAID controller itself fails or some other catastrophe. Wouldn’t this preserve your media indefinitely, as long as you kept your RAIDs up to date?

  3. Richard Hale says:

    Correction. M-disc are advertised to last 1000 years not 100 years as mentioned. The “M” in the name stands for one thousand. As an M-disc user I suggest everyone do a bit more research and realize what a blessing they are. They currently have 3 versions: DVD 4.7 gigabytes, BD 25 gigabytes, and BD 100 gigabytes. I suggest you have a rep from the company on your Buzz.

  4. Richard Day says:

    Always an important topic! I have a few terabytes of media and have been keeping them on (spinning) drives with a backup of each.

    It just occurred to me that doing a “Smart Backup” (using SuperDuper) would not, during the backup, rewrite files that have not changed.

    So I should at the bare minimum have one extra drive and rotate my backups at intervals. For example, I would do a completely fresh backup on the spare drive, which then becomes the backup drive. Then the former backup drive becomes the spare drive, I would move onto the next backup, and so on.

    That way, at regular intervals, every drive is refreshed.

  5. Bill Rabkin says:

    I’m prepared to make an initial investment in LTO technology, but my current computer is a late-2012 iMac with Thunderbolt ports. How do I attach an LTO-5 or LTO-6 external drive to this iMac?
