[Updated 5/13/2018 to more accurately reflect upcoming changes from Apple and add a comment from Adobe.]
My son, Paul, is a digital archivist with ten years’ experience working with large organizations like the IMF, NATO and, now, Brown University. (Brown is an Ivy League school, as are Harvard, Princeton and Yale, among others.)
Over the years, Paul and I have talked about the challenges of preserving our digital past. Recently, he had an experience at Brown that was such a scary preview of our future that I asked him to document it for me to share with you.
As you will read, we are on the edge of losing decades of our digital history and, unless we take action now – as individuals and developers – within a few years most media assets from the 1960’s onward will be lost. The problem is far more serious than I thought.
I’ll have more to say at the end of Paul’s report. As you read this, put yourself in Paul’s shoes, except think about trying to resurrect your own assets 20 years in the future. This isn’t “tomorrow’s issue” – it is an issue for every media creator right now.
Digital Preservation Case Study: WBRU Recordings
By Paul Jordan
Assistant University Archivist
Brown University
Project Overview
Brown’s student-run radio station, WBRU, is one of the country’s oldest student radio stations. Recently, the Brown University Archives approached the radio station regarding historical recordings in the station’s possession. After an appraisal trip by Archives staff that involved opening file cabinets that had been closed for years, sixteen linear feet of recordings were gathered and donated.
The recordings spanned the past several decades of the station’s existence, from the 1970’s up to the mid-2000’s, and were on a variety of subjects. The two largest groups were in-studio interviews of bands by student DJ’s and the weekly and daily news programs produced by the student staff.
As to be expected from a collection of recordings from a professional-grade studio over such a long timespan, the recordings were on a large variety of media. Included were:
As the University Library has an existing procedure (and budget line) for reel digitization, Archives staff focused on the other formats. The more than 200 CD’s were assessed as being at the greatest risk, as they were from 1997-2000 and 2004-2007. All of them were consumer-grade burnable CD’s, which are generally considered to have a five-year lifespan before needing replacement. The Assistant University Archivist therefore began efforts to transfer the contents of the CD’s onto the Brown network for long-term preservation.
Initial Preservation Efforts
The first preservation attempt was to rip the CD’s using standard software. As the Assistant University Archivist’s computer (a newly-bought Dell) did not have an internal CD drive, an external CD drive was acquired from the Library’s IT department.
Of the slightly more than 200 CD’s in the collection, all but 71 could be read as audio CD’s. All of the CD’s were of student news reporting. WBRU had two main news shows: The Point, which was for breaking news and general Rhode Island news, and The Pulse, which was news focused on people of color. The CD’s contained finished works, and had been burned as audio CD’s for the use of the station’s DJ’s. Windows Media Player was used to rip them into mp3 format.
However, all of the 1997 and 1998 CD’s, and about a third of the 1999 CD’s (71 CD’s in total), were not readable by the Dell machine and were initially feared to be simply non-functional.
Further Investigation
While it was possible that all of the CD’s that could not be read had degraded beyond readability, Archives staff determined to investigate further. The Assistant University Archivist took the CD’s, intending to try reading them with his personal computer, a 2013 iMac. It was hoped that a different CD drive and different operating system would be able to read the CD’s.
When the first CD mounted without difficulty, the truth of the situation was quickly determined. The CD’s were not audio CD’s, but were instead Macintosh data CD’s, created on the station’s Macintosh computers. Their format explained why they could not be read by the Windows machines in the Library.
The Assistant University Archivist was able to copy the data from all of the CD’s (save for four that were either entirely or partially unreadable) using the iMac and an Apple external CD drive.
Unfortunately, the CD’s did not contain audio files of finished work product like the audio CD’s. Fortunately, one of the CD’s was a backup of WBRU’s production computer hard drive from 1998, which contained all of the software that the station used. Based on the software, and the files contained in the data CD’s, the Assistant University Archivist was able to make some informed guesses about WBRU’s work practice in 1997-1998.
The station used Pro Tools version 4.0.1 for both recording audio segments in the studio and editing them into the final show for broadcast. Pro Tools was likely also used as the recording program for call-ins (either telephone interviews or transmission of edited pieces done remotely during election coverage). Each session was contained in a folder that provided its type (e.g. The Point, The Pulse, 90-Second Wraps) and the date. The sessions themselves usually included the date and a brief description of the content, but were sometimes named randomly by the student (e.g. Education, Police Brutality, Zach’s Fun Place).
Each session folder had the same format: the Pro Tools session file and a folder labeled Audio Files. The Audio Files folder contained all of the raw recordings and interviews. Unfortunately, the audio files had almost without exception not been relabeled by the editor, and had the default file naming convention (e.g. Audio 1-01, Audio 3-04) assigned by Pro Tools during each recording session.
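The naming problem described above lends itself to a quick inventory. What follows is only a minimal sketch in Python, not the method used on the project: the function name `split_by_naming` is hypothetical, the pattern is inferred solely from the two example names quoted above, and treating the first number as a group key is an assumption, since the report does not say what the two numbers mean.

```python
import re
from collections import defaultdict

# Pattern inferred from the examples "Audio 1-01" and "Audio 3-04".
# An optional file extension is tolerated; this is an assumption.
DEFAULT_NAME = re.compile(r"^Audio (\d+)-(\d+)(?:\.\w+)?$")

def split_by_naming(names):
    """Separate default-named files from files someone renamed.

    Returns (groups, renamed), where groups maps the first number in
    the default name to a sorted list of the second numbers.
    """
    groups = defaultdict(list)
    renamed = []
    for name in names:
        match = DEFAULT_NAME.match(name)
        if match:
            groups[int(match.group(1))].append(int(match.group(2)))
        else:
            renamed.append(name)
    return {key: sorted(values) for key, values in groups.items()}, renamed

# Hypothetical file listing for illustration only.
groups, renamed = split_by_naming(
    ["Audio 1-01", "Audio 3-04", "Audio 1-02", "Mayor interview"]
)
print(groups)
print(renamed)
```

A listing like this shows at a glance how few files carry a descriptive name, which is why the session file, and not the audio itself, held the only record of how the pieces fit together.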
The raw audio files could be played by modern computer audio players, but the Pro Tools session file could not be opened. That meant that the original edit was “lost,” leaving Archives staff with two untenable options: either make the raw audio, with its periods of dead air, numerous takes with mis-speaking, and full, unedited interviews, available with a note that the original edit was unknown, or not accept any of the files into the Archive and thereby lose more than two years of student broadcasts.
MP3 Conversion
As neither option was acceptable, the Assistant University Archivist attempted to recover the original edit and export a complete version to a more widely-used format. This effort was ultimately successful, though only thanks to some unlikely events.
The first and most important was that the Assistant University Archivist had been a Mac user for more than two decades, and still retained a PowerBook G4 from 2003 that had not been upgraded beyond OS 10.3. That choice had been made deliberately at the time so as not to lose access to either the Classic emulation environment or the ability to boot directly into OS 9. And while that laptop had been replaced by several other, more modern, computers, it had been retained along with its peripherals and was still in working condition.
The second was that the hard drive backup CD had confirmed the program and version of the program needed to read the edit file. An initial attempt of installing the Pro Tools 4 instance from the CD onto the laptop failed, as Pro Tools required that the installation floppy disks be inserted for verification. That was obviously not possible.
After a bit of searching, a copy of Pro Tools 5 Free was found and successfully installed, courtesy of the website www.macintoshgarden.com. Pro Tools 5 was able to open and read the Pro Tools 4 files used by WBRU.
Finally, the Brown University Library contained a copy of Pro Tools 5 Visual Quickstart Guide, which allowed the Assistant University Archivist to learn the parts of Pro Tools required. Though the archivist had experience with video editing and had used an earlier audio editing program (Macromedia’s SoundEdit 16, version 2), Pro Tools was unknown. In particular, the command to export and convert to mp3 was called “Bounce to Disk,” which was completely unintelligible to someone without Pro Tools experience.
A first attempt to export to mp3, running Pro Tools 5 through emulation in OS 10.3, failed. Every time a session file was opened, an error message stating “DAE error -6001 was encountered” stopped the file from opening. Some internet research suggested that there was some kind of driver problem but did not suggest a solution. Rebooting the laptop into OS 9 and launching Pro Tools directly solved the problem; the error messages still appeared but this time could be closed without issue to show the mix and Pro Tools interface.
An Arbitrary Deadline
The jury-rigged conversion set-up worked. The Assistant University Archivist was able to open the Pro Tools 4 session files and view the original edit in the same condition it had been when the WBRU students had closed the file twenty years earlier. Using the Bounce to Disk option, the sessions could be exported to mp3. As the Archives was only interested in maintaining the final product, the loss of the edits was considered acceptable.
Not every session was able to be exported. In some instances the radio staff did not include the Audio Files folder, or left out key pieces of audio. Some edits did not appear to ever have been finished, while a few were in pieces but identifiable enough that the Assistant University Archivist felt justified in finishing the edit so that it could be exported.
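Incomplete sessions of this kind can be flagged before any export is attempted. The following is a minimal sketch in Python, assuming the layout described in this report (one folder per session, each holding a folder labeled Audio Files). The function name `find_incomplete_sessions` and the root path are hypothetical, and the sketch does not look for the session file itself because the report does not give its name or extension.

```python
from pathlib import Path

AUDIO_FOLDER = "Audio Files"  # folder label given in the report

def find_incomplete_sessions(root):
    """Return names of session folders whose Audio Files folder
    is missing or empty, in sorted order."""
    incomplete = []
    session_folders = sorted(p for p in Path(root).iterdir() if p.is_dir())
    for folder in session_folders:
        audio = folder / AUDIO_FOLDER
        # A session cannot be exported without its raw audio.
        if not audio.is_dir() or not any(audio.iterdir()):
            incomplete.append(folder.name)
    return incomplete

if __name__ == "__main__":
    # Hypothetical path to the files copied from the data CD's.
    copied = Path("wbru_copies")
    if copied.is_dir():
        for name in find_incomplete_sessions(copied):
            print("Missing or empty Audio Files folder:", name)
```

A check like this only catches a missing or empty folder. A session that lacks just a few key pieces of audio would still have to be found by opening it.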
Unfortunately, Pro Tools 5 Free’s mp3 exporter came with a built-in 30-day limit. The existence of the limit was not known until a test mp3 export was done, and by then it was too late. The clock was running.
The first time the exporter was used each day, a dialogue box appeared giving the remaining days in the trial period. It also advised that an unlimited license could be bought by going to the DigiDesign website. As the DigiDesign brand has long since been folded into Avid (who acquired the company in 1995), and the current version of Pro Tools is Version 12, it was considered very unlikely that such a license could be procured.
Therefore, in order to avoid having to attempt workarounds such as removal of preference files and reinstallation of the software, the Assistant University Archivist attempted to export all of the WBRU audio before the arbitrary 30-day deadline expired. The chance of success was lowered by the fact that the converter required that each piece of audio be played in real-time before it could be converted. What likely would have taken a day or two with modern software stretched out over the entire month.
In the end, the Assistant University Archivist was able to export all of the WBRU audio, finishing mere hours before the converter ceased working. 692 individual mp3 recordings were made, containing almost all of the content created by the studio in 1997 and 1998, and the remaining half of the content from 1999. The files totaled 5.05 GB in size.
In addition to the massive increase in usability, in that the files were now the actual edited programs played on air instead of the various component pieces, the conversion represented a massive decrease in the size of the collection. In its raw form, as recovered from the CD’s, the collection was 32 GB in size, spread across 9,600 files.
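The scale of that decrease can be worked out from the figures quoted in this report. The short Python sketch below does nothing more than that arithmetic. It assumes decimal units (1 GB = 1,000 MB) and treats the reported totals as exact; the function name `reduction` is only illustrative.

```python
# Figures quoted in the report.
RAW_GB, RAW_FILES = 32.0, 9600   # collection as recovered from the CD's
MP3_GB, MP3_FILES = 5.05, 692    # collection after export to mp3

def reduction(before, after):
    """Fractional decrease from before to after."""
    return 1 - after / before

size_cut = reduction(RAW_GB, MP3_GB)
file_cut = reduction(RAW_FILES, MP3_FILES)

# Average file size in MB, assuming 1 GB = 1,000 MB.
avg_raw_mb = RAW_GB * 1000 / RAW_FILES
avg_mp3_mb = MP3_GB * 1000 / MP3_FILES

print(f"Storage reduced by {size_cut:.1%}")
print(f"File count reduced by {file_cut:.1%}")
print(f"Average file: {avg_raw_mb:.1f} MB raw, {avg_mp3_mb:.1f} MB as mp3")
```

On these figures the collection shrinks by roughly 84% in size and 93% in file count, while the average file grows from about 3.3 MB to about 7.3 MB, which fits with each mp3 being a whole program instead of a single take.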
Lessons Learned
Over the course of the project several lessons were learned, and other, existing, lessons were strengthened.
Format obsolescence: The first and largest lesson is that format obsolescence remains a problem that plagues all manner of digital collections. Users do not perform updates or migrations of their data, either to new media or from old formats to new formats. Indeed, it was impressive that so much had been retained in the old cabinets, and that there was only a 2% failure rate of the twenty-year-old CD’s.
Necessity of Old Hardware: When doing these kinds of digital preservation projects, having original hardware is often the difference between success and failure. The inability to open the Pro Tools session files while running Pro Tools in emulation was a potential critical failure for the project. If the project laptop had been newer and unable to boot directly into OS 9, the session files would have remained unopenable and the project would have failed.
Dangers of Coding Restrictions: The first attempt to open the Pro Tools session files consisted of copying the Pro Tools 4.0.1 program that had been used to create them from WBRU’s backup CD to the project laptop. Sadly, it was stopped by the built-in digital rights management, which required insertion of the original installation floppy disks to continue.
In fairness, we were doing precisely what that feature had been built to stop: copy the program from a backup to a new and completely different computer. However, there was no way to tell the program that it was twenty years later and this was an archive trying to preserve WBRU’s work, and not a pirate. If Macintosh Garden had not had a version of Pro Tools 5 free (and if DigiDesign hadn’t released a free version of their Pro Tools 5 software), the project would have failed.
Similarly, the inclusion of the 30-day trial limit on the mp3 converter added an unnecessary time crunch to the project. Archives staff would have preferred to complete the project over a more leisurely time period instead of being forced to make it the main priority. And as with the copy protection, there was no way of telling the program that it was twenty years later and that Avid wouldn’t be able to supply a code even if we sent them money.
The existence of such features on newly-released software is understandable, but ideally they should have some manner of sunset, so that after they become obsolete any potential barriers against using them to preserve what they created are removed.
Importance of Old Training Materials: Though not as mission-critical as the old hardware, the fact that the University Library had a professional, published guidebook to the software made learning what was required in Pro Tools 5 much easier. The existence of the book would have been even more important if the Assistant University Archivist did not already have some training in different non-linear audio and video editing programs. Considering the tight timelines of the project, without the instruction book the project would have come even closer to running over the arbitrary deadline before completion.
Conclusion
Digital preservation is a problem that anyone who works on a computer needs to take into consideration. While nearly all of the WBRU material on the CD’s was successfully exported, it was only through the specialized knowledge and equipment of a newly-hired staff member, and a great deal of work. Without that equipment and expertise, the material would have been written off as degraded or impractical to preserve, and then destroyed.
If you don’t care enough about your work to ensure that it is preserved, you cannot guarantee that anyone else will, either.
LARRY’S SUMMARY
Over the years I’ve written about the need to archive our projects and media assets for the long term. In my mind, that meant moving them from spinning hard disks to something more long-term, such as LTO tape.
But, simply moving media to a different storage medium is not the complete answer. As Paul made clear, the ability to recover media from the past requires five essential components:
Brown is a large, well-endowed and supported university. Yet, even with all its resources, it had no plans or ability to recover these stored assets. It took an individual with a 15-year-old computer sitting on a shelf, along with an archived version of Pro Tools that was 18 years old, plus a LOT of tinkering to get these files to play.
And these files are ONLY 20 years old! What happens in another 20 years? All our history is lost – stored on hardware we can’t access, using discontinued codecs and operating systems that died long ago.
Restoring the past rests far too much on chance, which is never a good method of preservation.
We are fortunate that Avid had the foresight to release a free edition of an early version of Pro Tools in 2000, which allowed these files to be opened, though a 30-day deadline to install, learn and use 18-year-old software is asking a lot.
I find it inexcusable that Apple has not created a utility that converts FCP 6 & 7 projects into XML to allow past projects to be restored. Since FCP 7 uses a proprietary file format, the ONLY people that can create this utility are Apple. Apple would say that there is no reason to waste engineering resources on a “dead project.” I would argue that nothing is more important than preserving our past.
As well, because Apple also makes the Mac operating system, there needs to be an officially sanctioned way of restoring earlier versions of the macOS to older hardware for the purpose of restoration. Apple has made no secret that macOS High Sierra would be the last version of macOS to run 32-bit apps without compromise. Apple has also said that the transition to 64-bit for macOS and macOS apps is still underway, so final transition dates have not yet been established.
This warning is timely, and we have been told that upcoming versions of the OS will warn us when they encounter an application that will not run properly. My concern is that we need to be alerted prior to the release of a new OS of what specifically won’t work. I still have very clear memories of the unexpected launch of Final Cut Pro X coupled with the unnecessarily sudden discontinuance of Final Cut Pro 7 and Final Cut Server. Transitions that cause major apps or codecs to be “end-of-lifed” need to be better handled than that.
Adobe and Blackmagic Design also need to find a way for us to save their software for the future. I understand the need for protecting proprietary information, but twenty-five years into the future, any cutting-edge technology of today will seem, at best, a humorous antique.
We HAVE to have a way to play older software and codecs on the hardware of the future. At a minimum, if you have assets you need to make SURE survive into the future, you need to:
I don’t know what the solution is – but unless we, as an industry, think about the problems of archiving and preservation much more critically, we are going to wake up with a file cabinet of unplayable, irreplaceable CD-ROMs, Zip drives, and Cinepak media and wonder what happened.
I’m forwarding a copy of this blog to the product development teams at Adobe, Apple, Avid and Blackmagic Design, in hopes that they, too, can start to think of ways to assure that we can keep our past accessible. I encourage you to share this with the friends and developers you know.
We stand at the precipice of losing our past. It would be a shame to lose it due to inaction.
UPDATE – May 13, 2018
A member of the Adobe product team got back to me after reading this article with a series of questions that are worth additional thought. He wrote:
Beyond dedicated archivists with rigorous backup and up-converting routines, what concepts do you have in mind? Should there be standard foundational formats that NLEs, DAWs, image editors, document layout packages, and animation tools should support for archive? Are the raw project formats crucial for long-term backup, or would the final product suffice for the most part? On a far out idea, what would it take for an AI to “recreate” an edit from the raw assets? (And could that learning be used to make new edits “in the style of” other editors?)
I replied that these were excellent questions. My goal, in writing this article, is to spark an ongoing conversation – because all of us are involved in the solution.
16 Responses to We Are On The Edge of Permanently Losing Our Past [u]
Dear Larry,
thank you again for this timely article and wake-up call for all of us.
For a production environment with potentially millions of files, one important thing would need to be added, and that is “findability”. File names and even project names might not be enough if someone who was not involved in the production years ago needs to find something.
One very important part here is the metadata schema and archive software that offers enough flexibility to customise it. Each production has its own needs and requirements, and I always encourage people to put in enough thinking and planning ON PAPER before implementing a metadata schema. Archive software that keeps all files on long-term storage (and helps to migrate them when new generations of media, such as LTO tape, appear) is a great tool for keeping files over many years.
We compiled helpful information on this and related topics in the FREE ebook titled:
“Data Management, Backup and Archive for Media Professionals”
available on the iBooks Store
https://itunes.apple.com/us/book/data-management-backup-archive/id850538526?mt=11
(If you’re on the lookout, check out Archiware P5 Archive http://www.archiware.com)
(Conflict of interest declared: I work for Archiware)
Thanks, Marc:
I respect your opinions.
Larry