Cloud AI Is The Wrong Place For The Media Industry

Posted by Larry

Guest commentary by Sam Bogoch, CEO, Axle AI. (Reprinted with permission.)

Sam Bogoch, CEO, Axle AI

As Big AI, typified by hyperscalers such as OpenAI, Meta, Microsoft and Google (and newer entrants like xAI's Grok), continues and even accelerates its exponential growth, it's an important time to ask whether this approach is the right one for the media industry.

This is a conversation that's distinct from the one about whether AI itself is a Good Thing, and what its role should be in the creative space. This one is about a different question: when AI does make sense, where should it be running?

Many industries assume that most of AI's power will continue to stem from the cloud providers who are aggressively leading the development of new Large Language Model (LLM) technologies, as well as newer, bigger approaches that may build on LLMs, or even cut against the grain of them, while consuming even more aggregate GPU power.

However, incredibly powerful AI tools can now be run on premise, using standard computer hardware like the latest Mac minis and Mac Studios, high-end gaming GPUs from Nvidia, and more. These tools can make massive contributions to creative workflows, and they are distinctly better than Big Cloud AI for the media industry on several key grounds – which I will simplify as Cost, Trust, Speed and Guilt.

Cost

(Image courtesy of Mikhail Nilov, Pexels.com.)

Cloud AI providers take advantage of the 'rent-seeking' aspects of the cloud economy to charge multiples of their underlying hardware cost for GPUs and storage. This means that in most cases you are either paying for heavily marked-up capabilities, often priced per minute of footage processed, or taking advantage of billions of dollars of short-term VC investment, which will force your supplier to make customer-unfriendly choices at some point in the future. Neither of these is very attractive.

Meanwhile, on-premise AI costs, well, what it costs. The hardware itself is getting dramatically better over time: the latest Mac minis measure 5 x 5 x 2 inches and can do real supercomputer-class AI, and much like the original PC-versus-mainframe wars of the '80s and '90s, this is a case where the little guys are likely to surprise us with their power over time. Those Mac minis, and affordable if sometimes clunky PC gaming machines with Nvidia cards, can already do a surprising amount of AI work, taking up far less space and power than you'd expect. Our own Axle AI Tags software can do scene understanding, vector search, trainable face recognition, object and logo recognition, optical character recognition (OCR) and speech transcription, all using only local resources with no cloud connection required. Meanwhile, compact versions of LLMs, and even a new breed of Small Language Models (SLMs), are now fully capable of running on local hardware as well.

Trust

(Image courtesy of Pixabay, Pexels.com.)

Cloud AI’s power, even the latest multimodal models, derives in large part from ‘scraping’ large parts of the public and semi-public Internet. When you send your media to a cloud provider, you are putting your intellectual property at risk. Exactly how much risk depends on your situation, and who your provider is. But as recent blowups over terms of service on Amazon’s Rekognition, Adobe’s Sensei, TikTok’s CapCut and most recently even WeTransfer have shown, cloud vendors seem to be edging towards a stance where they own the right to reprocess and train on your content. All four of these vendors have put out terms of service that assert significant rights to reuse your content for training purposes.

For nearly every content owner we work with at Axle AI, this is a bridge too far. Why expose your valuable IP to the possibility of being turned into someone else's product with no attribution? This is especially true for visual effects (VFX) and cinematographers' styles, which are notoriously hard to protect. But with the latest multimodal LLMs like Google's Veo 3, the risk can extend even to acting and dialogue. The possibilities are endless, and not in a good way. Content owners and creators are right to look for a way to benefit from AI's power while keeping control of their intellectual property, rather than the other way around.

Speed

(Image courtesy of Pixabay, Pexels.com.)

95% or more of media files are stored on premise. Typically, these are a mix of Network Attached Storage (NAS), loose hard drives, RAID arrays, LTO tape libraries, and occasionally high-performance storage area networks (SANs). With the increasing trend towards 4K and higher-resolution footage, these files are very large, and the bandwidth limitations of even today's high-speed internet mean that uploading them to the cloud for processing is necessarily slow and inefficient compared to local options, which can leverage wired 10 Gigabit Ethernet and even faster connections.
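To make the bandwidth argument concrete, here is a rough back-of-the-envelope calculation. The file size, link speeds and efficiency factor below are illustrative assumptions, not figures from the article:

```python
def transfer_hours(size_tb: float, link_gbps: float, efficiency: float = 0.8) -> float:
    """Estimate hours to move size_tb terabytes over a link of link_gbps
    gigabits per second, with a real-world efficiency factor applied."""
    size_gigabits = size_tb * 1000 * 8  # decimal TB -> gigabits
    seconds = size_gigabits / (link_gbps * efficiency)
    return seconds / 3600

# Hypothetical example: 10 TB of 4K footage over a 1 Gbps internet
# uplink versus a local 10 Gigabit Ethernet link.
cloud_upload = transfer_hours(10, 1)   # roughly 28 hours
local_copy = transfer_hours(10, 10)    # roughly 2.8 hours
```

Even with a fast symmetric internet connection, the upload alone takes more than a working day, while the same data moves across a local 10 GbE network in a few hours.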

For many media AI applications, lower-resolution proxy versions of the original media are actually a better fit for AI processing needs, and on-premise hardware can be used to generate these proxies. But once you have hardware in place to do all that work on hundreds of terabytes or even petabytes of media, why not upgrade it a bit and run the AI locally as well? This is the realization that many media companies are increasingly coming to.
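Proxy generation of this kind is commonly scripted around a transcoder such as ffmpeg. The sketch below shows one plausible way to build such a command; the paths, resolution and encoder settings are illustrative assumptions, not a workflow recommended by the article:

```python
from pathlib import Path

def proxy_command(source: Path, proxy_dir: Path, height: int = 540) -> list[str]:
    """Build an ffmpeg command that transcodes a high-resolution master
    into a small H.264 proxy suitable for local AI analysis."""
    out = proxy_dir / (source.stem + "_proxy.mp4")
    return [
        "ffmpeg", "-i", str(source),
        "-vf", f"scale=-2:{height}",          # downscale, preserve aspect ratio
        "-c:v", "libx264", "-crf", "28",      # small, visually adequate proxy
        "-preset", "fast",
        "-c:a", "aac", "-b:a", "96k",
        str(out),
    ]

# Hypothetical paths for illustration:
cmd = proxy_command(Path("/media/masters/shot001.mov"), Path("/media/proxies"))
```

The resulting command list can be handed to `subprocess.run(cmd)` on the same on-premise hardware that holds the storage, so the full-resolution masters never leave the building.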

Guilt

(Image courtesy of Pixabay, Pexels.com.)

Recent headlines have included “Meta will build data center the size of Manhattan with latest AI push”, “Google inks $3bn US hydropower deal as it expands energy-hungry data centers”, and “Three Mile Island Is coming back to life – to power Microsoft’s AI”. And this is all for what amounts to the first generation or two of Cloud AI. You don’t have to be a hardcore environmentalist to be concerned about the long-term impact as these trends continue to accelerate. We will all pay for that impact, either in environmental costs or in resources diverted from other, potentially better goals.

Many people in the media industries have chosen to do what they do in part because it’s not harmful. There are other professions where ethical tradeoffs have to be made, lives put at risk, and major resources squandered; media work has largely been spared these concerns. Suddenly, media organizations, and the even larger, rapidly growing media portion of every part of the economy (including sports, houses of worship, corporations and government), will have to reckon with the environmental impact of their work. But it doesn’t have to be that way. By focusing on powerful, increasingly efficient local computing power to run its AI, the creative world can continue to deliver its value without harming the world at large.

Admittedly, there are benefits to be gained from using Cloud AI, and the latest multimodal and generative AI models will continue to be developed there. Nobody is suggesting an outright ban or boycott on this kind of solution. But given the clear advantages of on-premise AI, and the combination of immediate (cost, trust and speed) and big-picture (guilt) benefits of choosing local processing, the media industry has a sensible, clear path forward – and it’s not the cloud.



6 Responses to Cloud AI Is The Wrong Place For The Media Industry

  1. Mike Jan says:

    AI is a scam. It is this year’s metaverse. It’s the Edsel, New Coke and iOS’s “bump” feature rolled into one huge energy-hogging lump of excrement. Oh, I’m sure some use will come out of it, but it will never achieve a decent cost/benefit ratio.

    But don’t listen to me, read this guy: https://www.wheresyoured.at

    • Larry says:

      Mike:

      So, um, do you have an opinion? … smile.

      I think we need to differentiate between machine learning and generative AI. There’s a difference between tools that create masks, remove background sound, split video into individual scenes, or transform horizontal video into vertical, and tools that generate images and text out of whole cloth. Personally, based on my experience, I think tools based on machine learning can be and are extremely useful. Apple, Adobe and Blackmagic have implemented some nice ones.

      Where I am in complete agreement with you is the unrealistic hype surrounding generative AI – text, images and sound – along with its partner “general intelligence.” (Though, these days I’m also questioning the existence of “human intelligence.” But that’s a different subject.)

      Companies that burn a billion dollars a month, steal whatever they need, resist any calls to improve accuracy and generate results which are only 70-80% accurate are not worth the soap box they are standing on. Wall Street may value them, I don’t.

      Larry

  2. Dan Katz says:

    AI isn’t a scam, it’s a set of tools and techniques. Like any other tech, or weapon for that matter. And I’m maximizing my education to explore and understand these tools. Already I’ve found many that accelerate my creative process, fill in blanks that traditionally I couldn’t complete myself or in a reasonable time. I mean, it’s becoming infinite what’s possible. And while many of my colleagues want to shit on AI or dismiss it, I’m embracing it. And my clients are expecting me to. I’m 55 this month. I don’t have time to fug around. Much like I did when non-linear grabbed hold in the late 90’s and my online buddies said, “nahhh I’m good” I opened up. They stopped getting work. Adapt or… you know the rest.

    • Larry says:

      Dan:

      I don’t necessarily disagree. I remain generally opposed to AI that seeks to be “creative” – especially to replace jobs. At best, that type of AI is iterative and can only do what’s been done before. But I am increasingly impressed with simple tasks that can be done better using machine learning. As you point out, ML can accelerate your creative process without replacing you.

      Larry

  3. Dan Katz says:

    All our tools are iterative, no? We’re all standing on the shoulders of giants.

    • Larry Jordan says:

      Dan:

      Not in the sense you mean.

      Spreadsheets allow us to calculate math faster, but we still need to enter the data.

      Word processors allow us to write faster, but don’t tell us what words to use.

      Machine learning tools allow faster rotoscoping, faster removal of background noise, and more accurate color correction. In other words, these tools use our input to process something faster. In all these cases, the artist is still firmly in control.

      Generative AI pretends that it can do the whole process: create the idea, then create the text/image/sound based on that idea. Its goal is to replace the artist. Look at the number of companies rushing to implement Gen AI so they can reduce headcount. Gen AI can only do what it has been trained to do; it is firmly entrenched inside the box. Creative thought is the ability to think outside that box and create something unique.

      Larry

