Product Review: Simon Says On-Prem. AI Transcription for Mac and PC

Posted by Larry

[ Read my product review disclosure statement here.]

I am not a fan of The Cloud. I’ll use it when I need to, but it is notoriously insecure, and bandwidth issues always get in the way of media projects.

Still, the web was the only option to get high-speed transcripts of media files – until the team at Simon Says released their on-premise version of the high-quality speech-to-text transcription software last Thursday: Simon Says On-Prem.

NOTE: For this review, Simon Says gave me a pre-release version of their system along with 5 free hours of transcription.


Simon Says On-Prem provides high-quality speech-to-text transcription for Avid, Adobe and Apple media software in a secure environment that runs on any Mac you have lying around.

Files transfer at network speeds. Transcription is fast and accurate, even on older systems, and much faster with newer hardware. The results can be easily transferred into Premiere or Final Cut, or exported as stand-alone text files.

The system requires a Mac or PC with an Intel processor to run the Linux-based virtual machine (VM), which uses artificial intelligence to do the actual transcription. In my case, this ran smoothly on a 2013 iMac. The Mac hosting the VM remains available for other tasks, though for this review I just had it running the VM. All Macs need to be connected via a local network.

If all you need is an occasional transcript, this system is overkill. But if you need an easy-to-use application that generates text transcripts day in and day out, accessed by one or more editors, with security that meets the stringent requirements of a studio, Simon Says On-Prem is a solid choice. Best of all, it’s a fraction of the cost of competing legacy apps.

NOTE: Here’s a Simon Says blog on how to use this system in high-security environments.

System Requirements

Product: Simon Says On-Prem
Developer: Simon Says
Price: $1,995 (US), which includes 100 hours of transcription time
Additional time available in blocks of 100 hours. Quantity discounts available.

NOTE: While this system doesn’t have a free trial available, you can easily see how Simon Says works by downloading their Cloud-based app or using their website.


There are two sides to installation:

Installing the Simon Says application, which is a different application from their Cloud-based Simon Says Transcription app, is straightforward and similar to installing any other Mac app.

Installing the Virtual Machine is trickier, especially for those, like me, who have not installed a VM before. You WILL need to carefully follow the user manual to avoid getting lost during the installation process, which will take about an hour.

Once the VM is installed and running, you can ignore it. While the VM does all the work behind the scenes, everything you need to control is done from the client interface.

NOTE: I had a lot of problems installing a beta version of the VM – the software worked perfectly, but the user guide was far too confusing. Based on my comments, Simon Says updated their user manual so that the installation process should now be much clearer.

Once the VM is running, you’ll see this screen. The only number you need is the IP address (red arrow). This will vary by network. After you write down the IP address of the VM system, you can ignore this screen totally. However, the VM needs to be running in order for transcription to occur, so don’t shut this system down.

Start the client application and you’ll get a window to activate the license. After you enter the license key, you’ll also note that the software is not yet connected to the VM.

To connect it, select Simon Says > Preferences > Configure Service IP and enter the IP address you wrote down from the VM system.

Quit and restart the Simon Says app, not the VM, and it will automatically connect to the VM. Computers need to be on a local network, but they can be air-gapped from the Internet for additional security.

Air-gap. A computer attached to a local network, but not connected to the Internet; there is “air” between the computer and the Internet.

Once the client is connected, you are ready to start transcribing.


You don’t need to use this dashboard at any time, but it shows what the VM is doing under the hood. There’s a lot going on. Note that when a transcript is not running, the VM takes about 15% of total CPU and RAM. You access this dashboard using your browser.


There are three principal ways to use transcripts:

Simon Says supports all three and the transcription process is the same for all. Here’s how it works.

Click the Plus button in Simon Says and create a new project.

Select the language of your clips. (This system does not do translation.) There are about 30 languages to choose from; see the Simon Says website for the latest list.

Drag as many clips as you need to transcribe into the interface. Each clip generates its own transcription file.

The top left corner shows project statistics. In this example, I’m transcribing 5:43 of media, which Simon Says, like all transcription services, rounds up to the nearest minute.
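To make the rounding concrete, here is a minimal Python sketch of the arithmetic. The function name is my own, and I'm assuming the rounding applies to the total shown in the project statistics; I don't know whether Simon Says rounds each clip separately or the project as a whole.

```python
import math

def billable_minutes(duration: str) -> int:
    """Round a duration written as M:SS or H:MM:SS up to whole minutes.

    Hypothetical helper for illustration; not part of Simon Says.
    """
    seconds = 0
    for part in duration.split(":"):
        # Each colon-separated field is worth 60x the one after it.
        seconds = seconds * 60 + int(part)
    return math.ceil(seconds / 60)

print(billable_minutes("5:43"))  # prints 6
```

So the 5:43 of media in this example counts as 6 minutes against the purchased block of hours.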

When all looks good, click Transcribe.

The system asks for confirmation that this is what you really want to do.

Once you confirm, the animated parrot gets to work transcribing your text.


Behind the scenes, the VM uses its AI-based system to transcribe your media. It does so without accessing the Internet. This software is running on a 4-core i7 iMac from 2013 and yet is only using 66% of total CPU resources and less than 17% of total RAM.

NOTE: Viewing this screen is completely unnecessary for the transcription process. But I’m a geek – I love looking at this stuff. It is hidden by default.

Total transcription speed was about 2x real time. It amazes me that an iMac this old can deliver that level of performance, and with complete security.
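For planning purposes, that speed turns into a simple estimate. This sketch assumes "2x real time" means the system works through media twice as fast as it plays back, and that the rate stays constant. Both are my assumptions, and newer hardware should do better.

```python
def estimated_processing_minutes(media_minutes: float, speed_factor: float = 2.0) -> float:
    """Estimate transcription time for a given amount of media.

    speed_factor is how many times faster than playback the system runs.
    The default of 2.0 is the rough figure I saw on a 2013 iMac.
    """
    if speed_factor <= 0:
        raise ValueError("speed_factor must be positive")
    return media_minutes / speed_factor

print(estimated_processing_minutes(60))  # prints 30.0
```

Under those assumptions, an hour of interviews would take roughly half an hour to transcribe on similar hardware.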

When the transcripts are done, the system alerts you.


And the transcript is displayed for editing or export, the same as their web-based app.


Here’s a completed project in Final Cut (the concept is the same in Premiere). After everything is in place, turn off all sound effects and music (using roles in FCP X, or by muting tracks in Premiere). The goal is to feed the cleanest possible dialog to Simon Says.

Export a ProRes Proxy master file. Video quality is ignored so the proxy format won’t matter. However, all versions of ProRes export uncompressed audio, which is what I want for the highest audio quality. H.264 compresses audio into AAC, which may decrease transcription accuracy.

Then, as illustrated in the section above, drag this master file into Simon Says and transcribe it.


Here’s the finished transcript. We can play it in real time, edit it, add speaker names and export it. Timecode is tracked for each word; however, it is only displayed at the start of a paragraph.

NOTE: If we exclude punctuation, this transcript was perfect. If your clips have good audio, without background noise or music, and avoid jargon or specialized acronyms, then, according to Simon Says, we should expect 90-95% accuracy.


There are four options for exporting text in the version I was provided:

NOTE: The released version also supports exporting to Premiere markers.

I don’t own Media Composer, so I can’t demo that. But here are the other three options.

Export Text

When you click Export > Text nothing seems to happen. In fact, the application doesn’t even present a Save dialog. So, where did the text go?

As a ZIP file in your Downloads folder! No, this doesn’t make any sense and the application doesn’t tell you it did anything. But, once you know where to look, you’ll find your exported files here.
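If you export often, hunting through Downloads gets old. Here is a small Python sketch that finds the most recent ZIP in a folder and unpacks it. The function names are my own, and since I can't say what file name Simon Says uses, the sketch simply picks the newest ZIP by modification time.

```python
import zipfile
from pathlib import Path
from typing import List, Optional

def newest_zip(folder: Path) -> Optional[Path]:
    """Return the most recently modified ZIP in folder, or None if there is none."""
    if not folder.is_dir():
        return None
    zips = sorted(folder.glob("*.zip"), key=lambda p: p.stat().st_mtime)
    return zips[-1] if zips else None

def extract_zip(archive: Path, destination: Path) -> List[str]:
    """Unpack archive into destination and return the names of the files inside."""
    destination.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(destination)
        return zf.namelist()

# Example use (the "Transcripts" folder name is just an illustration):
#   archive = newest_zip(Path.home() / "Downloads")
#   if archive is not None:
#       print(extract_zip(archive, Path.home() / "Transcripts"))
```

Run it right after an export so the newest ZIP is the one Simon Says just created.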

Here’s what the exported text file looks like, with timecode references at the start of every paragraph.

Export Captions to Final Cut Pro X

When you export to FCP, you can choose between captions and ranges. Both export ZIP files and store them in the Downloads folder. (As a note, I couldn’t figure out how to import this text as a range. That is probably user error.)

CAUTION: Regardless of which option you choose, both FCP X exports use the same file name, which means that if you export both, the later export will overwrite the earlier one.
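One workaround is to move each export out of Downloads, under a distinct name, before exporting again. The following Python sketch shows the idea. The function name, the labels and the example file name are all hypothetical; substitute whatever name Simon Says actually gives the ZIP.

```python
import shutil
from pathlib import Path

def keep_export(exported: Path, label: str, archive_dir: Path) -> Path:
    """Move an exported file into archive_dir, adding label to its name.

    If a file with that name already exists, a counter is appended
    so nothing is ever overwritten.
    """
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / f"{exported.stem}-{label}{exported.suffix}"
    counter = 1
    while target.exists():
        target = archive_dir / f"{exported.stem}-{label}-{counter}{exported.suffix}"
        counter += 1
    shutil.move(str(exported), str(target))
    return target

# Example use (hypothetical file name):
#   downloads = Path.home() / "Downloads"
#   keep_export(downloads / "MyProject.zip", "captions", downloads / "SimonSays")
#   ...export again as ranges, then:
#   keep_export(downloads / "MyProject.zip", "ranges", downloads / "SimonSays")
```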

To import these as closed captions:

Export SRT

If you are using Premiere, export the transcript from Simon Says as SRT. Then, in Premiere choose File > Import and select the SRT file.

The captions show up in the Captions panel, ready for use.
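Because SRT is plain text, it is also easy to inspect or reuse outside an NLE. An SRT file is a series of blocks separated by blank lines: a cue number, a timing line such as 00:00:01,000 --> 00:00:03,500, then one or more lines of caption text. Here is a minimal Python sketch of a parser for that layout. The names are my own and the sample captions are invented for illustration. I have not checked whether Simon Says adds anything beyond the basic format.

```python
import re
from dataclasses import dataclass
from typing import List

# Matches a timing line like "00:00:01,000 --> 00:00:03,500".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)

@dataclass
class Cue:
    index: int
    start: float  # seconds from the start of the clip
    end: float
    text: str

def _seconds(h: str, m: str, s: str, ms: str) -> float:
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000

def parse_srt(content: str) -> List[Cue]:
    """Parse basic SRT text into cues, skipping blocks that don't fit the layout."""
    cues = []
    # Drop a leading byte-order mark, then split on blank lines.
    for block in re.split(r"\n\s*\n", content.lstrip("\ufeff").strip()):
        lines = block.splitlines()
        if len(lines) < 2 or not lines[0].strip().isdigit():
            continue
        match = TIMING.match(lines[1].strip())
        if match is None:
            continue
        g = match.groups()
        cues.append(
            Cue(int(lines[0].strip()), _seconds(*g[:4]), _seconds(*g[4:]), "\n".join(lines[2:]))
        )
    return cues

# Invented sample text, not real Simon Says output.
SAMPLE = """1
00:00:01,000 --> 00:00:03,500
Hello and welcome.

2
00:00:03,500 --> 00:00:06,000
Today we review
a transcription tool.
"""

for cue in parse_srt(SAMPLE):
    print(cue.index, cue.start, cue.end, cue.text.replace("\n", " "))
```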


Simon Says supports VPNs. For example, if a team of editors are each working from home, they can send files and receive transcripts securely via a VPN to this system running on the corporate network.


While this is an excellent package overall, there are a few things I’d like to see improved:


Simon Says On-Prem is a secure, fast, and easy-to-use transcription system that supports all major NLEs. It’s designed for editors needing a high volume of accurate transcripts with absolute security.

Once installation is complete, creating, editing and exporting transcripts is simple.
