HEARING IS NOT THE SAME AS UNDERSTANDING
Automated transcription doesn’t meet most needs yet.
Software can capture up to 90% of words spoken in some environments, but that isn’t good enough. SCRIBE lets you efficiently edit and format an auto-transcript from any source for a complete, accurate document.
Many SCRIBE users have access to free or fairly low-cost automated transcription programs, but are frustrated by their limitations. These apps do best in low-noise environments with a single speaker using high-quality recording equipment. Most auto-transcription apps struggle with recordings made in real-world situations with background noise, or made using pocket recorders or smartphones.
People talk over one another in conversation. They use slang, pop culture references, technical terms, and proper names. They sometimes have regional or foreign accents that the transcription app can’t understand, or they toss in words from other languages.
Transana SCRIBE lets you quickly import your media file and your auto-transcript, then use our handy keyboard shortcuts to zip through and make sure that last essential 10% or more is accurately transcribed. Add the punctuation and spacing, format the document, and use autocomplete to instantly add the speaker names.
Most automated transcription is generated as run-on text, with no punctuation.
Proper nouns, jargon specific to a certain profession or subculture, and words in another language are usually not recognized by the software.
Here is an automated transcript made in nearly ideal circumstances. The video is recorded in a quiet environment with professional equipment. There is a single speaker. The resulting auto-transcript is excellent, but is still not intelligible without a human editor.
Automated Transcript:
mattingly then dropped his rifle empty rifle to the roadway and reached for a grenade and tossed it over into that emplacement whereupon there were four Germans rose up and threw their hands up and then Mattingly picked up his empty rifle stood up and covered these individuals and ordered amount of the emplacement all of this happened perhaps in the space of 15 seconds 20 at the most and as Mattingly was motioning the Germans to come forward out of the Gundam placement on the north side of the road five Germans rose up out of a concealed gun emplacement on the south side of the road by name and here he had captured nine Germans two machine guns in the space of about as I say 15 or 20 seconds and I I was I was absolutely amazed at how he conducted himself it was almost as if he was a robot and he was decorated with a Silver Star for that we continued on across the causeway and me and as we were about halfway across the causeway we witnessed a smoke explosion or never a smoke bomb smoke grenade actually were set off in the churchyard at kokanee church and that that signified to us that there were friendly troops up there because that was one of the signals that we had yellow smoke men friendly troops so we continued quickly across source water bringing up the main body of our group which is I saved probably 70 75 people and we got to the crossroads actually it was the more of a lion the road where the where the main road from San mayor gliese turned south westward to go to pick oval and and then another road took a went straight straight by the churchyard and went to am free bill a German ambulance came up the road from Pikeville made her turn to the west and he had the rear doors that the ambulance open we could see that there were some American uniforms casualties aboard dead ambulance and also German casualties German uniform people laying in the floor of the hamlet the ambulance made the turn toward am prevail and halted and and for maybe about two or three minutes and of course we did not bother the ambulance because it’s headed for an aid station and had American soldiers at American uniformed bodies in it so as we were deliberating on what our next move was going to be we heard the rumble of Tanks
Credit: National WWII Museum
TransanaSCRIBE:
Mattingly then dropped his rifle, his empty rifle to the roadway and reached for a grenade and tossed it over into that emplacement. Whereupon there were four Germans rose up and threw their hands up. And then Mattingly picked up his empty rifle, stood up and covered these individuals and ordered them out of the emplacement. All of this happened perhaps in the space of 15 seconds. 20 at the most. And as Mattingly was motioning the Germans to come forward out of the gun emplacement on the north side of the road, five Germans rose up out of a concealed gun emplacement on the south side of the road behind him and here he captured nine Germans, two machine guns in the space of about 15 or 20 seconds. I was absolutely amazed at how he conducted himself. It was almost as if he was a robot. And he was decorated with a Silver Star for that.
We continued on across the causeway and as we were about halfway across the causeway we witnessed a smoke explosion on, a smoke bomb or a smoke grenade actually, was set off in the churchyard at Cauquigny church. And that signified to us that there were friendly troops up there because that was one of the signals that we had. Yellow smoke meant friendly troops. So we continued quickly across there, Schwartzwalder bringing up the main body of our group which as I say is probably 70, 75 people. And we got to the crossroads, actually it’s more of a Y in the road, where the main road from Sainte Mère Église turned south-westward to go to Picauville. Another road went straight by the churchyard and went to Amfreville.
A German ambulance came up the road from Picauville, made a turn to the west, and they had the rear doors of the ambulance open. We could see that there were some American uniforms, casualties aboard that ambulance, and also German casualties, um German uniform people laying in the floor of the ambulance. The ambulance made the turn toward Amfreville and halted. And for maybe about two or three minutes and of course we did not bother the ambulance because it was headed for an aid station and had American soldiers in it, American uniformed bodies in it. So as we were deliberating on what our next move was going to be, we heard the rumble of tanks.
The voices of multiple speakers in conversation are usually quite easy for a human listener to recognize, but automated transcription software is seldom able to sort out different voices and label them.
Overlapping speech is poorly transcribed or ignored.
This video was made in a professional radio studio and broadcast live to tens of thousands of listeners. The banter between five speakers was easily understood by human ears, but the auto-transcription app was unable to differentiate the overlapping speech, and the resulting 17-minute transcript is gibberish. Here’s the first minute of the auto-transcript and the finished ‘human’ transcript, using SCRIBE:
Automated Transcript:
I was up late counting last night and the number that I ended up with was over 17 million albums sold and seven Grammy nominations ladies and gentlemen they are back on the Kevin Dean join guys huh how long has it been a couple years maybe since we saw you oh my god three years I think hasn’t really been that long think so I don’t know next question I just feel like we have we probably have missed so much what have you what have you guys been up to well the new album of course oh wait what oh wait I’m saying I want a new album
Credit: KROQ
TransanaSCRIBE:
Interviewer 1: I was up late counting last night and the number that I ended up with was over 17 million albums sold and seven Grammy nominations. Ladies and gentlemen they are back on the Kevin & Bean show on KROQ. [Shirley laughs]
Give it up for our friends in Garbage! [band members clap, cut to DJs clapping]
Band members: Woo hoo!
Interviewer 1: Yeah.
Interviewer 2: Guys how, how long has it been? A couple years maybe since we saw you?
Shirley Manson: Oh my god!
Interviewer 2: Three years I think.
Interviewer 1: Has it really been that long?
Interviewer 2: I think so.
Shirley Manson: I don’t know. Next question! [laughter] We’re all too old to answer that question.
Interviewer 2: Shirley just wants us to move right along! [laughter]
Interviewer 1: I just, I just feel like we have, we have probably missed so much. What have you, what have you guys been up to?
Interviewer 2: Well the new album, of course. Oh wait what?
Interviewer 1: Oh wait.
Butch Vig: We’re working on a new album.
Interviewer 2: You are? [sarcastically]
Shirley Manson: Oh, guys!
Woman off camera: Wow. That was harsh, Kevin.
Shirley Manson: That was harsh. Cold room.
Interviewer 2: I’m saying I want a new album.
Credit: TEDxStPeterPort
A professionally recorded and produced TEDx talk with a single speaker provides an ideal example of the recorded human voice. In the automated transcript, every word is there, but in a long string with no punctuation. The reader struggles to understand and ‘hear’ what the speaker is saying, without the pauses and flow provided by the human transcriber.
Automated transcript:
according to the official NHS statistic over 1 in a hundred people in the UK are autistic there are roughly 200 people here today so there should be at least one other but that’s not how statistics work of course not only are they always changing but they’re different wherever you go and people aren’t always gathered into one room in the US for example the official statistic is one in 68 children but a more recent estimate guesses it to be one in 45 of course this also doesn’t count the amount people who go undiagnosed and only recently as research being done into late diagnosis and also people who are on private institutions and aren’t on public record so this is my autism I love music in this way I am like most teenagers I cannot stop thinking about music in that way I am not so much like most of the teenagers I struggle with school that’s not uncommon but I also can’t stand direct sunlight or bright white rooms this time last year I was getting migraines three times a week and even when I stopped testing three times a week I kept saying I did I can’t follow vague instructions which means that clean your room or make dinner doesn’t really work with me and I can’t cope with changes in routine or unplanned events they are my nightmare I don’t like heat or cold or foods I don’t know or textures I don’t like I cannot start things and when I do I tend to stop them I don’t like large crowds or parties and I cannot stop thinking about music but there’s another side to this I’m a liar and I can understand sarcasm I can search for subtext though it may not come naturally
SCRIBE’s tools let you make a fast ‘one-and-done’ pass through any media file and automated transcript to correct it to 100% accuracy.
TransanaSCRIBE:
According to the official NHS statistic, over one in a hundred people in the UK are autistic. There are roughly 200 people here today, so there should be at least one other. But that’s not how statistics work, of course.
Not only are they always changing but they’re different wherever you go, and people aren’t always gathered into one room.
In the US for example the official statistic is one in 68 children, but a more recent estimate guesses it to be one in 45.
Of course this also doesn’t count the amount people who go undiagnosed and only recently as research being done into late diagnosis, and also people who are on private institutions and aren’t on public record.
So this is my autism.
I love music. In this way I am like most teenagers. I cannot stop thinking about music. In that way I am not so much like most other teenagers. I struggle with school, that’s not uncommon. But I also can’t stand direct sunlight or bright white rooms. This time last year I was getting migraines three times a week. And even when I stopped getting them three times a week, I kept saying I did.
I can’t follow vague instructions, which means that clean your room or make dinner doesn’t really work with me, and I can’t cope with changes in routine or unplanned events. They are my nightmare.
I don’t like heat, or cold, or foods I don’t know, or textures I don’t like.
I cannot start things, and when I do I tend to stop them.
I don’t like large crowds or parties, and I cannot stop thinking about music.
But there’s another side to this. I’m a liar and I can understand sarcasm. I can search for subtext though it may not come naturally.