
January 11, 2006



Audio Transcription For Podcasts: JC Human vs. CastingWords.com

 

CastingWords is a nifty new online service which allows anyone to submit an audio podcast for immediate transcription into text.

microphone_classic_350.jpg
Photo credit: Kryptos

The key innovations of this very affordable new service (see my comparison with an established broadcast transcriber further down) are the easy, efficient automated submission process, the integrated cost estimate, and the RSS- and email-based workflow notifications.

Podcasters only need to have the URL of their podcast RSS feed and a PayPal account to pay the transcription fee.

The transcription is done with a very short turnaround, and the service notifies you as soon as the transcription process is complete, delivering your formatted transcription in three different output formats.

CastingWords.com does not employ automated transcription technology to transcribe podcasts but uses a fully human network of supporting editors.

What may not appear as exciting to the conscientious and ethical online publisher is that, under this promising surface, CastingWords is also a beta podcast search engine which may eventually derive its key strength from the very service it provides for a fee.

That is: CastingWords asks me for money to transcribe my podcast, but then it uses those text transcriptions as the core index database of its podcast search engine, which is in turn ad-supported.

If in the near future the CastingWords search engine is free to all users, and it shares some of its possible future search ad revenues with the podcasters who have paid for transcriptions, then this is a great approach to extending the reach of audio content while enabling more podcasters to do a better content publishing job. But if CastingWords wants to make extra money by leveraging user-paid-for content to create an ad-based search engine, then maybe someone should step up and say something to these guys.

For now, it is difficult to judge, as the CastingWords search facility is not even able to find my own podcast content when searching for Rome, and Italy, two pretty unique words I always use at the beginning and end of my audio interviews.

To help everyone get an idea of how useful this service can be, when compared to using a traditional editor to transcribe your audio podcasts, I have taken the time to test and compare CastingWords.com with my established human transcription resource, a professional TV broadcast transcriber.

Without telling either one that I was doing a comparison review, I sent both a 30-minute audio interview with Eric Goldstein of Clipmarks that I had very recently published.

Here is what I, with the help of assistant editor Matthew Guschwan, have been able to find out:


another_waveform_sibaudio.jpg
Photo credit: Jake Levin

Transcription Comparison

JG Human vs. CW CastingWords

This summary evaluation is based on a few key areas of comparison:

Editorial

1) Basic formatting

2) Use of punctuation

3) Spelling errors

4) Fidelity to original recording

Technical

5) Output formats

6) Costs

7) Turnaround time


1) Basic Formatting

CastingWords clearly indicated the change of speaker in the interview by using "headings" containing the name of the person speaking such as:
Robin Good:
Eric Goldstein:

JC indicated a change in speaker by simply starting a new paragraph.

The additional use of clearly labeled headings did save us time in getting the transcription online.



2) Use of Punctuation

There were several major differences in punctuation between the two services tested.

For lists of items, CastingWords used commas (father, mother, nurse, babysitter) to separate the words while JG used a series of slashes (father/mother/nurse/babysitter).

For our purposes, we prefer the more traditional use of punctuation, which for us is commas.

On the other hand, CastingWords did not use commas nearly as often as JG did. In transcribing spoken words, commas are very important in trying to recreate the conversation, and for clarifying points. In this regard, JG’s transcription was clearly superior.



3) Spelling Errors

Both services made several spelling mistakes. It must be said, though, that audio podcasts are often not of the highest audio quality, and many of my interviewees, and I myself, are not native English speakers, so our pronunciation of certain words may mislead the transcriber. Also, transcribers are rarely familiar with the topics and technologies you may be covering, which causes a multitude of misinterpretations that at times may be very difficult to catch. One example that illustrates them all is "creative comments" transcribed in place of Creative Commons. It is evident that an editor unaware of CC could easily fall into this type of trap.

CW in particular passed along one word (maticion) that should have been easily caught by any standard English spell checker.

There were also instances in both transcriptions wherein words were misused, or spelled inconsistently.

For example, the spelling of homepage as ‘home page’ or bookmark as ‘book mark’ will not set off a basic spell checker, so the human element becomes more important.

These types of insidious errors can be more difficult to detect. As mentioned above, it is not common to have an editor who is up to date with new media and who knows how to properly write del.icio.us or, as in the case of CW, to capitalize “Flash.”

Other differences were that CW used “bookmark” throughout, yet used the terms “book marking” for “bookmarking” twice and “book marked” for “bookmarked” once. JG used “bookmarking” and “bookmarked” throughout.
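
As a rough illustration of how split compounds like these could be flagged before publishing, here is a minimal Python sketch. The glossary and the function name are hypothetical and are not something either service provides; the glossary holds only the two terms mentioned above and would need to be extended by hand for each topic.

```python
import re

# Hypothetical glossary: preferred spelling -> split variant that a basic
# spell checker accepts because both halves are valid English words.
SPLIT_VARIANTS = {
    "homepage": "home page",
    "bookmark": "book mark",
}

def find_split_compounds(text):
    """Return (found_text, preferred_stem) pairs in order of appearance."""
    hits = []
    for preferred, split in SPLIT_VARIANTS.items():
        # Allow any whitespace between the halves, and a trailing suffix so
        # that "book mark" also catches "book marking" and "book marked".
        body = r"\s+".join(re.escape(part) for part in split.split())
        pattern = re.compile(r"\b" + body + r"\w*", re.IGNORECASE)
        for match in pattern.finditer(text):
            hits.append((match.start(), match.group(0), preferred))
    hits.sort()  # order by position in the text
    return [(found, preferred) for _, found, preferred in hits]

if __name__ == "__main__":
    sample = "I book marked it and use book marking on my home page."
    for found, preferred in find_split_compounds(sample):
        print(f"found '{found}', glossary prefers '{preferred}...'")
```

A check like this only catches the variants listed in the glossary, so it complements a human proofreader rather than replacing one.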



4) Fidelity to original recording

There are subtle differences between the transcripts, such as a change of phrasing or a slight change of word order. These changes are harmless if they preserve the meaning of the statement, and can even be helpful if they improve the flow or the clarity of the original spoken statement.

This is not the case when even one or two words are unjustifiably dropped, as here, where JG wrote "Clipmarks is about allowing you to easily save the specific information" while CW omitted "easily" altogether from this sentence.

In writing instructions, it is useful to have the instruction within quotation marks. E.g.: "click here" vs. click here. JG always included quotation marks while CW did not.

A more serious error occurs when the transcription follows the recording too literally, and hesitations or repetitions by the original speaker are not removed.

Other instances showed quite differing transcriptions of the same short passage:
CW “You’ve been continuously you’ve been adding in the recent weeks”
JG “You have been continuously adding in the recent week?”
In other cases, the essential meaning of the phrase changes because of a misplaced verb or adjective. Here is an example where a ‘do’ in place of a ‘don’t’ creates a serious problem in the final transcription.

JG wrote: "If you don’t want your clip marks to be seen by other people, you simply don’t check the box..."

CW wrote: "If you don’t want your Clipmarks to be seen by other people you just click the checkbox..."



5) Output formats

JC has been providing me with text-only output, which gets posted to a private wiki workspace to which we both have access.

CastingWords automatically provides the transcribed output in three different formats, which are accessible online via a standard URL:
1) RTF
2) HTML
3) Plain Text



6) Costs

CastingWords costs 42 cents a minute. Period.

JC, my fully human transcriber, charges me $20 for anything under one hour. No matter if it is 25 mins or 55, she bills me $20 for each one.

So the cost difference for this specific article was the following:
CastingWords: $14.28
JC: $20.00

Both accept direct electronic payments via PayPal.
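
The cost figures above can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: the function names are made up for illustration, billing is assumed to be by whole minutes, and the flat fee is modeled only for recordings under one hour because no price is given here for longer ones. At 42 cents a minute, the $14.28 charge corresponds to 34 billed minutes.

```python
CW_CENTS_PER_MINUTE = 42   # CastingWords rate quoted above
FLAT_FEE_CENTS = 2000      # JC's flat fee for anything under one hour

def castingwords_cost_cents(minutes):
    """Per-minute pricing, assuming whole billed minutes."""
    return CW_CENTS_PER_MINUTE * minutes

def flat_fee_cost_cents(minutes):
    """Flat pricing; only defined for recordings under one hour."""
    if minutes >= 60:
        raise ValueError("no price is stated for recordings of an hour or more")
    return FLAT_FEE_CENTS

def first_minute_flat_fee_is_cheaper():
    """Smallest length under an hour at which the flat fee wins."""
    for minutes in range(1, 60):
        if castingwords_cost_cents(minutes) > flat_fee_cost_cents(minutes):
            return minutes
    return None

def dollars(cents):
    """Format an integer number of cents as a dollar amount."""
    return f"${cents // 100}.{cents % 100:02d}"

if __name__ == "__main__":
    print("CastingWords, 34 minutes:", dollars(castingwords_cost_cents(34)))
    print("Flat fee, 34 minutes:", dollars(flat_fee_cost_cents(34)))
    print("Flat fee becomes cheaper at", first_minute_flat_fee_is_cheaper(), "minutes")
```

Under these assumptions, the per-minute price is the cheaper option for recordings up to 47 minutes, and the flat fee wins from 48 minutes up to the one-hour limit.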



7) Turnaround time

In terms of turnaround time, CastingWords has a clear edge. This test was purposely run on a weekend during Christmas week, and CastingWords reacted in record time, making the full transcript ready for pick-up less than 48 hours after submission.

The fully human solution with JC took the usual time, which, unless you request something urgent in writing, is generally between 3 and 5 days (excluding weekend days).



Summary Evaluation

Overall, there were errors from both the automated CastingWords and the fully human JG. As a matter of fact, CastingWords too is essentially a human transcription service, only with the addition of a set of automated facilities which greatly enhance the speed and transparency of the overall transcription process.

As far as the transcription itself went, JC seemed to us to have an edge over CastingWords from a writing standpoint, thanks to generally better use of punctuation and better construction of some phrases.

Both services made some annoying spelling mistakes, and both transcripts had to be checked closely for the proper use of technical terms and proper names. Neither one provided a transcript that could have been published right away as it was.



Which one is best?

That will certainly depend on what you need to do, what your time frame is, what the topic of your audio is, and how important precision and transcription quality are to you versus faster and more immediately presentable results.



Reference transcriptions

Here are the links to the two transcriptions as I received them. This is the text-only format, which you can download and check yourself to see what you would get.

CastingWords: text-only transcription of Robin Good interview with Eric Goldstein (I don't know for how long this will remain accessible online)

JC: text-only transcription of Robin Good interview with Eric Goldstein saved in RTF format to maintain original paragraph spacing provided by human transcriber.

Audio MP3 of original interview


For a fully illustrated walk-through of the CastingWords podcast transcription service please see:
Walking Through the Casting Words Store

Robin Good and Matthew Guschwan -
Readers' Comments    
2008-10-27 20:39:51

Audio Transcription

That is a big difference between the two. I would be interested to see what the difference would be between those and WeScribeIt.com. We pride ourselves on accuracy and customer service. WeScribeIt.com also does audio transcription for legal dictation, medical dictation, and financial dictation.



2007-01-20 15:50:45

Belle

Casting Words uses Amazon's Mechanical Turk workers to do its transcriptions for them.
Basically they pay their workers on average about $2 (this includes a bonus) per 9 minute podcast, less if errors are made.
The pittance they pay their outworkers enables them to deliver their service for such a low fee.
Visit www.mturk.com and you'll see lists and lists of the stuff they want transcribed.
I've just done a very technical transcription that has taken me the better part of 2 hours to complete, for a grand total of $1.27 plus a small bonus.




posted by Matthew Guschwan on Wednesday, January 11 2006, updated on Friday, March 24 2006


 

 

 

 


 

Creative Commons License
This work is licensed under a Creative Commons License.




