Digital North Carolina Blog

This blog is maintained by the staff of the North Carolina Digital Heritage Center and features highlights from the collections at DigitalNC, an online library of primary sources from institutions across North Carolina.


DigitalNC from Home: Oral History Transcription

As all of us at the Digital Heritage Center carry on our work from home, we are continuing to utilize the time outside of our regular duties to enhance DigitalNC. One such project is adding transcriptions to our collection of North Carolina Oral Histories.

Transcriptions are the written text of audio files, which are, in our case, recordings of oral histories. The oral histories on DigitalNC vary in length, ranging from two minutes to two hours and beyond. Typing out transcriptions from scratch takes time, and a lot of it. To help us out, we use the transcription software Sonix. Once an audio file has been uploaded to Sonix, the software “listens” to the recording and creates a text of what it heard.

Screenshot of a Sonix transcript without edits. The text reads "(speaker): Okay, actually, I'd rather you sit there cuz that swings squeak squeaks. I want to show you so you got to cut off. No, it's running and we'll look at that when we finished. I will okay. (speaker): Okay, do you cook collard? Yes, I do. I want you to sit there soon. It's Queen this you tell me from start to finish exactly how you cook your collards. Okay. I'll get my put my meat in the pot. Yes, but if I got a smoked meat I put that in there and then I put a little Lord in them. Then I put a little sugar and salt and red pepper."

An example of a Sonix transcript before editing. This transcript came out relatively coherent, but it still needs speaker attributions and will be checked against the audio for faithful transcription. For example, did the narrator say “Lord” or could it be “lard”?

Unfortunately, audio transcription software does not produce a faultless transcript. After Sonix creates the new text, we listen to the original audio and edit the errors. Edits include replacing or removing incorrectly heard words, adding in missed punctuation and paragraph spacing, and attributing the various speakers. We also remove speech fillers (think “um” and “er”) and note when speech is unclear with a bracketed question mark ([?]).
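The most mechanical of these edits could in principle be scripted as a first pass before a person listens to the audio. The following is only a minimal sketch, not the workflow we use: the filler list and the function names `strip_fillers` and `count_unclear` are hypothetical, and it assumes the transcript is available as plain text.

```python
import re

# Hypothetical filler list; the post names only "um" and "er" as examples.
FILLERS = ("um", "er")

# Match a filler as a whole word, plus an optional trailing comma and spaces.
# The \b word boundaries keep words like "summer" or "bakery" intact.
_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(FILLERS) + r")\b,?\s*",
    flags=re.IGNORECASE,
)


def strip_fillers(text: str) -> str:
    """Remove filler words and collapse any doubled spaces left behind."""
    cleaned = _FILLER_RE.sub("", text)
    return re.sub(r"\s{2,}", " ", cleaned).strip()


def count_unclear(text: str) -> int:
    """Count the [?] markers that flag unclear speech."""
    return text.count("[?]")


print(strip_fillers("I was um working in the er bakery"))
print(count_unclear("in [?]. Carolina Theater opened up [?], once"))
```

A pass like this only handles the rote cases. Deciding whether a word was misheard, who is speaking, or where a passage is unclear still requires listening to the original recording.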

Editing also requires consistency. Here are some of the guidelines we follow to create dependable transcripts:

  • If the speaker does not stick to formal standards of grammar throughout the conversation, we do not correct it, but non-standard contractions are written out in full (for example, goin’ becomes going).
  • If one speaker talks over another, we put their words in the order that makes the most sense in the conversation.
  • If a speaker expresses laughter, we enter that into the text using brackets ([laughs]).

This is where transcription work gets tricky. These guidelines may prompt questions during the editing process such as, How much laughter is enough to allow for [laughs]? or, What if the speaker has a regional accent that represents much of their personality and culture as expressed in the recording and I would like to point to it through non-standard contractions? There are no hard and fast answers to either question. Both rely on what the transcriptionist feels is most appropriate to faithfully represent the narrator’s story. This makes the transcription a participatory product, not just an automatic copy.

Respect for both the narrator’s speech and intent is the primary focus for a transcriptionist. In a perfect world, the interviewer would ask the narrator to look over the final document and approve its content. However, because the Digital Heritage Center obtains all of our oral histories through our partners, and because many were recorded more than 20 years ago, we are not able to consult either the interviewer or the narrator.

This leaves us to follow best practices, making sure to keep in mind our biases. Respecting the intersectionality of the narrator is an important dimension to this work. Many of the narrators in our Oral History Collection are Black and use African-American Vernacular English. Others speak with strong regional Southern dialects. As we draw up the final transcript, we have to take into consideration our own positionality and watch for editorializing and over-interpreting.

Screenshot of an edited Sonix transcription. The text reads "Mary Lewis Deans: Tell me how you heard about it. Kermit Paris: I was working in the bakery in [?]. I don't know, just before Carolina Theater opened up [?], once before I used to live right there. We was on the railroad tracks when I heard it. And, sure enough, I reckon ten or fifteen minutes after then, some artillery had come down the train, I remember that, going north. Artillery and some tanks was going down. They had guards on the flat cars, I'd seen some soldiers on the flat cars at that time. I do remember that."

An example of a Sonix transcript after editing.

Why are we transcribing oral histories? Not only does adding text to the audio make the record accessible, but researchers can now scroll through interviews for relevant information without having to listen to the entire recording. The transcripts are also included in full text searches on DigitalNC, which makes the interviews appear in many more searches than they would with just a basic description. That being said, accessibility is a first step, and we look forward to continually refining our transcripts and supplemental description work with an eye to equitability and transparency.

To take a look at all of the oral histories we have online, click here. And if you’re interested in glancing through the many oral histories with either original or newly made transcripts, click here.

