Moore County Historical Association has contributed 11 high school yearbooks for Pinehurst High School to DigitalNC, dating from 1951-1961. These are the first yearbooks for Pinehurst High School available on DigitalNC.
Catalogs and yearbooks are now online from our newest community college partner, Sandhills Community College in Pinehurst, Moore County, NC. Most community colleges had at least short runs of yearbooks produced during the 1960s and 1970s, and Sandhills has contributed yearbooks from 1968-1978. We’re also pleased to share catalogs dating from 1967, one year after classes began, through 2017.
We’ve now worked with 28 North Carolina community colleges to provide yearbooks, catalogs, photographs, and other documents related to community college history in North Carolina. Browse our contributor list or our college yearbook page for more information.
“I can see the word on the page, but when I search for it, no matches are found.”
“This item is searchable. Why can’t I read it with a screen reader?”
We get a lot of great questions like the ones above: the answer to all of them, in some way, is “OCR.”
What OCR Is
Optical Character Recognition (OCR) is amazing technology; with OCR software we are able to search image files for groups of pixels that look like text, guess what that text might be, and save the output in a way that we can feed into our search indexing systems. Even better, we’re sometimes able to overlay that text output on top of an image so that we can show you where we think a word might appear.
At the North Carolina Digital Heritage Center, we scan and store digital heritage materials as images. When we notice that an image contains printed text–documents, posters, ledgers, scrapbooks, and more–we also run it through OCR software. Without OCR, text shown in images is “locked” inside them; with OCR we can leverage the power of full text search to help people discover relevant images a little better than before.
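To make the idea concrete, here is a minimal Python sketch of how OCR output might feed a full text search index and support highlighting a word on the image. The `OcrWord` structure, the `build_index` and `search` functions, and the sample page data are all hypothetical illustrations, not NCDHC's actual software or data; real OCR engines and search systems are considerably more involved.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrWord:
    """One word guessed by an OCR engine, with its position on the page image."""
    text: str
    page: str    # hypothetical page identifier
    x: int       # left edge of the word's bounding box, in pixels
    y: int       # top edge of the bounding box
    width: int
    height: int


def normalize(token: str) -> str:
    """Lowercase and strip surrounding punctuation so 'Yearbook,' matches 'yearbook'."""
    return token.strip(".,;:!?\"'()[]").lower()


def build_index(words):
    """Map each normalized word to every place it was recognized."""
    index = defaultdict(list)
    for word in words:
        key = normalize(word.text)
        if key:
            index[key].append(word)
    return index


def search(index, query):
    """Return the OCR words matching a single-word query (exact match only)."""
    return index.get(normalize(query), [])


# Made-up OCR output, for illustration only.
sample_words = [
    OcrWord("Pinehurst", "page-001", 120, 80, 210, 40),
    OcrWord("High", "page-001", 340, 80, 90, 40),
    OcrWord("School", "page-001", 440, 80, 140, 40),
    OcrWord("Yearbook,", "page-002", 100, 60, 200, 40),
]

index = build_index(sample_words)
for hit in search(index, "yearbook"):
    # The bounding box is what would let a viewer highlight the word on the image.
    print(hit.page, (hit.x, hit.y, hit.width, hit.height))
```

Storing the position alongside each word is the design choice that matters here: the text alone is enough for searching, but the bounding box is what makes an overlay possible.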
What OCR Isn’t
No OCR method is without limitations. Whether OCR software can correctly “read” the text in an image depends on a few things:
The longer OCR takes, the better it is
The longer the OCR engine is allowed to puzzle over the pixels in an image, the better its output can be. At NCDHC we try to find the right balance between giving the OCR software enough time to produce useful results, and scanning more materials: letting OCR take too long would significantly reduce the amount of materials we’re able to add to DigitalNC each day.
OCR is less accurate with historic materials
Most of the materials we work with are difficult for OCR engines to interpret: compared with more modern materials, historic documents use fuzzier printing methods, display a lot of variation in letter forms, are deteriorating, or contain a mixture of printed and handwritten text. All of these things are likely to confuse even the best OCR software, producing text output that can differ from what’s visible on the screen.
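A small sketch shows why a word that is visible on the page can still fail to turn up in search. The word list below is invented, and the approximate matching shown with Python's standard `difflib` module is only an illustration of the general idea; it is not a description of how DigitalNC search works.

```python
import difflib

# Made-up OCR output: the engine misread the final "t" as "l".
ocr_words = ["pinehursl", "high", "school", "annual"]

query = "pinehurst"

# Exact matching finds nothing, even though a reader sees the word on the page.
exact_hits = [w for w in ocr_words if w == query]

# Approximate matching tolerates small differences between the query and the OCR text.
close_hits = difflib.get_close_matches(query, ocr_words, n=3, cutoff=0.8)

print(exact_hits)   # []
print(close_hits)   # ['pinehursl']
```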
OCR isn’t the same as a transcription
Without human intervention, it can be difficult for OCR software to interpret the layout of a document. By default, OCR software attempts to “read” an image from left to right. Even if it’s able to recognize all of the words on a page, it may not recognize the order in which the words were intended to be read; for example, the software might not be able to differentiate where one column ends and another begins in a newspaper clipping, or it might include the text of an advertisement in the middle of an article.
In contrast, transcriptions represent the text in an image as it’s meant to be read, and require some amount of human labor to produce.
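The reading-order problem described above can be sketched in a few lines of Python. The `Word` structure, both ordering functions, and the two-column "clipping" are hypothetical and greatly simplified; the point is only that the same recognized words come out differently depending on whether something knows where the column break is.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    x: int  # left edge in pixels
    y: int  # top edge in pixels (one value per text line, for simplicity)


def naive_order(words):
    """Read straight across the page: top to bottom, then left to right."""
    return [w.text for w in sorted(words, key=lambda w: (w.y, w.x))]


def column_order(words, column_boundary):
    """Read the left column fully, then the right column.

    `column_boundary` is the x position separating the two columns; here a
    person supplies it, which is the kind of layout knowledge OCR may lack.
    """
    left = [w for w in words if w.x < column_boundary]
    right = [w for w in words if w.x >= column_boundary]
    return naive_order(left) + naive_order(right)


# A made-up two-column clipping: each column is its own short sentence.
page = [
    Word("Seniors", 10, 0), Word("Store", 300, 0),
    Word("hold", 10, 20), Word("sells", 300, 20),
    Word("dance", 10, 40), Word("shoes", 300, 40),
]

print(" ".join(naive_order(page)))                        # Seniors Store hold sells dance shoes
print(" ".join(column_order(page, column_boundary=200)))  # Seniors hold dance Store sells shoes
```

Every word is recognized correctly in both cases, so a search for any single word would succeed; only the second ordering reads the way the page was meant to be read.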
Summary, and a look ahead
OCR is a fantastic tool that enhances the way users are able to interact with the images available in DigitalNC collections, but its limitations prevent it from producing full, traditionally-readable transcriptions of image materials.
Even so, NCDHC looks forward to next-generation tools and methods for recognizing and searching for text within images. OCR software is constantly improving; the software we use today is faster and more accurate than it was five years ago, and OCR technology benefits from recent advances in machine learning and artificial intelligence.
If you have questions or concerns about searchable content on DigitalNC, or would like information on obtaining a copy of materials that is accessible to screen readers, please don’t hesitate to contact us.
Graham County Public Library, one of our westernmost partners, has contributed our first Graham County yearbooks to DigitalNC. There are now 11 yearbooks from Robbinsville High School (1950-1967) available online. In addition, they provided two yearbooks from Tri-County Community College (1979-1982) in Murphy, NC (Cherokee County).
We were delighted to visit the Graham County Public Library back in June 2018, when we scanned photographs from their collection.
In addition to these yearbooks, you can take a look at our list of available high school yearbooks, organized by county.
We are excited to welcome new partner Fuquay-Varina Museums to DigitalNC. Their first batch with us is a set of 20 yearbooks from Fuquay-Varina area schools: Fuquay Springs High School, the white school, and Fuquay Consolidated High School, the African-American school for the town before integration. The schools were combined in 1969 to form Fuquay-Varina High School, which still operates today as part of the Wake County School system.
A new batch of items from The Grand Lodge of Ancient, Free and Accepted Masons of North Carolina is now available online. The recently digitized materials consist largely of minute books, account ledgers, and membership rolls from the Grand Lodge and various other Masonic lodges in North Carolina. Also included is a selection of twentieth-century scrapbooks, bylaws, historical sketches, and programs from several different lodges. The textual materials originate mainly from lodges in the Raleigh and Greensboro areas and date from the early 19th century to the 1960s.
Accompanying the textual materials are two groups of photographs, the first detailing various activities and features of the Royal White Hart Lodge No. 2 of Halifax, NC in 1911. The second group of photographs documents a ball held on April 18, 1962, which celebrated the installation of Charles Carpenter Ricker as Grand Master of Phoenix Lodge No. 2 in Raleigh, NC. A single photo, taken circa 1915, which shows a gathering of Oasis Shriners in Charlotte, NC, accompanies the two larger sets.
Forty-five additional issues of The AC Phoenix are now available thanks to our partner, N.C. A&T University. These additions, from 1990 to 2006, share more news from North Carolina’s Triad region and beyond for readers. Based in Winston-Salem, The AC Phoenix provides an invaluable resource for Triad African American communities and has been an institution in the region since Rodney Sumler founded the paper in 1983.
These issues feature local, regional, and national content with an undercurrent of local priority. They feature photo spreads from local events, news about local schools, churches, and groups, and share information about the state of the community.
Some issues include special features, or additions in honor of a specific holiday or occasion. For example, the December 2004 issue was published with a special holiday songbook.
Despite The AC Phoenix‘s local emphasis, the paper covers a significant amount of national news as well. When Congress approved the Smithsonian’s National Museum of African American History and Culture, The AC Phoenix announced the plans to its readers.
DigitalNC is glad to provide increased access to The AC Phoenix. To view these issues of the paper and more, visit its DigitalNC page here. To learn more about N.C. A&T University, visit their website here or their partner page here. To view The AC Phoenix‘s website, go here.
We love being sent, or just stumbling upon, projects on the web that utilize materials digitized through the North Carolina Digital Heritage Center. We thought since they have done such a great job highlighting us, it’d only be fair to turn around and highlight a few we’ve found recently.
The History Unfolded: US Newspapers and the Holocaust Project from the United States Holocaust Memorial Museum in Washington, D.C., is a project in which DigitalNC materials are just a small portion of a much bigger effort. According to the project’s website, it “asks students, teachers, and history buffs throughout the United States what was possible for Americans to have known about the Holocaust as it was happening and how Americans responded. Participants look in local newspapers for news and opinion about 37 different Holocaust-era events that took place in the United States and Europe, and submit articles they find to a national database, as well as information about newspapers that did not cover events.” The goal of the project is to build a crowd-sourced repository that scholars can use to better understand what those in the United States knew as the Holocaust was happening. Digitized newspapers are a key component of the project, and many of the papers we have digitized through DigitalNC have been used by participants to track knowledge of Holocaust-related events in local NC newspapers. You can view everything that is from an NC newspaper here. The earliest articles come from 1933, including an article from the Journal Patriot out of North Wilkesboro, NC that has the headline “A Dangerous Policy” regarding the Nazis’ growing policies against the Jewish people in Germany.
The latest articles date to 1945 and focus on the evolving information being uncovered about the full extent of the Holocaust once the Nazis had been beaten in World War II. As History Unfolded is a crowdsourced project, you can get involved and help the museum continue to track this information in US newspapers. To get involved yourself, visit here.
If you have a particular project or know of one that has utilized materials from DigitalNC, we’d love to hear about it! Contact us via email or in the comments below and we’ll check it out. To see past highlighted projects, visit past posts here.
A new batch of materials from Central Piedmont Community College (CPCC) is now available online. The documents, stored in vertical files at CPCC’s archives, consist of school administrative documents and yearbooks from the 1950s and 1960s. The materials document the operation and administration of the predominantly black Carver College (later renamed Mecklenburg College) and the Central Industrial Education Center before their merger to form CPCC in 1963. The batch contains the entirety of both the Carver College Collection and the Central Industrial Education Center Collection from CPCC’s archives. For detailed finding aids on either collection, please follow the links.
A new batch of newspapers from Queens University of Charlotte is now online. The batch covers a 20-year span (1931-1951) of Queens Blues, the student newspaper for Charlotte’s Queens College. An all-female liberal arts institution, Queens College began admitting male students after the Second World War and later became Queens University of Charlotte. The issues provide interesting insights into the world of young, educated women during a crucial period in American history: the Great Depression and World War II. The contents largely concern themselves with goings-on at the school itself, but touch upon wider events as well. The front page shown above, for example, illustrates how college students reacted to the Japanese attack on Pearl Harbor and America’s entry into World War II.
The newly digitized materials are an addition to the considerable amount of Queens University materials already online at DigitalNC. Visit their DigitalNC partner page here or head to the QUC Library website for more information.