Digital North Carolina Blog

This blog is maintained by the staff of the North Carolina Digital Heritage Center and features highlights from the collections at DigitalNC, an online library of primary sources from institutions across North Carolina.


What Should You Do With Your Scanned Photos? What We Suggest for Libraries, Archives, and Museums

We frequently get asked by institutions “what should I do with my scanned photos/documents?” This is a great question but not an easy one – digitization/scanning is the easy part.

What these institutions are often asking is how they should keep track of the files they created during scanning (scans) and the information about what they scanned (metadata). In addition to tracking, they’d like to know what their options are for sharing the scans and metadata with an online audience.

When you see websites like ours with extensive collections of scans paired with metadata (like in the screenshot below), there’s usually a piece of software behind them that keeps track of the scans and the metadata and then matches them up for online display. That’s what a content management system (CMS) does, if you’ve heard that term before. The benefit of using a CMS is that it makes sure the scans and their metadata remain paired over time, and often allows users to do fun things like search, sort, and filter.

Color photograph of a woman in a WWI uniform.

Screenshot of an item on DigitalNC, as presented by a content management system called TIND.

There are different types of CMSs for different types of industries. This post focuses on options for cultural heritage institutions, because CMSs made for cultural heritage institutions generally address the things we care about most. They make sure metadata is shareable, that scans can be described really well, and that you can express one-to-many relationships (think: many scans linked up to a single metadata record).
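If the idea of a one-to-many relationship is new to you, here is a minimal sketch of what it can look like under the hood. This is only an illustration: the record fields, the function name, and the file names are our own hypothetical choices, not how any particular CMS stores its data.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataRecord:
    """One description of an item (hypothetical fields, for illustration only)."""
    record_id: str
    title: str
    # Many scans can hang off this single record.
    scan_files: list[str] = field(default_factory=list)


def attach_scan(record: MetadataRecord, filename: str) -> None:
    """Link another scan to the record without repeating its metadata."""
    if filename not in record.scan_files:
        record.scan_files.append(filename)


# Hypothetical example: one yearbook record, three page scans.
yearbook = MetadataRecord(record_id="yb_1964", title="Example yearbook, 1964")
for page in ("yb_1964_0001.tif", "yb_1964_0002.tif", "yb_1964_0003.tif"):
    attach_scan(yearbook, page)

print(f"{yearbook.title}: {len(yearbook.scan_files)} scans")
```

The point is that the title and other descriptive details are written once, and every page scan points back to them.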

If your institution is considering implementing a CMS, here are the first steps we suggest.

First, Plan 

  • Decide on your goals. Do you want your scans to be available online? Or are you just looking for software that will manage your scans and metadata locally? Who will use the end product – your staff, your patrons/users, or both? Your answer will help guide where you go next.
  • Do some prep work. Like any other service your institution wants to maintain, figure out (1) how much money you have to spend both now and on an ongoing basis, (2) who will need to be involved in installation and support, and (3) what staff expertise you already have related to technology.
  • Talk to your administration and coworkers. What are their goals and needs for scanning and sharing those scans, if any? It’s a lot harder to implement a system if you don’t have the buy-in of others where you work. 
  • Be realistic. Start small and build up your capacity. We’ve never heard of someone saying “our first scanned collection was too small,” but we have heard a lot of people say “I bit off way more than we could chew.”

Options for Keeping Track of Scans Locally

If you just need to keep track of scans and metadata locally for staff use, you can do this easily with a spreadsheet and a really consistent file naming structure. The spreadsheet could include things like a title or description, maybe a physical location, any other helpful keywords or dates, and the file or folder names for the scans. Staff can search the spreadsheet for what they need, and then find the file or folder name so they can pull up the scans from storage.
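For anyone comfortable with a little scripting, the spreadsheet approach can be sketched roughly as follows. This assumes the spreadsheet has been saved as a CSV file; the column names, the sample rows, and the `find_scans` helper are all hypothetical and only meant to show the idea.

```python
import csv
import io

# Hypothetical inventory. In practice this would be a CSV file exported
# from your spreadsheet program; the column names are our own invention.
INVENTORY_CSV = """title,date,location,keywords,file_name
Main Street parade,1952,Box 3 Folder 2,parade;downtown,photo_0001.tif
High school yearbook,1964,Shelf 12,yearbook;school,yearbook_1964
Church picnic,1948,Box 3 Folder 7,church;picnic,photo_0002.tif
"""


def find_scans(csv_text, term):
    """Return the file or folder names of rows that mention the term in any column."""
    term = term.lower()
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row["file_name"]
        for row in reader
        if any(term in value.lower() for value in row.values())
    ]


print(find_scans(INVENTORY_CSV, "yearbook"))
print(find_scans(INVENTORY_CSV, "box 3"))
```

The search only works as well as the file naming does: if the `file_name` column doesn’t match what is actually in storage, staff won’t be able to pull up the scans.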

If you’d like something more sophisticated for keeping track of scans and metadata locally for staff use, there are programs that allow you to tag and describe scans that live locally. If you search for “photo management software” or “photo organizing software” online you’ll discover a number of options. We’re not terribly familiar with these; just be sure that you can export whatever you put into the software before committing.

Options for Putting Scans Online

If you decide you’d like to put your scans online, here are some choices you can consider.

A Content Management System Already in Place

Examples Include: LibGuides (screenshot below), library catalogs, museum databases

Screenshot of a public library's LibGuide site.

Screenshot of a LibGuide with extensive information about North Carolina maps.

Typically Chosen By: Institutions who already have a CMS that they can stretch to serve their needs.

The Positive Side: You may be able to start sharing your scans faster because the CMS is already adopted and paid for by your institution and familiar to staff and online users. 

Possible Challenges: LibGuides, library catalogs, and museum databases do not always follow best practices and standards for digital collections. For example, a system may not allow you to attach multiple scans to a single record, or it may not export your metadata in a structured way. In other words, you may be fitting a “square peg into a round hole.” In addition, if the features you want to use are secondary to the system’s main purpose, the vendor or developer could drop those features later.

Recommended? Depending on your resources and needs, this can be the best solution. Just be aware of the possible downsides mentioned above.

A Social Media or Photo Sharing Website

Examples Include: Facebook, Flickr (screenshot below), Tumblr

Screenshot of a yearbook cover photo on Flickr

Screenshot of an item on Flickr.

Typically Chosen By: Private individuals, small organizations with limited technical staff, institutions seeking to engage with broad communities where those communities already congregate online.

The Positive Side: These reach broad, built-in audiences. There is frequently no cost up front.

Possible Challenges: These do not adhere to best practices and standards for digital collections, which can cause a lot of work later on. Sites like these can shut down or change their terms of service with little or no warning to users. Ads are displayed near your files, and your organization has no control over them. It’s frequently impossible or extremely difficult to get your files and metadata back out of these sites.

Recommended? Not recommended as the main mechanism for managing and storing your files and metadata. These sites are best used only for outreach and engagement.

Hosting Your Content on DigitalNC.org

Typically Chosen By: Institutions of all sizes who prefer not to host their own software, possibly due to local IT limitations or as a result of strategic priorities;  institutions who would like their scans and metadata searchable alongside others from around the state.

The Positive Side: Your content reaches a broad, built-in audience. It would be searchable with similar digital collections from around North Carolina. Currently no cost to institutions.

Possible Challenges: We do the uploading and editing for you, and it takes place within a broader schedule. We’d ask you to create images and metadata that follow our standards before we could upload. (These could be positives, depending on your perspective.)

Recommended? Sure! Depending on your resources and needs this can be a great option.

A Content Management System Hosted by an External Company

Examples Include: CONTENTdm (screenshot below), hosted Islandora, ArtStor’s JSTOR Forum, Omeka.net, Past Perfect Online, or TIND (which is what we use, see screenshot at the beginning of this post)

Photograph of a man and boy with two dogs, along with metadata below it.

Screenshot of a hosted instance of CONTENTdm.

Typically Chosen By: Institutions of all sizes who prefer not to host their own software, possibly due to local IT limitations or as a result of strategic priorities.

The Positive Side: Many systems like these are built with best practices like consistency, standards, and integration with other systems. They will allow users to search your metadata, and often offer things like filtering, file downloading, and other desired user services. Your organization does not have to set up or maintain the software locally. You can establish a brand and dedicated site for your digital collections.

Possible Challenges: They require staff with specialized training in the system, and the ability to pay a vendor both initially and on an ongoing basis. You’re limited to the services or features the vendor chooses to offer.

Recommended? Sure! Depending on your resources this can be a great option.

Hosting Your Own Content Management System

Examples Include: Self-hosted Islandora, Omeka (screenshot below), Samvera, Collective Access

Screenshot of a colorful campus map along with metadata.

Screenshot of a self-hosted instance of Omeka.

Typically Chosen By: Institutions with programmers on staff, dedicated IT support, and collections that require a lot of customization.

The Positive Side: Like the hosted systems above, these are also often built with best practices like standards and interoperability. They will allow users to search your metadata, and often offer things like filtering, image downloading, and other user services. When you host your own system you can frequently customize more features.

Possible Challenges: They require staff with specialized training, and a robust and flexible IT support infrastructure. They’re more time intensive and costly to maintain.

Recommended? Sure! Depending on your resources and needs this can be a great option.

Final Thoughts

In the end, there isn’t much that’s an “always wrong” choice. There are only choices that have different consequences down the road. We encourage people to choose the systems that adhere to digital collections best practices, because those best practices come from people who’ve made choices they regretted. What matters most is choosing a solution that meets your needs and fits the resources you have now and those you anticipate having in the future. Above all, always be sure that your scans and metadata are backed up and can be extracted from the system you choose!

Did we miss anything? Leave us a comment below.

If you’re considering one or more of these and have questions, get in touch. We’re happy to give you advice for what to ask a vendor or point you to similar institutions who may have already adopted what you’re considering.


Pinehurst High School Yearbooks from Moore County Now Online

High schoolers sprawled out and collapsed around a chair, with the caption "gooney club"

Moore County Historical Association has contributed 11 high school yearbooks for Pinehurst High School to DigitalNC, dating from 1951-1961. These are the first yearbooks for Pinehurst High School available on DigitalNC.

You can also browse other yearbooks from Moore County, or take a look at our list of available high school yearbooks, organized by county.


Catalogs and Yearbooks Added from Sandhills Community College in Pinehurst

Two page yearbook spread with headshots of students and their names

Pages 34-35 of the Sandhills ’78 yearbook.

Catalogs and yearbooks are now online from our newest community college partner, Sandhills Community College in Pinehurst, Moore County, NC. Most community colleges had at least short runs of yearbooks produced during the 1960s and 1970s, and Sandhills has contributed yearbooks from 1968-1978. We’re also pleased to share catalogs dating from 1967, one year after classes began, through 2017.

We’ve now worked with 28 North Carolina community colleges to provide yearbooks, catalogs, photographs, and other documents related to community college history in North Carolina. Browse our contributor list or our college yearbook page for more information.


Digital Collections OCR: What it is, and what it isn’t.

“I can see the word on the page, but when I search for it, no matches are found.”

“This item is searchable. Why can’t I read it with a screen reader?”

We get a lot of great questions like the ones above: the answer to all of them, in some way, is “OCR.”

What OCR Is

Optical Character Recognition (OCR) is amazing technology; with OCR software we are able to search image files for groups of pixels that look like text, guess what that text might be, and save the output in a way that we can feed into our search indexing systems. Even better, we’re sometimes able to overlay that text output on top of an image so that we can show you where we think a word might appear.

At the North Carolina Digital Heritage Center, we scan and store digital heritage materials as images. When we notice that an image contains printed text–documents, posters, ledgers, scrapbooks, and more–we also run it through OCR software. Without OCR, text shown in images is “locked” inside them; with OCR we can leverage the power of full text search to help people discover relevant images a little better than before.

What OCR Isn’t

No OCR method is without limitations. Whether OCR software can correctly “read” the text in an image depends on a few things:

The longer OCR takes, the better it is

The longer the OCR engine is allowed to puzzle over the pixels in an image, the better its output can be. At NCDHC we try to find the right balance between giving the OCR software enough time to produce useful results, and scanning more materials: letting OCR take too long would significantly reduce the amount of materials we’re able to add to DigitalNC each day.

OCR is less accurate with historic materials

Most of the materials we work with are difficult for OCR engines to interpret: compared with more modern materials, historic documents use fuzzier printing methods, display a lot of variation in letter forms, are deteriorating, or contain a mixture of printed and handwritten text.  All of these things are likely to confuse even the best OCR software, producing text output that can differ from what’s visible on the screen.

OCR isn’t the same as a transcription

Without human intervention, it can be difficult for OCR software to interpret the layout of a document. By default, OCR software attempts to “read” an image from left to right. Even if it’s able to recognize all of the words on a page, it may not recognize the order in which the words were intended to be read; for example, the software might not be able to differentiate where one column ends and another begins in a newspaper clipping, or it might include the text of an advertisement in the middle of an article:

Example of OCR text challenges

In contrast, transcriptions represent the text in an image as it’s meant to be read, and require some amount of human labor to produce.
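To make the reading-order problem concrete, here is a toy sketch. It is not how a real OCR engine works; the two-column “page,” its text, and the function names are all made up for illustration. It only shows why reading straight across a page can mix an advertisement into the middle of an article.

```python
# A toy "page" with two newspaper columns. Each row holds the text found
# in the left column and the right column at the same height on the page.
page_rows = [
    ("The town council", "SALE! New shoes"),
    ("met on Tuesday to", "half price this"),
    ("discuss the budget.", "week only."),
]


def read_left_to_right(rows):
    """Naive order: read straight across each row, ignoring the columns."""
    return " ".join(cell for row in rows for cell in row)


def read_by_column(rows):
    """Column-aware order: finish the left column before starting the right."""
    columns = zip(*rows)
    return " ".join(cell for column in columns for cell in column)


print(read_left_to_right(page_rows))
print(read_by_column(page_rows))
```

Every word is recognized correctly in both cases. Only the second ordering gives you sentences a person would actually want to read, and knowing where the columns are is the part that often takes a human.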

Summary, and a look ahead

OCR is a fantastic tool that enhances the way users are able to interact with the images available in DigitalNC collections, but its limitations prevent it from producing full, traditionally-readable transcriptions of image materials.

Even so, NCDHC looks forward to next-generation tools and methods for recognizing and searching for text within images. OCR software is constantly improving; the software we use today is faster and more accurate than it was five years ago, and OCR technology benefits from recent advances in machine learning and artificial intelligence.

If you have questions or concerns about searchable content on DigitalNC, or would like information on obtaining a copy of materials that is accessible to screen readers, please don’t hesitate to contact us.

 


DigitalNC’s First High School Yearbooks from Graham County Now Online

two photographs of students on a bike, and a student riding a horse

Senior superlatives from the 1964 The Robin yearbook.

Graham County Public Library, one of our westernmost partners, has contributed our first Graham County yearbooks to DigitalNC. There are now 11 yearbooks from Robbinsville High School (1950-1967) available online. In addition, they provided two from Tri-County Community College (1979-1982) in Murphy, NC (Cherokee County).

We were delighted to visit the Graham County Public Library back in June 2018, when we scanned photographs from their collection.

In addition to these yearbooks, you can take a look at our list of available high school yearbooks, organized by county. 


New partner, Fuquay-Varina Museums, adds 20 yearbooks in first batch

We are excited to welcome new partner Fuquay-Varina Museums to DigitalNC. Their first batch with us is a set of 20 yearbooks from two Fuquay-Varina area schools: Fuquay Springs High School, the white school, and Fuquay Consolidated High School, the African-American school for the town before integration. The schools were combined in 1969 to form Fuquay-Varina High School, which still operates today as part of the Wake County School system.

Photographs from Fuquay Consolidated Prom

Prom photographs from the 1953 Fuquay Consolidated yearbook

Cover of the Fuquay Springs High School yearbook showing women standing outside the school

Cover of the 1959 Fuquay Springs High School yearbook

 

To see more yearbooks digitized on DigitalNC, visit here.  And to learn more about our partner, visit their website here and their partner page here.


New items from the Grand Lodge of North Carolina now online at DigitalNC

Chorazin Chapter Royal Arch Masons 1914

A page from the Book of Marks of the Chorazin Chapter no. 13 of the Royal Arch Masons of Greensboro, NC, 1914

A new batch of items from The Grand Lodge of Ancient, Free and Accepted Masons of North Carolina is now available online. The recently digitized materials consist largely of minute books, account ledgers, and membership rolls from the Grand Lodge and various other Masonic lodges in North Carolina. Also included is a selection of twentieth-century scrapbooks, bylaws, historical sketches, and programs from several different lodges. The textual materials originate mainly from lodges in the Raleigh and Greensboro areas and date from the early 19th century to the 1960s.

 

Colonial Masters Royal White Hart Lodge

Officers of the Order of Colonial Masters at the Royal White Hart Lodge no. 2, 1911

Accompanying the textual materials are two groups of photographs, the first detailing various activities and features of the Royal White Hart Lodge No. 2 of Halifax, NC in 1911. The second group of photographs documents a ball held on April 18, 1962, which celebrated the installation of Charles Carpenter Ricker as Grand Master of Phoenix Lodge No. 2 in Raleigh, NC. A single photo, taken circa 1915, showing a gathering of Oasis Shriners in Charlotte, NC, accompanies the two larger sets.

To see more materials from The Grand Lodge of Ancient, Free and Accepted Masons of North Carolina, visit their partner page or take a look at their website.


More issues of The AC Phoenix are available on DigitalNC

From the cover of The AC Phoenix, December 1991 issue

Advertisement from the March 2006 issue

Forty-five additional issues of The AC Phoenix are now available thanks to our partner, N.C. A&T University. These additions, from 1990 to 2006, share more news from North Carolina’s Triad region and beyond for readers. Based in Winston-Salem, The AC Phoenix provides an invaluable resource for Triad African American communities and has been an institution in the region since Rodney Sumler founded the paper in 1983.

These issues feature local, regional, and national content with an undercurrent of local priority. They feature photo spreads from local events, news about local schools, churches, and groups, and share information about the state of the community.

Some issues include special features, or additions in honor of a specific holiday or occasion. For example, the December 2004 issue was published with a special holiday songbook, shown below:

Community Holiday Songbook 2004, from the December 2004 issue

“National Black History Museum Approved,” from the January 2004 issue

Despite The AC Phoenix’s local emphasis, the paper covers a significant amount of national news as well. When Congress approved the Smithsonian’s National Museum of African American History and Culture, The AC Phoenix announced the plans to its readers.

DigitalNC is glad to provide increased access to The AC Phoenix. To view these issues of the paper and more, visit its DigitalNC page here. To learn more about N.C. A&T University, visit their website here or their partner page here. To view The AC Phoenix’s website, go here.


How DigitalNC materials are being used across the web: History Unfolded Project at the US Holocaust Museum

We love being sent, or just stumbling upon, projects on the web that use materials digitized through the North Carolina Digital Heritage Center. Since they have done such a great job highlighting us, we thought it’d only be fair to turn around and highlight a few we’ve found recently.

History Unfolded events page

The museum has selected various events from 1933-1945 for participants to research in newspaper articles.

The History Unfolded: US Newspapers and the Holocaust Project from the United States Holocaust Memorial Museum in Washington, D.C. is a project in which DigitalNC materials are just a small portion of a much bigger effort. According to the project’s website, it “asks students, teachers, and history buffs throughout the United States what was possible for Americans to have known about the Holocaust as it was happening and how Americans responded. Participants look in local newspapers for news and opinion about 37 different Holocaust-era events that took place in the United States and Europe, and submit articles they find to a national database, as well as information about newspapers that did not cover events.” The goal of the project is to build a crowd-sourced repository that scholars can use to better understand what those in the United States knew as the Holocaust was happening. Digitized newspapers are a key component of the project, and participants have used many of the papers we have digitized through DigitalNC to track knowledge of Holocaust-related events in local NC newspapers. You can view everything that is from an NC newspaper here. The earliest articles come from 1933, including an article from the Journal Patriot out of North Wilkesboro, NC that has the headline “A Dangerous Policy” regarding the Nazis’ growing policies against the Jewish people in Germany.

screenshot of the History Unfolded Project

Article page on the History Unfolded project site showing an article from The Journal Patriot in 1933

The latest articles date to 1945 and focus on the evolving information being uncovered about the full extent of the Holocaust once the Nazis had been beaten in World War II. As History Unfolded is a crowdsourced project, you can get involved and help the museum continue to track this information in US newspapers. To get involved yourself, visit here.

If you have a particular project or know of one that has used materials from DigitalNC, we’d love to hear about it! Contact us via email or in the comments below and we’ll check it out. To see past highlighted projects, visit past posts here.


New school records from Central Piedmont Community College now online at DigitalNC

 

Mecklenburg College

Mecklenburg College master plan, circa 1961-1963

A new batch of materials from Central Piedmont Community College (CPCC) is now available online. The documents, stored in vertical files at CPCC’s archives, consist of school administrative documents and yearbooks from the 1950s and 1960s. The materials document the operation and administration of the predominantly black Carver College (later renamed Mecklenburg College) and the Central Industrial Education Center before their merger to form CPCC in 1963. The batch contains the entirety of both the Carver College Collection and the Central Industrial Education Center Collection from CPCC’s archives. For detailed finding aids on either collection, please follow the links.

The new materials are an addition to the CPCC memorabilia and yearbooks already hosted online at DigitalNC. Please visit their partner page or website for more information.