Digital North Carolina Blog

This blog is maintained by the staff of the North Carolina Digital Heritage Center and features highlights from the collections at DigitalNC, an online library of primary sources from institutions across North Carolina.

Viewing entries tagged "howto"

What Should You Do With Your Scanned Photos? What We Suggest for Libraries, Archives, and Museums

We frequently get asked by institutions “what should I do with my scanned photos/documents?” This is a great question but not an easy one – digitization/scanning is the easy part.

What these institutions are often asking is how they should keep track of the files they created during scanning (scans) and the information about what they scanned (metadata). In addition to tracking, they’d like to know what their options are for sharing the scans and metadata with an online audience.

When you see websites like ours with extensive collections of scans paired with metadata (like in the screenshot below), there’s usually a piece of software behind it that keeps track of the scans and the metadata and then matches them up for online display. That’s what a content management system (CMS) does, if you’ve heard that term before. The benefit of using a CMS is that it makes sure the scans and their metadata remain paired over time, and often allows users to do fun things like search, sort, and filter.

Color photograph of a woman in a WWI uniform.

Screenshot of an item on DigitalNC, as presented by a content management system called TIND.

There are different types of CMSs for different types of industries. This post focuses on options for cultural heritage institutions, because CMSs made for cultural heritage institutions generally address the things we care about most. They make sure metadata is shareable, that scans can be described really well, and that you can express one-to-many relationships (think: many scans linked up to a single metadata record).
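
As a rough illustration of that one-to-many idea, here is a minimal Python sketch. The field names (identifier, title, scan_files) and the file names are made up for the example; they do not reflect how TIND or any other CMS actually stores its records.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataRecord:
    """One description of an item, linked to every scan that belongs to it."""
    identifier: str  # hypothetical local ID
    title: str       # hypothetical title
    scan_files: list[str] = field(default_factory=list)


# A four-page letter: one metadata record, many scans.
letter = MetadataRecord(identifier="letter_0001", title="Example letter")
for page in range(1, 5):
    letter.scan_files.append(f"letter_0001_p{page:03d}.tif")

print(letter.title, "->", len(letter.scan_files), "scans")
```

A system that only allows one file per record would force you to either split the letter into four separate records or merge the pages into a single file.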

If your institution is considering implementing a CMS, here are the first steps we suggest.

First, Plan 

  • Decide on your goals. Do you want your scans to be available online? Or are you just looking for software that will manage your scans and metadata locally? Who will use the end product – your staff, your patrons/users, or both? Your answer will help guide where you go next.
  • Do some prep work. Like any other service your institution wants to maintain, figure out (1) how much money you have to spend both now and on an ongoing basis, (2) who will need to be involved in installation and support, and (3) what staff expertise you already have related to technology.
  • Talk to your administration and coworkers. What are their goals and needs for scanning and sharing those scans, if any? It’s a lot harder to implement a system if you don’t have the buy-in of others where you work. 
  • Be realistic. Start small and build up your capacity. We’ve never heard of someone saying “our first scanned collection was too small,” but we have heard a lot of people say “I bit off way more than we could chew.”

Options for Keeping Track of Scans Locally

If you just need to keep track of scans and metadata locally for staff use, you can do this easily with a spreadsheet and a really consistent file naming structure. The spreadsheet could include things like a title or description, maybe a physical location, any other helpful keywords or dates, and the file or folder names for the scans. Staff can search the spreadsheet for what they need, and then find the file or folder name so they can pull up the scans from storage.
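
For anyone comfortable with a little scripting, the spreadsheet approach can be sketched in a few lines of Python. Everything here is hypothetical: the column names, the sample rows, and the folder names are invented for the example, and in practice the text would come from a CSV file exported from your own spreadsheet.

```python
import csv
import io

# Hypothetical inventory; normally this would be a .csv export of your spreadsheet.
INVENTORY_CSV = """title,location,keywords,date,folder
Main Street parade,Box 3,parade;downtown,1952,photos_0001
High school graduation,Box 3,school;graduation,1961,photos_0002
Downtown flood,Box 7,flood;downtown,1940,photos_0003
"""


def find_scans(csv_text: str, term: str) -> list[str]:
    """Return the folder names of every row that mentions the search term."""
    term = term.lower()
    matches = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Search across all columns, ignoring upper/lower case.
        searchable = " ".join(row.values()).lower()
        if term in searchable:
            matches.append(row["folder"])
    return matches


print(find_scans(INVENTORY_CSV, "downtown"))
```

The consistent folder names are what make this work: once staff find the row, the folder column tells them exactly where to look in storage.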

If you’d like something more sophisticated for keeping track of scans and metadata locally for staff use, there are programs that allow you to tag and describe scans that live locally. If you search for “photo management software” or “photo organizing software” online you’ll discover a number of options. We’re not terribly familiar with these; just be sure that you can export whatever you put into the software before committing.

Options for Putting Scans Online

If you decide you’d like to put your scans online, here are some choices you can consider.

A Content Management System Already in Place

Examples Include: LibGuides (screenshot below), library catalogs, museum databases

Screenshot of a public library's LibGuide site.

Screenshot of a LibGuide with extensive information about North Carolina maps.

Typically Chosen By: Institutions who already have a CMS that they can stretch to serve their needs.

The Positive Side: You may be able to start sharing your scans faster because the CMS is already adopted and paid for by your institution and familiar to staff and online users. 

Possible Challenges: LibGuides, library catalogs, and museum databases do not always follow best practices and standards for digital collections. For example, they may not allow you to attach multiple scans to a single record, or they may not export your metadata in a structured way. In other words, you may be fitting a “square peg into a round hole.” In addition, if the features you want to use are secondary to the system’s main purpose, the vendor or developer could drop those features later.

Recommended? Depending on your resources and needs this can be the best solution. Just be aware of the possible downsides mentioned above.

A Social Media or Photo Sharing Website

Examples Include: Facebook, Flickr (screenshot below), Tumblr

Screenshot of a yearbook cover photo on Flickr

Screenshot of an item on Flickr.

Typically Chosen By: Private individuals, small organizations with limited technical staff, institutions seeking to engage with broad communities where those communities already congregate online.

The Positive Side: These reach broad, built-in audiences. There is frequently no cost up front.

Possible Challenges: These do not adhere to best practices and standards for digital collections, which can cause a lot of work later on. Sites like these can shut down or change their terms of service with little or no warning to users. Ads are displayed near your files, and your organization has no control over them. It’s frequently impossible or extremely difficult to get your files and metadata back out of these sites.

Recommended? Not recommended as the main mechanism for managing and storing your files and metadata. These sites are best used only for outreach and engagement.

Hosting your Content on

Typically Chosen By: Institutions of all sizes who prefer not to host their own software, possibly due to local IT limitations or as a result of strategic priorities;  institutions who would like their scans and metadata searchable alongside others from around the state.

The Positive Side: Your content reaches a broad, built-in audience. It would be searchable with similar digital collections from around North Carolina. Currently no cost to institutions.

Possible Challenges: We do the uploading and editing for you, and it takes place within a broader schedule. We’d ask you to create images and metadata that follow our standards before we could upload. (These could be positives, depending on your perspective.)

Recommended? Sure! Depending on your resources and needs this can be a great option.

A Content Management System Hosted by an External Company

Examples Include: CONTENTdm (screenshot below), hosted Islandora, ArtStor’s JSTOR Forum, Past Perfect Online, or TIND (which is what we use, see screenshot at the beginning of this post)

Photograph of a man and boy with two dogs, along with metadata below it.

Screenshot of a hosted instance of CONTENTdm.

Typically Chosen By: Institutions of all sizes who prefer not to host their own software, possibly due to local IT limitations or as a result of strategic priorities.

The Positive Side: Many systems like these are built with best practices like consistency, standards, and integration with other systems. They will allow users to search your metadata, and often offer things like filtering, file downloading, and other desired user services. Your organization does not have to set up or maintain the software locally. You can establish a brand and dedicated site for your digital collections.

Possible Challenges: They require staff with specialized training in the system, and the ability to pay a vendor both initially and on an ongoing basis. You’re limited to the services or features the vendor chooses to offer.

Recommended? Sure! Depending on your resources this can be a great option.

Hosting Your own Content Management System

Examples Include: Self-hosted Islandora, Omeka (screenshot below), Samvera, Collective Access

Screenshot of a colorful campus map along with metadata.

Screenshot of a self-hosted instance of Omeka.

Typically Chosen By: Institutions with programmers on staff, dedicated IT support, and collections that require a lot of customization.

The Positive Side: Like the hosted systems above, these are also often built with best practices like standards and interoperability. They will allow users to search your metadata, and often offer things like filtering, image downloading, and other user services. When you host your own system you can frequently customize more features.

Possible Challenges: They require staff with specialized training, and a robust and flexible IT support infrastructure. They’re more time intensive and costly to maintain.

Recommended? Sure! Depending on your resources and needs this can be a great option.

Final Thoughts

In the end, there isn’t much that’s an “always wrong” choice. There are only choices that have different consequences down the road. We encourage people to choose the systems that adhere to digital collections best practices, because those best practices come from people who’ve made choices they regretted. In the end, it’s most important to choose a solution that meets your needs and fits the resources you have now and those you anticipate having in the future. Above all, always be sure that your scans and metadata are backed up and can be extracted from the system you choose!

Did we miss anything? Leave us a comment below.

If you’re considering one or more of these and have questions, get in touch. We’re happy to give you advice for what to ask a vendor or point you to similar institutions who may have already adopted what you’re considering.

Military and Veterans History on DigitalNC: Best Ways to Search

Group of Soldiers Posed with Firestone Officials, from the Gaston Museum of Art & History.

This Veterans Day, we thought we’d mention some best bets for finding and searching materials on DigitalNC related to military history. Some time periods and subjects have better representation than others, so we’ve focused on the five wars that have the most related materials.

Tip 1: Search by Subject

To isolate materials that are predominantly about a particular war, you can use the subject specific links listed below.

After you click on one of the links above, if you’d like to search within the results, type your search term in the search box at the top of the page, leaving “within results” selected (see screenshot at right).

You can also do a full text search that combines (1) your research interest (perhaps a name, a topic, or an event) with (2) the name of a particular war. This may yield a lot more results, depending on your research interest, but it could also zero in on your target faster. Here’s a link to an example that you can adapt for your own use.

Only interested in photographs? Try this search, which is limited to photos that contain the word “military” or “soldiers” as a subject.

Tip 2: Search by Date Range

Another tactic is to search or browse items that were created during a particular war. These don’t always have that war as a subject term, but they often deal with wartime issues or society regardless. We’ve listed date specific links here:

A list of alumni and students killed or missing in action, from the 1944 UNC-Chapel Hill Yackety Yack yearbook, page 12.

Keep in mind that doing a full text search will be ineffective about 98% of the time when it comes to handwritten items on our site, as most do not have transcripts. If you believe handwritten items pulled up in one of the searches above may contain information you’re interested in, you may need to read through them yourself.

Our partners have shared a lot of yearbooks on DigitalNC and, while they may not be the first thing that comes to mind for military history, many colleges and universities recognized students who served. Especially for the Vietnam, Korean, Gulf, and Afghan wars, yearbooks document campus reactions and protests. You currently can’t search across all of the yearbooks available on DigitalNC. However, if you’d like to browse through yearbooks published during a particular war, you can use this example link and just adjust the dates as needed. Currently, our site has high school yearbooks published up through the late 1960s, and college and university yearbooks and campus publications through 2015.

Tip 3: Newspapers!

Searching the student and community newspapers on DigitalNC can yield biographical information about soldiers, editorials expressing local opinions about America’s military action, as well as news and advertisements related to rationing and resources on the homefront.

The Newspapers Advanced Search is your friend here! You can target papers published during specific years. You can also narrow your search to specific newspaper titles.


Screenshot of the Newspapers Advanced Search page, with the search phrase “Red Cross” and limiting the results to papers published from 1914-1918.


We also wanted to call your attention to a couple of newspaper titles on DigitalNC that were published exclusively for service members or during one of these wars:

  • The Caduceus, published by the Base Hospital at Camp Greene (Charlotte, NC), 1918-1919
  • Cloudbuster, published at UNC-Chapel Hill to share news about the Navy pre-flight school held on campus, 1942-1945
  • The Home Front News, published by the Tarboro Rotary Club for servicemen from their city, 1943-1945
  • Hot Off the Hoover Rail, published by the community of Lawndale for servicemen from their city, 1942-1945
  • Trench and Camp, published by The Charlotte Observer for Camp Greene, 1917-1918

Bonus Resource: Wilson County’s Greatest Generation

One of the largest exhibits on our site is Wilson County’s Greatest Generation, an effort by the Wilson County Historical Association to document the service men and women of Wilson County, North Carolina who served in World War II. Documentation is organized by individual, and includes personal histories, photos, clippings, and other ephemera.

We hope this information can guide you through researching military history on DigitalNC. If you have any of your own tips or questions, please let us know by commenting below or contacting us.

Have Scans, Will Travel? Hosting Your Scans at DigitalNC

Moving Truck Transferring Family Possessions, from the Gaston County Museum of Art & History

The Digital Heritage Center does a lot of scanning on some really versatile machines. It’s one of the practical sides to our mission, and we take pride in being able to provide that service.

What is perhaps less well known is that we also help cultural heritage institutions publish items they’ve scanned themselves. Many cultural heritage institutions have flatbed or book scanners as well as willing staff and volunteers, but lack the technical infrastructure to host those scans for the public.

We’ve helped institutions …

  • who needed to migrate from ailing databases or systems they can no longer support,
  • who wanted to be able to full-text search their materials, a function they couldn’t fulfill through their current website,
  • who offered their digital files to on-site users, but who were seeking a broader audience.

When we start this conversation, here are some of the questions we ask:

  • Tell us about the original physical objects* – does your institution still have them? are there any rights or privacy concerns to sharing these online? what kind of subject matter is represented?
  • Tell us about the digital files – who originally created them? how many are there? where do they live? what file types? how are they organized? is this an ongoing project? do you have any metadata already?

If the files are a good fit for DigitalNC, they get transferred to hard drives, metadata is created or amended, and items appear on the site alongside the scans we create here at the Center. If you work at a cultural heritage institution eligible to work with the Center, have or are currently creating scans, and are interested in adding these to DigitalNC, contact us. We may be able to give them a home.

* If there were any. We can help with born-digital items as well.

Suggestions for Viewing Scrapbooks on DigitalNC

Even for those of us who work at the Digital Heritage Center, browsing scrapbooks or other printed items on DigitalNC can be frustrating. The viewer for a single item, which displays yearbooks, photographs, and short booklets pretty well, can be cumbersome for longer and larger items. Here are a few features that may not be immediately apparent but that we hope might help.

This is a screenshot of the viewer, showing the page of a scrapbook.

Item page in CONTENTdm

By default, about one third of the scrapbook page is showing (your screen may vary from mine). To the right, only a few thumbnails are visible at any one time. To move back and forth between pages, you’ll need to scroll through and click on each thumbnail one by one. If you want to see the full text for items, you have to toggle back and forth between tabs. So, what are your options?

Try Making the Scrapbook View Larger and Switching to “Content”

If you drag down the little toggle arrows at the bottom of the viewer, you’ll have more control over how much of the page is visible on your screen. You can also switch from “Thumbnails” to “Content” in the right-hand ribbon. This means more page links are visible at once, so you have to scroll less when moving from page to page.

Manipulate main CONTENTdm interface

Try “Page Flip View” for a Quick Browse

The second tip is to try Page Flip View. The button for Page Flip View is located over the top of the page image:

Page Flip View Button

We use this option if we want to browse an item fast. Sometimes the image quality isn’t that good (I won’t go into why here). However, Page Flip View can be helpful if you want to get a quick sense of what’s inside a scrapbook, or if you’re looking for something in particular. Here’s what Page Flip View looks like on my screen:

Page Flip View

To move back and forth, just click on the page you’d like to turn.

Try “View PDF & Text” for a Better Layout

A favorite way to view scrapbooks and similar items is to click the View PDF & Text button, located right next to the Page Flip View button. View PDF & Text brings up an alternative view that takes advantage of a lot more screen real estate. See below.

Viewing PDF image and text

With this view, you’re able to see more of each page. A lot more thumbnails are stretched out across the bottom of the screen, so you’ll scroll less. Full text (if it’s present) comes up on the left hand side with each page. If you’ve searched for text, as above, and there are hits on the page, you’ll see the highlight right away instead of having to switch back and forth between tabs. You can hide the full text by using the button in the upper left, if you’d like even more of the main image to show.

We hope these tips are helpful. If you have any questions about the interface or what we’ve mentioned, let us know.

North Carolina Newspaper Digitization Part 3: This is How We Do It

Greensboro Daily News Ad, March 2, 1934

Like Jeopardy!, I want to tell you the answer before I get to the question.

Following a newspaper digitization and markup standard helps us plan for the future and makes it easier for us to work with vendors, open-source software, and other libraries and archives.

I say this up front, because when we explain how we digitize and share newspapers the frequent response is to ask why we do it the way we do. I think this is because our process is more labor intensive than people expect. It’s definitely not the only way, but we’re committed to this path for right now because it accommodates multiple formats (microfilm, print, born-digital), fits our current digitization capacity, and results in a system we think is flexible and extensible.

That standard I mentioned above comes out of the Library of Congress’ National Digital Newspaper Program (NDNP). All of our newspaper work is NDNP compliant, which means we follow that project’s recommendations for how to structure files, the type of metadata to assign to those files, and also the markup language that tells the computer where words are situated on each page (very helpful for full-text search).

I’ll give you a broad outline of our workflow and the tools we use. However, if you want more specific technical details, head over to our account on GitHub.

Screenshot of PaperBoy!

Let’s say one of our partners is interested in having us digitize a print newspaper. We’ll start by scanning each page separately on whichever machine works for the paper’s size. Because the NDNP standard requires page-level metadata, we’ve created a lightweight piece of software that helps us take care of some of that while we scan. Affectionately dubbed “PaperBoy,” this program allows the scanning technician to track page number, date, volume, issue, and edition for each shot. While it slows down scanning a little bit, it speeds up post-processing metadata work quite a lot.

Once the scanning’s complete, we process the files to create derivatives that serve different needs. We use ABBYY Recognition Server to get those multiple formats:

  1. a JPEG2000 image that’s excellent quality yet small in file size
  2. an XML file that includes computer-recognized text from the image along with coordinates that indicate the location of each word on that image
  3. a .pdf file that includes both the image and searchable text.
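
To give a sense of why the second file matters for search highlighting, here is a small Python sketch that looks up a word and returns its position on the page. The XML below is a simplified, made-up fragment loosely modeled on the ALTO format that NDNP uses (real files carry namespaces and much more structure), and find_word is a hypothetical helper written for this example, not part of any tool mentioned in this post.

```python
import xml.etree.ElementTree as ET

# Simplified, invented fragment: one line of text with two recognized words.
SAMPLE_XML = """
<Page>
  <TextLine>
    <String CONTENT="Red" HPOS="120" VPOS="340" WIDTH="60" HEIGHT="22"/>
    <String CONTENT="Cross" HPOS="190" VPOS="340" WIDTH="95" HEIGHT="22"/>
  </TextLine>
</Page>
"""


def find_word(xml_text: str, word: str) -> list[tuple[int, int, int, int]]:
    """Return (left, top, width, height) for each occurrence of a word."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for element in root.iter("String"):
        if element.get("CONTENT", "").lower() == word.lower():
            boxes.append(
                (
                    int(element.get("HPOS")),
                    int(element.get("VPOS")),
                    int(element.get("WIDTH")),
                    int(element.get("HEIGHT")),
                )
            )
    return boxes


print(find_word(SAMPLE_XML, "cross"))
```

A viewer can draw a highlight rectangle over the page image using exactly these four numbers.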

Now that we have the derivatives, we begin filling out a spreadsheet with page-level metadata. We first add the metadata created using PaperBoy and then we run through the scans page by page, correcting any mistakes found in the PaperBoy output and adding additional metadata. This also helps us quality control the scans and gives us a chance to find skipped pages.

How much metadata do we do? You can download a sample batch spreadsheet from GitHub, if you’re interested in the specifics, but it includes the PaperBoy output as well as fields like Title, our name (Digital Heritage Center) as batch-creators, and information about the print paper’s physical location. A lot of those fields stay the same across numerous scans or can be programmatically populated with a spreadsheet formula, to help make things go faster.

Once we have the spreadsheet and scans complete, scripts developed by our programmer (also available on GitHub) use those spreadsheets to figure out how to rearrange the files and metadata into packages structured just the way the NDNP standard likes them. The script breaks out each newspaper issue’s files into their own file folder, renaming and reorganizing the pages (if needed). The script also creates issue-level XML files, which tag along inside each folder. These XML files describe the issue and its relation to the batch, and include some administrative metadata about who created the files, etc.
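
The general idea of grouping pages into per-issue folders can be sketched as follows. This is not the Center’s actual script (that lives on GitHub); the row fields, the file names, and the folder and page naming patterns are all assumptions made for this illustration. The sketch only plans the moves and touches no files.

```python
from collections import defaultdict

# Hypothetical rows, as they might come out of a page-level spreadsheet.
PAGES = [
    {"file": "scan_0002.jp2", "date": "1934-03-02", "edition": 1, "page": 2},
    {"file": "scan_0001.jp2", "date": "1934-03-02", "edition": 1, "page": 1},
    {"file": "scan_0003.jp2", "date": "1934-03-09", "edition": 1, "page": 1},
]


def plan_issue_folders(rows):
    """Map each issue folder to (old name, new name) pairs in page order."""
    issues = defaultdict(list)
    for row in rows:
        # Assumed folder name: the date digits followed by a two-digit edition.
        folder = row["date"].replace("-", "") + f"{row['edition']:02d}"
        issues[folder].append(row)

    plan = {}
    for folder, pages in issues.items():
        ordered = sorted(pages, key=lambda r: r["page"])
        # Assumed page name: zero-padded page number.
        plan[folder] = [(p["file"], f"{p['page']:04d}.jp2") for p in ordered]
    return plan


for folder, moves in plan_issue_folders(PAGES).items():
    print(folder, moves)
```

Planning the renames separately from carrying them out makes it easy to review the result against the spreadsheet before any file is moved.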

Newspaper files before processing (left) and after (right).

The final steps are to load our NDNP-compliant batches into the software we use to present it online, and to quality control the metadata and scans.

If you think about it, newspapers have a helpfully consistent structure: date-driven volumes, issues, and editions. But there isn’t much else in the digital library world quite like them, so more common content management systems can leave something to be desired for both searching and viewing newspapers. Because of this, and because there’s just so MUCH newspaper content, we use a standalone system for our newspapers: the Library of Congress’ open source newspaper viewer, ChronAm. It gets its name from the NDNP’s online presence, the Chronicling America website, which runs on the same software.

While not perfect, this viewer does really well exploiting newspaper structure. It also allows you to zoom in and out while you skim and read, and it highlights your search terms (courtesy of those XML files created by ABBYY). Try it out on the North Carolina Newspapers portion of our site.

“Can’t you just scan the newspaper and put it online as a bunch of TIFs or JPGs?” Sure. That happens. But that brings me back around to the why question. We love newspapers (most of the time) and love making it as easy and intuitive to use them as we can. We think it’s important to exploit their newspapery-ness, because that’s how users think of and search them.

We also believe that standards like the one from NDNP are kind of like the rules of the road. While off-roading can be fun, driving en masse enables us to be interoperable and sustainable. Standards mean we have a baseline of shared understanding that gives us a boost when we decide we want to drive somewhere together.

This post’s bird’s eye view (perhaps a low-flying bird) doesn’t include more specific questions you may be asking (“What resolution do you use when you scan?” “You didn’t explain METS/ALTO!”). I also just tackled our print newspaper procedure, because it’s the most labor intensive. When we work with digitized microfilm and born-digital papers the procedure is truncated but similar.

I hope this post as well as part 1 and part 2 of this series give you a sense of what’s involved in our newspaper digitization process and why we do it the way we do. As always, we’re happy to talk more. Just drop us a line.

North Carolina Newspaper Digitization Part 2: The State of the State

Sign pointing microfilm users to different online resources. Taken in Wilson Library's North Carolina Collection Reading Room, UNC-Chapel Hill.

[This post updated July 2017.]

Newspaper digitization is challenging for a number of reasons (refer to our previous post). Although we’re biased, if you’re interested in accessing North Carolina newspapers online you’re actually pretty lucky; North Carolina is positioned well ahead of many other states. Below we’ve listed, in descending order of size, all of the major historic online newspaper databases sponsored by North Carolina institutions that are on our radar.

Dates: 1751-2000
Coverage: Statewide
Amount Online: 3,500,000+ pages
Details: The North Carolina Collection at UNC-Chapel Hill Library recently partnered with to digitize millions of pages of North Carolina newspapers. These are accessible for free at the State Archives of North Carolina or UNC-Chapel Hill’s Library, or you can view them anywhere at for a monthly fee. As of July 2017, NC LIVE also makes these papers available to member libraries and their card holders. While there are other vendors out there with historic North Carolina newspapers, this is the most comprehensive to date.

Name: The North Carolina Digital Heritage Center
Coverage: Statewide
Dates: 1824-2013
Amount Online: 640,000+ pages
Details: Each year we receive LSTA funding from the State Library of North Carolina to digitize newspapers. Part of that funding goes toward papers on microfilm, for which we ask for title nominations from libraries and archives. We also digitize some newspapers from print (mostly college and university student newspapers) as well as small runs of community papers that have not been microfilmed.

Name: The University of North Carolina at Chapel Hill, National Digital Newspaper Program Grant Award
Coverage: Statewide
Dates: 1836-1922
Amount Online: 100,000+ pages
Details: UNC-Chapel Hill is currently in its second round of providing selected historic newspapers for digitization and sharing through the Library of Congress’ Chronicling America website. These issues are searchable along with a selection of titles from other states.

Name: University of North Carolina at Greensboro Library / Greensboro Museum
Dates: 1826-1946
Coverage: Town of Greensboro and surrounding area
Amount Online: 5,000+ issues
Details: The Greensboro Historical Newspapers collection includes a variety of papers from that area, including World War II military base papers.

Name: The State Archives of North Carolina
Dates: 1752-1890s
Coverage: Statewide
Amount Online: 4,000+ issues
Details: The State Archives of North Carolina actively preserves, microfilms, and digitizes newspapers. While most of these are not currently available online, they have shared some of the earliest on their website.

Name: East Carolina University Library
Dates: 1887-1915
Coverage: Town of Greenville and surrounding area
Amount Online: 1,800+ issues
Details: ECU’s Digital Collections include The Eastern Reflector, a community paper published in Greenville.

While more focused, college and university papers (especially earlier issues) often included local community news. In addition to those featured on DigitalNC, here’s a list of other school papers online:

This isn’t to say others aren’t scanning their local newspapers – we’ve heard of some local entities (businesses and libraries) working toward that goal. But this post was intended to list the largest, statewide, and (mostly) freely searchable endeavors. Know of others? Tell us.

In Part 3 of this Newspaper Digitization series, we’ll get technical and describe how we digitize newspapers here at the Digital Heritage Center.

Two related notes:

  1. Looking for a newspaper that isn’t online (yet)? Through your local public library, you can most likely borrow and view newspaper microfilm from the State Library of North Carolina. This Newspaper Locator may be helpful if you want to determine some of the titles published in a specific area.
  2. North Carolinians are heavily involved in efforts to preserve born-digital news. The Educopia Institute, located in Greensboro, is spearheading a conversation that brings in news producers and cultural heritage professionals to talk about our disappearing journalistic heritage. At their website you can learn more about the Memory Hole events and read a white paper on Newspaper Preservation.

North Carolina Newspaper Digitization Part 1: Why Isn’t It All Online Already?

Carrier boy with newspapers. 1965, Courtesy East Carolina University Digital Collections

Here’s what we know:

  1. Researchers love newspapers.
  2. Libraries and archives love newspapers.
  3. North Carolina has produced a lot of newspapers.
  4. No, really. There. are. a. lot.

Well, we do know a little bit more than that, but those are the Cliff’s Notes of our newspaper story. Because we work with so many papers, we try to stay on top of what’s happening with newspaper digitization in the state and around the country. We thought we’d write a few blog posts to share some of what we’ve seen and are seeing in that area, and to help get the word out that there’s a lot happening in this space in North Carolina.

So, why is digitizing and sharing newspapers online so tough?


There are a lot of them. We’re saying it once more simply because it is the most costly factor in digitization and preservation. Let’s take, for example, a weekly newspaper published from 1870-1920. That’s over 2,500 issues. Say each issue is 8 pages long. Now we’re up to 20,000+ pages. And let’s say there’s one of those types of papers in every county. We’re already at 2 million pages for the state, for only 50 years. This is hugely conservative, considering many counties had more than one paper. And we didn’t even talk about papers published by schools, companies, or ambitious individuals. Or about dailies…
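For anyone who wants to check our math, here is that back-of-the-envelope estimate as a short Python sketch. The 52 issues per year is our assumption for a weekly with no gaps in its run, and the 100 is North Carolina's county count; the other figures come straight from the example above.

```python
# Rough page-count estimate for one hypothetical weekly paper per county.
YEARS = 1920 - 1870        # 50 years of publication
ISSUES_PER_YEAR = 52       # assumption: one issue every week, no gaps
PAGES_PER_ISSUE = 8        # figure used in the example above
COUNTIES = 100             # North Carolina has 100 counties

issues_per_paper = YEARS * ISSUES_PER_YEAR
pages_per_paper = issues_per_paper * PAGES_PER_ISSUE
pages_statewide = pages_per_paper * COUNTIES

print(f"{issues_per_paper:,} issues per paper")
print(f"{pages_per_paper:,} pages per paper")
print(f"{pages_statewide:,} pages statewide")
```

Change any of the constants (say, a second paper per county, or a daily instead of a weekly) and the statewide total climbs quickly.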

By our estimation, digitization of just the microfilmed newspapers located in the North Carolina Collection at UNC-Chapel Hill would result in over 40 million pages, which means 40 million digitized images. That could be upwards of 180 TB of data. For JUST storage (not including serving this up to the web, maintenance, staff) you’d pay a paltry $6,000 per month*.

We kid you not.
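To see what those storage numbers imply, here is a small Python sketch. It only rearranges the three figures quoted above (40 million pages, 180 TB, $6,000 per month); the decimal unit conversions are our assumption, and the per-gigabyte rate it prints is simply what those figures work out to, not a quoted price from any provider.

```python
# What the storage estimate above implies per page and per gigabyte.
PAGES = 40_000_000          # estimated page images
TOTAL_TB = 180              # estimated total storage
MONTHLY_COST_USD = 6_000    # estimated monthly storage bill

MB_PER_TB = 1_000_000       # assumption: decimal units
GB_PER_TB = 1_000           # assumption: decimal units

mb_per_page = TOTAL_TB * MB_PER_TB / PAGES
usd_per_gb_month = MONTHLY_COST_USD / (TOTAL_TB * GB_PER_TB)
yearly_cost_usd = MONTHLY_COST_USD * 12

print(f"{mb_per_page:.1f} MB per page image")
print(f"${usd_per_gb_month:.3f} per GB per month")
print(f"${yearly_cost_usd:,} per year for storage alone")
```

Plug in your own page count and average file size to get a rough figure for a smaller collection.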


Besides quantity, the remaining challenges look petite. Broadsheet newspaper pages need a larger scanner than most institutions can afford, especially if the papers are bound. Tabloid-sized pages won’t fit on typical flatbed scanners either, and we rarely recommend flatbeds for something like this because they’re just too slow.


Although uniform, which is a plus, historic newspapers can be fragile, friable, and fiddly. The more carefully you have to handle material when you digitize, the more time you’re going to need.


Having images of newspapers is really helpful. They’re portable, physically compact, and easier to copy. But the true advantage of a digital version is when it’s full-text searchable. Full-text searchability across large quantities of files requires indexing and search software, and enough IT infrastructure to make that happen.


While most newspapers published before 1923 can be safely shared online, those published in the years since can have attendant rights issues (pun intended). The massive changes in newspaper ownership over the last 20 years can make institutions wary about publishing a paper from 1924 or 1994.

Oh My.

Hopefully it’s clearer now why more historic newspapers aren’t yet freely available online. Albeit daunting, the challenges mentioned above are all surmountable with enough resources (money and expertise) and time. In our next blog post we’ll highlight where you can find historic North Carolina newspapers online right this very minute.

* We’re quoting Amazon S3 storage here, but YMMV.

Planning a Digital Project that Works (Hint: Digitization is the Easy Part)

At the North Carolina Digital Heritage Center we work on digital projects with cultural heritage institutions around the state. We’ve been at it since 2010 and have completed projects with more than 180 different institutions. In most cases, we provide digital library services, but we also serve in an advisory role, sharing our thoughts and experiences with libraries and museums who are interested in developing their own digital projects. In these conversations, a lot of common themes emerge. There are plenty of guides online talking about best practices for digital projects, and we often refer our colleagues to these, but I thought it would be helpful to share a few essential steps in planning a digital project that I hope will help libraries avoid some of the pitfalls that can lead to incomplete or unsustainable projects.

1. Don’t Worry About Equipment or Specifications (Yet). We see this happen over and over again: a library wants to get started on a digital project and all of the questions we get are related to digitization: What scanner should we buy? What DPI should we scan at? These are important questions that need to be answered, but not at first. There’s no point talking about how materials will be digitized until you know what you’re going to do with the digital files.

2. Before You Do Anything, Figure Out How You’re Going to Get Your Content Online. If digitization is the easy part, this is the hard part. This is what prevents many libraries with limited resources from successfully completing digital projects on their own. Unless you’re scanning materials only for patrons to use in the building, you’re going to need to figure out how to share the digital images and metadata online. This requires access to a content management system (like CONTENTdm or Islandora), a catalog that enables the addition of images or other digitized content (like SirsiDynix Portfolio), a partnership with a non-profit hosting service (like the Internet Archive), or a willingness to share library materials on commercial sites (like Facebook or Flickr). Until you know how you’re going to do this, there’s no point in talking about scanning.

3. Before You Do Anything Else, Figure Out How You’re Going to Keep Your Content Online. You put a lot of work into finishing a digital project and getting everything successfully shared online. Naturally, you’re going to want to make sure that it stays online. It is important for librarians — and especially library administrators — to understand that digital projects require a regular ongoing commitment of resources and staff time. Like purchasing a house or a car, the biggest investment might come at the beginning, but there are going to be maintenance costs over time. This is why grant funding cannot be the only answer for funding digital projects. Grants will provide resources for a year or two, but your library has to be willing to assume ongoing costs for keeping the digital project updated and accessible.

4. If You Don’t Have Dedicated IT Support, Use Somebody Else’s. Small libraries and museums are often in a tough position with IT support. Either they have limited support or they have to rely on support from a larger agency (like county government) with many competing demands. Hosting your own digital project is going to require significant IT support. How much? It depends on how large and how complex your project is going to be, but as a rule of thumb I’d say that if you don’t have at least two full-time IT staff members who have experience with digital library projects and who have the time available to support your project, then you’ll need to look outside your institution for help.

5. There’s Nothing Wrong With Letting Somebody Else Host Your Collection. Without substantial IT support, digital projects used to be out of reach for smaller institutions. Not anymore! Many vendors now offer digital collection hosting services: OCLC hosts CONTENTdm collections for many libraries, Lyrasis hosts Islandora collections and facilitates projects with the Internet Archive, and there are a variety of companies that offer Omeka hosting. This is a great option for smaller institutions, enabling them to get a digital project online quickly without having to invest in servers or staff time. Of course, you’ll have to pay for these services, and they get more expensive the more content you post online, but it’s still likely to be much cheaper than trying to do everything yourself. Keep in mind that this is not just a problem that small libraries are grappling with. With the increasing availability of cloud-based servers, lots of companies are deciding to outsource hosting. Even Netflix does it.

6. Get Help. There’s a lot of help out there: use it. In North Carolina, we have a statewide digital library program and lots of outstanding digital library programs at universities and state agencies. There’s no reason for a smaller institution to go it alone. Established programs can provide lots of guidance and advice, and they may also be able to help with digitization, hosting, and funding.

7. Be Wary of Vendors Who Make it Sound Easy (Especially if They Haven’t Worked With Libraries Before). This is important to understand: digital library projects are complicated, but to somebody who hasn’t worked on one before, they can look pretty easy from the outside. “All you want to do is put some scans online?” says a local vendor eager to get your business. “No problem. We can do it way cheaper than that big company you got a quote from.” This almost never ends well. Vendors who haven’t worked with libraries rarely understand our concerns about metadata, the need to effectively search digitized content, and preservation. If it sounds too good (and too cheap) to be true, it usually is.

8. Metadata is More than Keywords. Although many digital collections include fantastic images, people will still find these by typing words into a search box. Good metadata will make it easier for patrons to discover, understand, and use the materials you put online. For some collections (like a box of unidentified photos), metadata can be a lot of work. For others (like a collection of postcards), it can be pretty straightforward. Before you start scanning anything, make sure you have a plan (and staff available) for describing the materials you’re planning to put online.

9. Plan to Share. Once you get your collection online, don’t keep it to yourself. More people will find and use your materials if you share your metadata. The Digital Public Library of America harvests and hosts metadata from libraries around the country (including North Carolina) and presents it in a simple, easy-to-use interface. This doesn’t replace your digital collection — links from the DPLA will lead users back to your website. Many libraries share digital collections information in their local catalogs, or with national resources like WorldCat. Figuring out how you’ll share your metadata beyond what you present online on your site should be a part of your planning process.

Now, once all those questions are answered and you have an achievable and sustainable plan in place (and know how you’re going to pay for it), it’s time to get down to the details and finally answer those questions about equipment and scanning. Good luck!




More Than Portraits: Possibilities High School Yearbooks have for Historical Research

As the school year comes to a close across the state, it seems like a good time to take a more in-depth look at the wealth of information that can be found in the more than 1,600 high school yearbooks that we have scanned and made accessible on DigitalNC in the past year.  While the most obvious use of these yearbooks is for genealogical purposes, they contain much more than just portraits and can tell a lot about the towns and time periods they come from.

As our high school yearbooks are only available through the year 1964, there is not a lot of integration of North Carolina schools evident in the yearbooks.  However, the yearbooks available in DigitalNC do come from both white and black schools, often in the same towns, dating back to the early 1900s.  This can allow comparison of how the schools operated and a view into life in segregated schools in North Carolina.  For example, in Tarboro, there was Tarboro High School, the white school, and Pattillo High School, the black school.  Our yearbooks from both cover the 1940s-1950s.

from 1949 Chapel Hill High School yearbook "Hillife"

In many of the yearbooks in the North Carolina High School Yearbooks collection there are extensive sections dedicated to both the clubs and the athletics at the school.  These sections, with many group portraits, action shots, and sometimes even added explanation, provide a glimpse into what extracurricular activities students participated in throughout the years.  For example, in the 1949 Chapel Hill High School yearbook there is a babysitter’s club pictured, and in the 1929 R.J. Reynolds High School Black and Gold yearbook, there is a photograph of the “Salemanship club”.  Beyond being interesting in their own way, this information shows how priorities for school-age children and the expected responsibilities they have shift over time.

from 1929 R.J. Reynolds High School yearbook "Black and Gold"

Most of the yearbooks contain information on the teachers at the school and the courses and subjects they taught.  Again, like the clubs, this information provides insight into how subject emphasis in school has changed over time.  The page below from the 1963 Lion yearbook from P.W. Moore Junior-Senior High School in Elizabeth City includes photographs from classes that are not often seen anymore, including agriculture, typing, and guidance class.

Some of the classes offered at P.W. Moore Junior-Senior High School in 1963

The yearbooks also contain a lot of images of events that occurred at the schools.  A few weeks ago we pointed out the wonderful May Day images from across the decades.  Other events such as prom, homecoming, and school specific traditions are included in the yearbooks.  Below is a schedule of events from the 1941-1942 school year at Hickory High School.

1941-1942 Hickory High School schedule, from the "Hickory Log."

Current events of the day are also featured in these yearbooks.  For example, those published during World War II often have heavy patriotic themes and some, such as the High Point High School yearbook from 1945, have whole spreads dedicated to those lost from High Point, particularly fellow classmates, in the war.

Dedication page to those killed in World War II from High Point High School, from the 1945 Pemican

The advertising section at the back of the yearbooks offers a glimpse at the businesses of the town the school is in, which can be particularly useful for small towns that may not have had their own city directories.  The listings usually include addresses for the businesses, and sometimes, as is the case in the 1960 Pittsboro High School yearbook, photographs of the businesses themselves.  These photographs can be the only images of businesses that shut down years ago.


City Electronics Shop ad in Pittsboro High School's 1960 The Dragonian


Henry’s Restaurant ad, in Pittsboro High School’s 1960 The Dragonian

C.E. Jones Co. Bridal Headquarters ad, in the Pittsboro High School 1960 The Dragonian

As graduation approaches for high-schoolers across the state, spend some time looking through our high school yearbook collection and take a peek into life as a high school student fifty years or more ago.  If you know of high school yearbooks at a local institution in North Carolina that are not currently included in our collection, go here to learn more about how to get them included on DigitalNC.

New Feature: On This Day in North Carolina History

Head over to the North Carolina Newspapers collection for a new feature: This Day in North Carolina.  Users can now easily pull up all of the newspapers from the collection that were published on this day in years past.  Today’s search — December 5 — brings up ten different issues, ranging in date from 1826 to 1997.  It’s fascinating reading.  Here’s a sample of what we found:
Many people may not realize that North Carolina has been in the lottery business for centuries.  On December 5, 1826, the Catawba Journal published an announcement for a lottery to fund a history of North Carolina.
On December 5, 1941, just two days before the attack on Pearl Harbor, the Southern Pines Pilot published an exclusive interview with the Prime Minister of Canada, talking about the Canadian war effort.
On December 5, 1942, the Carolina Times led with a big headline announcing that the city of Raleigh had just hired an African American policeman.
On December 5, 1988, Black Ink published an analysis of Jesse Jackson’s presidential candidacy.
Stay tuned for news and updates on the North Carolina Newspapers project, including an NC Newspapers Twitter feed to be released in January.