Digital North Carolina Blog

This blog is maintained by the staff of the North Carolina Digital Heritage Center and features highlights from the collections at DigitalNC, an online library of primary sources from institutions across North Carolina.


Viewing entries tagged "behind the scenes"


Loray Digital Archive expanded to include 1980s union efforts at Firestone Mill

We were excited this past semester to partner with the AMST 475H Documenting Communities class here at UNC to show them how a digitization project works from start to finish. This is a guest post from the class.

Written by: Dani Callahan and Lucas Kelley

New materials that document the unionization of Gastonia’s Firestone Mill have been added to DigitalNC’s existing collection on the mill: the Loray Digital Archive. The Gaston County Museum of Art and History provided the materials for digitization, and UNC-Chapel Hill students in Professor Robert Allen’s Documenting Communities course scanned the material, researched the unionization movement, and added metadata to the documents.

The unionization of the Firestone Mill occurred in the late 1980s and was particularly contentious both within the mill community and throughout the region. The violent unionization efforts of the 1920s, exemplified in the Loray strike of 1929, had left deep wounds within Gastonia, and area residents and workers had traditionally distrusted subsequent unionization attempts. The widespread economic downturn in the textile industry in the 1980s, however, meant harsher conditions and less pay for the workers at Firestone, and some workers hoped the United Rubber Workers Union could provide protection from the difficult economic climate.

 Pro-union pamphlet distributed to employees at Firestone Mill in the late 1980s. It was produced by the AFL-CIO.

The materials added to the Loray Digital Archive document the pro-union and anti-union campaigns. Each side sought to attract workers to its cause with flyers, posters, stickers, buttons, and pamphlets. Initially, the anti-union forces held off the unionization attempt in 1987. Widespread media coverage turned the referendum into a political circus, and leaders of the pro-union movement could not overcome area residents’ distrust. Yet a year later, Firestone workers voted to join the union in a campaign that was much more subdued. The success of pro-union forces was due in large part to the diligence of the union’s committee members working inside the mill. While the 1987 vote had turned into a regional and even national media circus, the 1988 vote remained an internal debate housed within Firestone itself. When the workers at the Firestone Mill voted on April 14, 1988 to join the United Rubber, Cork, Linoleum and Plastic Workers by a narrow margin, it was a victory nearly sixty years in the making. Click the link to view all the materials from the 1980s union effort.


Students help bring new light to the Wilmington riots of 1898

In July, the North Carolina Digital Heritage Center was pleased to welcome a group of middle school students from Williston Middle School and Friends School of Wilmington. With them were writers Joel Finsel and John Jeremiah Sullivan and staff from the Cape Fear Museum, all of whom worked with the students over the past semester. This visit was the culmination of a project for the students, who had studied the Wilmington riots of 1898 and worked specifically with original copies of the Daily Record, held by the Cape Fear Museum.

Original issues of the Record, which was the black-owned newspaper in Wilmington in the late 1890s, are incredibly hard to find: their offices were destroyed during the riots.  (Learn more about the riots on NCpedia.)  The museum staff brought along their copies of the paper, as well as original copies of the reaction to the riots as found in both black-owned and white-owned papers across the country.  We scanned all of the materials on site with help from UNC-Chapel Hill Libraries’ Digital Production Center staff. Students watched and got to learn more about our work.  Now all of those materials are online not only for future students to work with, but for anyone from the general public to access.  

To learn more about the students’ work, read this great article from the Wilmington Star News. As the article states: “The project is still looking for any more copies of the Record that might turn up… Anyone who finds one is urged to email dailyrecordproject@gmail.com.”

And to view more newspapers, visit the newspaper section of our site.


200 Partner Institutions – A Digital Heritage Center Milestone

 

Celebrating 200 Partners

When people ask me to sum up the Digital Heritage Center, I usually tell them what we do. We provide digitization and digital publishing services to cultural heritage institutions throughout North Carolina. And DigitalNC.org has some pretty healthy stats to back it up.

2.7 million scans online

87,590 total objects, including…

Over 57,000 newspaper issues

More than 6,100 college and high school yearbooks

16,000+ photographs

505 scrapbooks

Beyond this, the site receives about 280,000 pageviews per month, 58% of which come from users in North Carolina. That’s a lot of our state’s history being shared online, 24/7.

But really, the heart of the Digital Heritage Center is people. It’s the hard work and expertise of our staff making North Carolina’s history available online. It’s the guidance and support we get from the staff at the State Library of North Carolina, which provides most of our funding, and the University of North Carolina at Chapel Hill’s Wilson Library, which hosts the Center and its technology. Above all of this, it’s the partnerships we have with cultural heritage professionals from Bryson City to Ocracoke. That’s why I’m so pleased to announce that this month, we add the following number:

200 partner institutions

in 119 communities,

in 73 counties

Since opening its doors in 2009, the Digital Heritage Center has been showing the nation that North Carolina has a strong and collaborative cultural heritage community. This state has so many deep, rich, compelling, and quirky collections. They are stewarded by staff who have a passion for preservation and a genuine love of providing access to users near and far. We are proud to be a part of that community, offering many institutions the opportunity to bring their collections to a broader audience for the first time.

We hope you will take the chance to explore the map above, and DigitalNC.org. And we hope that you’ll find a contributing institution in your area and stop in. Thanks for reading, and for your support. And here’s to 200 more.

Cheerleaders, from Western Carolina University’s 1940 edition of the Catamount yearbook.

 


Have Scans, Will Travel? Hosting Your Scans at DigitalNC

Moving Truck Transferring Family Possessions, from the Gaston County Museum of Art & History

The Digital Heritage Center does a lot of scanning on some really versatile machines. It’s one of the practical sides to our mission, and we take pride in being able to provide that service.

What is perhaps less well known is that we also help cultural heritage institutions publish items they’ve scanned themselves. Many cultural heritage institutions have flatbed or book scanners as well as willing staff and volunteers, but lack the technical infrastructure to host those scans for the public.

We’ve helped institutions …

  • who needed to migrate from ailing databases or systems they can no longer support,
  • who wanted to be able to full-text search their materials, a function they couldn’t fulfill through their current website,
  • who offered their digital files to on-site users, but who were seeking a broader audience.

When we start this conversation, here are some of the questions we ask:

  • Tell us about the original physical objects*. Does your institution still have them? Are there any rights or privacy concerns to sharing these online? What kind of subject matter is represented?
  • Tell us about the digital files. Who originally created them? How many are there? Where do they live? What file types? How are they organized? Is this an ongoing project? Do you have any metadata already?

If the files are a good fit for DigitalNC, they get transferred to hard drives, metadata is created or amended, and items appear on the site alongside the scans we create here at the Center. If you work at a cultural heritage institution eligible to work with the Center, have or are currently creating scans, and are interested in adding these to DigitalNC, contact us. We may be able to give them a home.

* If there were any. We can help with born-digital items as well.


North Carolina Newspaper Digitization Part 3: This is How We Do It

Greensboro Daily News Ad, March 2, 1934

Like Jeopardy!, I want to tell you the answer before I get to the question.

Following a newspaper digitization and markup standard helps us plan for the future and makes it easier for us to work with vendors, open-source software, and other libraries and archives.

I say this up front, because when we explain how we digitize and share newspapers the frequent response is to ask why we do it the way we do. I think this is because our process is more labor intensive than people expect. It’s definitely not the only way, but we’re committed to this path for right now because it accommodates multiple formats (microfilm, print, born-digital), fits our current digitization capacity, and results in a system we think is flexible and extensible.

That standard I mentioned above comes out of the Library of Congress’ National Digital Newspaper Program (NDNP). All of our newspaper work is NDNP compliant, which means we follow that project’s recommendations for how to structure files, the type of metadata to assign to those files, and also the markup language that tells the computer where words are situated on each page (very helpful for full-text search).

I’ll give you a broad outline of our workflow and the tools we use. However, if you want more specific technical details, head over to our account on GitHub.

Screenshot of PaperBoy!

Let’s say one of our partners is interested in having us digitize a print newspaper. We’ll start by scanning each page separately on whichever machine works for the paper’s size. Because the NDNP standard requires page-level metadata, we’ve created a lightweight piece of software that helps us take care of some of that while we scan. Affectionately dubbed “PaperBoy,” this program allows the scanning technician to track page number, date, volume, issue, and edition for each shot. While it slows down scanning a little bit, it speeds up post-processing metadata work quite a lot.
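To make the page-level tracking concrete, here is a minimal Python sketch of the kind of record a scanning aid might keep for each shot. This is not PaperBoy's actual code: the class name `PageShot`, the helper `next_shot`, and the field layout are assumptions for illustration, based only on the fields listed above.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class PageShot:
    """Hypothetical record for one scanned page (not PaperBoy's real schema)."""
    issue_date: date
    volume: int
    issue: int
    edition: int
    page: int


def next_shot(prev: PageShot, new_issue_date: Optional[date] = None) -> PageShot:
    """Return the record for the next shot.

    Within an issue only the page number advances. Starting a new issue
    resets the page to 1 and bumps the issue number. Volume and edition
    changes are left to the technician in this simplified sketch.
    """
    if new_issue_date is None:
        return replace(prev, page=prev.page + 1)
    return replace(prev, issue_date=new_issue_date, issue=prev.issue + 1, page=1)
```

Carrying most values forward from the previous shot is what keeps the per-page cost low: the technician only intervenes when something other than the page number changes.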

Once the scanning’s complete, we process the files to create derivatives that serve different needs. We use ABBYY Recognition Server to get those multiple formats:

  1. a JPEG2000 image that’s excellent quality yet small in file size
  2. an XML file that includes computer-recognized text from the image along with coordinates that indicate the location of each word on that image
  3. a .pdf file that includes both the image and searchable text.
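The second derivative is the one that makes search highlighting possible. As a rough illustration, the Python sketch below reads word text and coordinates from a tiny ALTO-style XML fragment using only the standard library. The sample markup is hand-written and simplified (real files carry a namespace and many more elements), and the function names `word_boxes` and `find_term` are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified sample; not actual output from our OCR software.
SAMPLE = """
<alto>
  <Layout>
    <Page>
      <TextLine>
        <String CONTENT="Greensboro" HPOS="100" VPOS="40" WIDTH="210" HEIGHT="32"/>
        <String CONTENT="Daily" HPOS="320" VPOS="40" WIDTH="90" HEIGHT="32"/>
        <String CONTENT="News" HPOS="420" VPOS="40" WIDTH="95" HEIGHT="32"/>
      </TextLine>
    </Page>
  </Layout>
</alto>
"""


def word_boxes(xml_text):
    """Return (word, x, y, width, height) tuples for every String element."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for el in root.iter():
        # Drop any "{namespace}" prefix so namespaced files also match.
        if el.tag.rsplit("}", 1)[-1] != "String":
            continue
        boxes.append((
            el.get("CONTENT"),
            float(el.get("HPOS")),
            float(el.get("VPOS")),
            float(el.get("WIDTH")),
            float(el.get("HEIGHT")),
        ))
    return boxes


def find_term(boxes, term):
    """Return the boxes whose word matches the search term, ignoring case."""
    return [box for box in boxes if box[0].lower() == term.lower()]
```

A viewer can draw a highlight rectangle for each tuple that `find_term` returns, which is the basic idea behind seeing your search terms lit up on the page image.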

Now that we have the derivatives, we begin filling out a spreadsheet with page-level metadata. We first add the metadata created using PaperBoy, and then we run through the scans page by page, correcting any mistakes found in the PaperBoy output and adding additional metadata. This also helps us quality control the scans and gives us a chance to find skipped pages.

How much metadata do we do? You can download a sample batch spreadsheet from GitHub, if you’re interested in the specifics, but it includes the PaperBoy output as well as fields like Title, our name (Digital Heritage Center) as batch-creators, and information about the print paper’s physical location. A lot of those fields stay the same across numerous scans or can be programmatically populated with a spreadsheet formula, to help make things go faster.
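The idea of filling in repeated fields programmatically can be sketched in a few lines of Python. The column names and constant values below are illustrative assumptions, not the headers of our actual batch spreadsheet (see the sample on GitHub for those).

```python
import csv
import io

# Hypothetical batch-wide values that repeat on every row.
BATCH_DEFAULTS = {
    "title": "Example Daily News",
    "batch_creator": "Digital Heritage Center",
    "physical_location": "Example Library",
}


def build_rows(page_records, defaults=BATCH_DEFAULTS):
    """Merge batch-wide defaults into each page record.

    Values in a page record win over the defaults, so a technician can
    still override a field for an unusual page.
    """
    return [{**defaults, **record} for record in page_records]


def to_csv(rows):
    """Serialize rows to CSV text, using the first row's keys as the header."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling `to_csv(build_rows(pages))` on a list of per-page dictionaries yields a spreadsheet where only the page-specific columns had to be entered by hand.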

Once we have the spreadsheet and scans complete, scripts developed by our programmer (also available on GitHub) use those spreadsheets to figure out how to rearrange the files and metadata into packages structured just the way the NDNP standard likes them. The script breaks out each newspaper issue’s files into their own file folder, renaming and reorganizing the pages (if needed). The script also creates issue-level XML files, which tag along inside each folder. These XML files describe the issue and its relation to the batch, and include some administrative metadata about who created the files, etc.
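As a simplified illustration of that regrouping step, the Python sketch below plans where each page file would go, one folder per issue. It is not our production script (that lives on GitHub); the folder naming pattern, the `plan_moves` function, and the row fields are assumptions chosen for readability rather than the exact layout the NDNP standard specifies.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def plan_moves(rows, batch_root="batch_example"):
    """Map each source file name to a destination inside a per-issue folder.

    Each row is a dict with 'file', 'date' (YYYY-MM-DD), 'edition', and
    'page'. Pages are renumbered sequentially within their issue.
    """
    # Group pages by issue, identified here by date plus edition.
    issues = defaultdict(list)
    for row in rows:
        issues[(row["date"], row["edition"])].append(row)

    moves = {}
    for (issue_date, edition), pages in sorted(issues.items()):
        # Hypothetical folder name: date digits followed by a 2-digit edition.
        folder = f"{issue_date.replace('-', '')}{edition:02d}"
        ordered = sorted(pages, key=lambda r: r["page"])
        for seq, row in enumerate(ordered, start=1):
            suffix = PurePosixPath(row["file"]).suffix
            dest = PurePosixPath(batch_root) / folder / f"{seq:04d}{suffix}"
            moves[row["file"]] = str(dest)
    return moves
```

Computing the plan as a plain dictionary before touching any files makes it easy to review the renaming, and to spot a skipped page, before anything is actually moved.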

Newspaper files before processing (left) and after (right).

The final steps are to load our NDNP-compliant batches into the software we use to present it online, and to quality control the metadata and scans.

If you think about it, newspapers have a helpfully consistent structure: date-driven volumes, issues, and editions. But there isn’t much else in the digital library world quite like them, so more common content management systems can leave something to be desired for both searching and viewing newspapers.  Because of this, and because there’s just so MUCH newspaper content, we use a standalone system for our newspapers: the Library of Congress’ open source newspaper viewer, ChronAm. It’s named as such because it also happens to be the one used for the NDNP’s online presence: the Chronicling America website.

While not perfect, this viewer does really well exploiting newspaper structure. It also allows you to zoom in and out while you skim and read, and it highlights your search terms (courtesy of those XML files created by ABBYY). Try it out on the North Carolina Newspapers portion of our site.

“Can’t you just scan the newspaper and put it online as a bunch of TIFs or JPGs?” Sure. That happens. But that brings me back around to the why question. We love newspapers (most of the time) and love making it as easy and intuitive to use them as we can. We think it’s important to exploit their newspapery-ness, because that’s how users think of and search them.

We also believe that standards like the one from NDNP are kind of like the rules of the road. While off-roading can be fun, driving en masse enables us to be interoperable and sustainable. Standards mean we have a baseline of shared understanding that gives us a boost when we decide we want to drive somewhere together.

This post’s bird’s eye view (perhaps a low-flying bird) doesn’t include more specific questions you may be asking (“What resolution do you use when you scan?” “You didn’t explain METS/ALTO!”). I also just tackled our print newspaper procedure, because it’s the most labor intensive. When we work with digitized microfilm and born-digital papers, the procedure is truncated but similar.

I hope this post as well as part 1 and part 2 of this series give you a sense of what’s involved in our newspaper digitization process and why we do it the way we do. As always, we’re happy to talk more. Just drop us a line.


Looking Back at DigitalNC.org in 2014

Title page from the 1956 Buccaneer, from East Carolina College, the most popular item on DigitalNC.org in 2014.

The North Carolina Digital Heritage Center had a great year in 2014. We continued to work with partners around the state on digitization projects and added a wide variety of material to DigitalNC.org, making it easier than ever for users to discover and access rare and unique materials from communities all over North Carolina.

As we look back on our work over the past year, I wanted to share some of what we’ve learned by looking at our website usage statistics. Like many libraries, the Digital Heritage Center uses Google Analytics to capture information about what’s being used on our website, who’s using it, and how they got there. While there are still lots of questions remaining about usage of DigitalNC, these stats do give us a lot of valuable information.

In 2014, more than 250,000 users visited DigitalNC.org, resulting in more than 1.8 million pageviews. While people visited our website from computers located all over the world, the greatest number by far came from North Carolina. That’s what we expected and hoped to see. More than 200,000 sessions originated in North Carolina, with the users coming from 388 different locations, ranging from over 18,000 sessions in Raleigh and Charlotte to a single visit from the town of Bolivia in Brunswick County (user location is determined by the location of their internet service provider, so this may not tell us exactly where our users are located, but it’s going to be close in most cases).

What did people use on DigitalNC? We were not surprised to find that the most popular collection remains our still-growing library of yearbooks. The North Carolina Yearbooks collection received more than 125,000 pageviews alone, followed by newspapers (44,000) and city directories (11,000). And we were pleased to learn that at least somebody is reading this blog, which received nearly 2,500 pageviews last year. The most popular blog post was our announcement about the digitization of a large collection of Wake County high school yearbooks.

We were also curious to see what single items were the most popular over the past year. The winner, with 438 pageviews, was the 1956 yearbook from East Carolina University. The second most popular was also from East Carolina, the 1930 Tecoan, followed by the 1961 yearbook from the Palmer Memorial Institute and the 1922 yearbook from Appalachian State University.

Lake Hideaway, ca. 1950s, the most popular photo on DigitalNC.org in 2014.

The most popular image on our site was from the Davie County Public Library:  a black-and-white photo from the 1950s showing swimmers at Lake Hideaway in Mocksville. Other popular photos included a postcard showing the American Tobacco Company plant in Reidsville, N.C., a group of Stanly County students from 1912, and a portrait of Charles McCartney, the infamous “Goat Man” from the 1950s.

The variety of subjects, locations, and time periods in these photos is representative of the wide-ranging content available in North Carolina’s cultural heritage institutions and on DigitalNC.org. We are honored and excited to have a role in making this content accessible to everyone and look forward to sharing even more of North Carolina’s history and culture online in 2015.


Moving Image Digitization Project, 2014

The North Carolina Digital Heritage Center is launching a pilot project to help preserve and improve access to historic films and videos in North Carolina’s libraries, archives, and museums. Working with its partners around the state, the Center will select a small number of films and videos, which will then be sent to a vendor to be digitized. The resulting digital files will be published online at DigitalNC.org where they will be made freely available to all users. The original films or videos will be returned to the institutions that contributed them.

We are inviting our existing partners, as well as cultural heritage organizations that have not yet worked with the Center, to nominate moving images from their collections. (See http://www.digitalnc.org/about/participate/ to determine if your organization is eligible.) The Center will evaluate all of the nominations (see evaluation criteria) in an effort to select a variety of content in different formats that represents the cultural and geographic diversity of North Carolina.

Contact the Digital Heritage Center at digitalnc@unc.edu or (919) 962-4836 if you are interested in suggesting material to digitize or if you have any questions.

Why Is this Just a Pilot Project?

Digitization and online streaming of historic films and videos is complicated and expensive. This project is an effort to determine the cost and viability of providing moving image digitization services to North Carolina Digital Heritage Center partners.

Why Is Everything Being Digitized by a Vendor?

Right now, the Digital Heritage Center has neither the equipment nor the expertise necessary to handle and digitize historic moving images. Working with an experienced vendor will be the most efficient and most affordable way for us to make this content available to users.

How Will the Vendor Be Chosen?

State laws require that we open up this project to a bidding process. While we do not know which vendors will bid or what prices they will offer, we will require that the work be done by a vendor that has experience working with rare and fragile materials.

What If I’m Not Comfortable Sending Materials From My Collection to a Vendor?

We understand that not every institution will want to send unique and fragile materials off site. However, for this project, we have decided that working with an experienced vendor is the best way for us to provide access to this content. Materials that cannot be sent to a vendor will not be selected for digitization as part of this project.

I’ve Got Films That Are in Pretty Bad Shape. Can I Still Suggest Those?

Yes. We understand that many of the historic films in libraries and archives are in poor condition. That’s part of why we want to provide a service like this. We will make sure that we work with a digitization vendor that has experience evaluating the condition of historic films and we will not proceed with digitization if the conversion process is going to harm the original.

What About Copyright?

We will work with each institution to help determine the copyright status of the items nominated for digitization. For films that were created by individuals or companies, we will ask the nominating institution to make an effort to get permission to have the film digitized and shared online.

How Long Will This Take?

We don’t know. That’s part of what we are going to determine as we work on this project. You should expect your materials to be off site for at least a few months.

How Many Films or Videos Will Be Digitized?

It depends. Format, condition, and length are all factors that will contribute to the cost of digitizing historic moving images. We will prioritize the films and videos we’ve selected and digitize as many as we can with what we’ve budgeted for this project.

Selection Criteria for the Moving Image Digitization Project, 2014

  • Is the film or video believed to be unique to your collection, or are there copies at other institutions?
  • Do you have equipment available to play the film or video?
  • Is the media believed to be at least 40 years old?
  • Are you willing to have the film or video sent to a vendor to be digitized?
  • Is there a catalog record or anything describing the content of the film or video?
  • Does the media cover a time period of historical significance?  (For example: Civil War, Great Depression, World War II).
  • Was the film or video created by, or does it contain significant content by or about one of North Carolina’s historically underrepresented communities?
  • Is the media from a county or region that is already represented on DigitalNC.org or other digital library projects?
  • Is there a demonstrated demand for online access to the film or video?  If so, are there examples, such as requests from users or community members?
  • If this media is digitized, is the contributing institution willing to promote the media through press releases and other announcements or programs?