Archival Repositories: Where’s the Data?

“Where can we find a comprehensive list of archival repositories in the United States?” This is a question Eira Tansey (University of Cincinnati) and I (Ben Goldman, Penn State University) asked in early 2016 when we started a project to map the vulnerabilities of American archives' locations to the future impacts of climate change. With the amazing help of geospatialists at Penn State (Nathan Piekielek, Geospatial Services Librarian, and Tara Mazurczyk, Ph.D. Candidate, Department of Geography), we explored how sea level rise, storm surge, temperature fluctuations and increased precipitation might affect 1,232 archival locations in the continental United States. Eira and I shared initial findings at the Research Forum of the 2017 Society of American Archivists’ Annual Conference, and submitted (with our collaborators) a manuscript for publication this past July.

While we are excited to share the results of this project with our archival colleagues and comrades, one of our lingering disappointments with this effort has been the lack of a comprehensive dataset to work with. The best option we could find came from OCLC’s ArchiveGrid (thanks to Bruce Washburn and Merrilee Proffitt), which provided a useful dataset for our research, but clearly did not fully reflect the size of the archival community. By contrast, IMLS’s Museum Universe data file contains over 30,000 entries. It became clear to us that in order to fully understand the future impacts of climate change on documentary heritage in the U.S., we needed better data.

Now, thanks to the Society of American Archivists Foundation, we can begin to pull together a better dataset. Eira and I were awarded a $5,000 Strategic Growth grant from the SAA Foundation in May, and over the course of one year (July 2017 – June 2018) we are attempting to find, aggregate, standardize and openly share a vast dataset on archival repository locations, with help from an amazing Research Assistant, Whitney Ray.

We hope to use this blog to share our progress and highlight interesting or useful information related to this effort. We welcome the wisdom and comments of our archival colleagues everywhere, so please don’t hesitate to reach out if you have any thoughts or ideas!




Dear Archivist, or, How I learned to stop worrying and cold-called the U.S. archival community

Hi! My name is Whitney, and I’m very excited to be the Research Assistant helping Ben and Eira with RepoData. Our goal is to create a standardized, centralized, and interoperable data set of archival repositories in the United States. We will use this data set to create a map that depicts possible effects of climate change on archival repositories and their particular vulnerabilities. My job is to gather information about the existence and location of archival repositories.

In early September I began to reach out to archival organizations. I used the Directory of Archival Organizations on the website of the Society of American Archivists (SAA) and the list of groups in the Regional Archivists Associations Consortium (RAACs). Using the most up-to-date contact information I could find, either on the SAA site or on the websites of the organizations, I emailed contacts for the archival organizations and RAACs. Since many overlapped, I used my best judgment in first contacting the overarching groups and then, if there appeared to be a gap in our data collection, the sub-groups.

I also began to reach out to state archivists from the Directory of State Archives and representatives from State Historical Records Advisory Boards (SHRABs). The two directories above formed the core of my outreach. However, I also reached out to groups that I found either through recommendations from the organizations or through links on their websites.

In total, I’ve contacted 111 archival groups and SHRAB affiliates. As a team, Ben, Eira, and I have collected information on about 18,000 repositories, although we suspect that a lot of these repositories are duplicate entries: more on this, and what we plan to do about it, in a different blog post.

For me, this has been a learning opportunity in outreach and research on the profession. (My apologies again to the archivist in Oklahoma City whom I emailed twice as the representative for two organizations!) It’s been great to see how resourceful archivists are in getting the word out about their collections through affiliation with a group and through descriptive information on their websites.

Next I’ll normalize data fields, categorize repositories, and find latitude and longitude coordinates. Our plan includes distinguishing between mailing addresses and the physical locations of repositories, particularly since mapping one or the other can tell different stories about vulnerabilities to climate change.
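
To give a sense of what normalizing fields and spotting likely duplicates could look like, here is a minimal Python sketch. It is only an illustration, not the project's actual workflow: the field names (name, city, state), the sample records, and the matching rule (same normalized name, city, and state) are all assumptions made for the example.

```python
import re
from collections import defaultdict


def normalize(value: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    value = value.lower()
    value = re.sub(r"[^\w\s]", " ", value)
    return re.sub(r"\s+", " ", value).strip()


def duplicate_groups(records):
    """Group records whose normalized name, city, and state all match.

    The record fields used here are hypothetical; real source lists
    will likely use different and inconsistent column names.
    """
    groups = defaultdict(list)
    for record in records:
        key = (
            normalize(record["name"]),
            normalize(record["city"]),
            record["state"].strip().upper(),
        )
        groups[key].append(record)
    # Keep only the keys that more than one record maps to.
    return [group for group in groups.values() if len(group) > 1]


# Made-up sample records, for illustration only.
records = [
    {"name": "Example County Historical Society", "city": "Springfield", "state": "IL"},
    {"name": "Example County Historical Society.", "city": "springfield ", "state": "il"},
    {"name": "Example University Archives", "city": "Springfield", "state": "IL"},
]

dups = duplicate_groups(records)
```

Exact matching on normalized fields like this would only catch the easy cases; entries that differ by an abbreviation or an alternate name would still need a human look.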