Written by Alison T. Miner. This semester I’m an intern at ICFA, working with the metadata about photographs in several legacy databases. Along with the Metadata and Cataloging Specialist, Anne-Marie Viola, I’m doing research and planning so that we can move more than 70,000 records from five different databases into the new collections management software, ICA-AtoM.
For the past month or so I have been working on creating an XML file from an existing spreadsheet of place names so that I can import the terms as a SKOS taxonomy to use in our new database. Location has always been central to accessing ICFA’s Byzantine photograph collection. The thousands of photographs of art and architecture in our collections – acquired to enable scholars to compare styles and study a range of works from Italy to Russia – have been physically arranged first by type of work (e.g. architecture, metalwork, mosaics) and then by location.
But just to make things more difficult, ICFA’s collections deal particularly with ancient sites and places. Many of these have changed names over the millennia, or are spelled differently in different languages. ‘Istanbul’ and ‘Constantinople’ are the classic example of this. The ‘Hagia Sophia’ can also legitimately be called ‘Aya Sofya,’ ‘St. Sophia,’ ‘Santa Sophia,’ or ‘The Church of the Holy Wisdom’.
So our previous subject specialist, Günder Varinlioğlu, made a list of the common places referenced in our photograph collections, along with their usual alternate names. The list is more than two thousand items long – but in the realm of controlled vocabularies, this list is actually very small! (To see a massive one, look at the Pleiades website. They are trying to address name changes and distinctions between modern cities and ancient sites by creating a comprehensive map and webpage for each ancient city, and recording the dates of different names or occupations.)
But there are further nuances to the difficulties in cataloging locations. The same place can be referred to at many different levels:
Broadly – by country or area: the country of Turkey, or Anatolia.
Specifically – by city or region: the Princes Islands, which are officially neighborhoods in the city of Istanbul, or Heybeli island, one of the Princes Islands.
Even more precisely – by building or excavation site.
Catalogers of images must decide what the appropriate level of detail is for their collections, and then try to be consistent.
ICA-AtoM, the collections management software we are implementing, organizes place terms within a hierarchical taxonomy using SKOS (Simple Knowledge Organization System), a vocabulary built on RDF. A SKOS file written in RDF/XML holds a record for each term, with fields for a preferred name, alternate names, and related terms. In order to make the terms and their relationships appear correctly in our new AtoM database, I need to encode Günder’s list according to SKOS rules by adding XML tags, making a document that looks a bit like the HTML used for web pages.
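To make the shape of such a record concrete, here is a minimal sketch in Python that writes one SKOS record with a preferred name and several alternate names. It is only an illustration: the `http://example.org/places/` address and the `make_concept` helper are placeholders I made up, and the exact fields AtoM expects on import may differ.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF and SKOS.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
ET.register_namespace("rdf", RDF)
ET.register_namespace("skos", SKOS)


def make_concept(parent, uri, preferred, alternates=()):
    """Add one skos:Concept record (hypothetical helper) under parent."""
    concept = ET.SubElement(parent, f"{{{SKOS}}}Concept", {f"{{{RDF}}}about": uri})
    ET.SubElement(concept, f"{{{SKOS}}}prefLabel").text = preferred
    for name in alternates:
        ET.SubElement(concept, f"{{{SKOS}}}altLabel").text = name
    return concept


root = ET.Element(f"{{{RDF}}}RDF")
make_concept(
    root,
    "http://example.org/places/hagia-sophia",  # placeholder identifier
    "Hagia Sophia",
    ["Aya Sofya", "St. Sophia", "Santa Sophia", "The Church of the Holy Wisdom"],
)

record_xml = ET.tostring(root, encoding="unicode")
print(record_xml)
```

Each alternate name becomes its own `skos:altLabel` element, so a term can carry as many variant spellings as the list records.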
The first challenge in making our SKOS file goes back to that earlier issue that I described – how do we make connections between different levels of description? How do we make sure that Heybeli island shows up when someone looks at sites in “Turkey”? SKOS is made to be simple, so it creates taxonomies by allowing you to categorize terms as ‘broader’ or ‘narrower’ than others. So I can use the SKOS fields for broader and narrower terms (skos:broader and skos:narrower) to create a hierarchy.
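As a rough illustration of how broader and narrower links add up to a hierarchy, the Python sketch below stores one broader term per place and follows the links upward. The four-level chain is an assumption based on the examples in this post, not our actual list, and the function names are made up for the illustration. In a SKOS file, each link would be written as a `skos:broader` element, with `skos:narrower` as its inverse.

```python
# Hypothetical broader-term links, modelled on the examples in this post.
BROADER = {
    "Heybeli island": "Princes Islands",
    "Princes Islands": "Istanbul",
    "Istanbul": "Turkey",
    "Turkey": None,  # top of the hierarchy
}


def ancestors(term, broader=BROADER):
    """Follow broader links upward and return every broader term in order."""
    chain = []
    parent = broader.get(term)
    while parent is not None:
        if parent == term or parent in chain:
            raise ValueError(f"circular broader link at {parent!r}")
        chain.append(parent)
        parent = broader.get(parent)
    return chain


def terms_under(top, broader=BROADER):
    """Return every term that has `top` somewhere above it."""
    return sorted(t for t in broader if top in ancestors(t, broader))


def narrower(term, broader=BROADER):
    """Return the direct narrower terms, the inverse of the broader links."""
    return sorted(t for t, parent in broader.items() if parent == term)


print(ancestors("Heybeli island"))  # ['Princes Islands', 'Istanbul', 'Turkey']
print(terms_under("Turkey"))        # ['Heybeli island', 'Istanbul', 'Princes Islands']
print(narrower("Princes Islands"))  # ['Heybeli island']
```

Only the immediate broader term has to be recorded for each place; the longer chain from Heybeli island up to Turkey falls out of following the links.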
The second challenge is making the document without typing the SKOS XML tags two thousand times. There are lots of taxonomy tools out there to help you create the hierarchical relationships, like genus and species trees, that our SKOS file requires. However, many of these tools are more complicated than we need for our simple list. Instead, we chose an ingenious technique described in the “Container List Encoding Workflow” from the Northwest Digital Archives, which uses MS Word’s Mail Merge function to paste the values from our spreadsheet into an XML template set up with our SKOS tags.
Once our template was ready, we created a mail merge document identifying where each column of data should appear and then ran the merge to create a new place term record from each row of our Excel document. Here’s the code for the terms “Burgaz,” “Princes Islands,” and “Heybeliada.”
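The same merge idea can be sketched outside of Word. The Python example below is not our Word template or its output; it is a minimal stand-in that fills a SKOS template once per spreadsheet row. The column names, the sample rows, the alternate names, and the `http://example.org/places/` addresses are all assumptions made for the illustration.

```python
import csv
import io
from string import Template
from xml.sax.saxutils import escape, quoteattr

# Hypothetical sample rows; a real run would read the exported spreadsheet.
SPREADSHEET = """\
term,alternate,broader
Princes Islands,Adalar,Istanbul
Burgaz,Burgazada,Princes Islands
Heybeliada,Heybeli island,Princes Islands
"""

BASE = "http://example.org/places/"  # placeholder address

HEADER = (
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"\n'
    '         xmlns:skos="http://www.w3.org/2004/02/skos/core#">\n'
)
FOOTER = "</rdf:RDF>\n"

# One record per row; the $fields play the role of mail merge fields.
RECORD = Template("""\
  <skos:Concept rdf:about=$about>
    <skos:prefLabel>$term</skos:prefLabel>
    <skos:altLabel>$alternate</skos:altLabel>
    <skos:broader rdf:resource=$broader_uri/>
  </skos:Concept>
""")


def slug(name):
    """Turn a place name into a simple identifier, e.g. 'princes-islands'."""
    return name.lower().replace(" ", "-")


def merge(rows):
    """Fill the record template once per row and wrap the result in rdf:RDF."""
    records = [
        RECORD.substitute(
            about=quoteattr(BASE + slug(row["term"])),  # quoteattr adds the quotes
            term=escape(row["term"]),                   # escape &, <, > in text
            alternate=escape(row["alternate"]),
            broader_uri=quoteattr(BASE + slug(row["broader"])),
        )
        for row in rows
    ]
    return HEADER + "".join(records) + FOOTER


rows = list(csv.DictReader(io.StringIO(SPREADSHEET)))
skos_xml = merge(rows)
print(skos_xml)
```

Escaping the values matters here just as it does in a Word merge: a place name containing an ampersand would otherwise produce a file that is not valid XML.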
Now we have a file, written with proper SKOS terms, in proper RDF/XML format.
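Before uploading, a quick sanity check on a merged file can catch typing slips. The Python sketch below is one possible check, not a step from our actual workflow: it parses the XML (which fails loudly if the file is malformed) and lists any broader link that points at a term the file never defines. The sample file and its addresses are made up, and the last record deliberately contains a typo.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"


def dangling_broader(xml_text):
    """Return (term, target) pairs whose broader target is not defined in the file."""
    root = ET.fromstring(xml_text)  # raises ET.ParseError on malformed XML
    about = f"{{{RDF}}}about"
    resource = f"{{{RDF}}}resource"
    concepts = list(root.iter(f"{{{SKOS}}}Concept"))
    defined = {c.get(about) for c in concepts}
    problems = []
    for concept in concepts:
        for link in concept.findall(f"{{{SKOS}}}broader"):
            target = link.get(resource)
            if target not in defined:
                problems.append((concept.get(about), target))
    return problems


# Made-up sample: the Burgaz record points at 'princes-island' (missing 's').
SAMPLE = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/places/princes-islands">
    <skos:prefLabel>Princes Islands</skos:prefLabel>
  </skos:Concept>
  <skos:Concept rdf:about="http://example.org/places/heybeliada">
    <skos:prefLabel>Heybeliada</skos:prefLabel>
    <skos:broader rdf:resource="http://example.org/places/princes-islands"/>
  </skos:Concept>
  <skos:Concept rdf:about="http://example.org/places/burgaz">
    <skos:prefLabel>Burgaz</skos:prefLabel>
    <skos:broader rdf:resource="http://example.org/places/princes-island"/>
  </skos:Concept>
</rdf:RDF>
"""

print(dangling_broader(SAMPLE))
```

An empty list means every broader link resolves to a record in the same file; anything else is worth fixing in the spreadsheet and merging again.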
Next we upload it to AtoM, and we have a list of terms for the countries, regions and sites with which to catalog our images!
Perhaps this seems like a lot to go through for a list of locations. I write about it to illustrate how things that seem relatively straightforward (“and we’ll have a dropdown list of common locations!”) actually require much careful planning and research by your humble archivists. Things that Google lets us take for granted on the web can be months-long projects in an archive or library.
But all the work and discussion are worth it if more people can learn about and use our collections. Eventually, we hope that the AtoM database will allow our users to do just that.