The Family History Metadata Working Group (FHMWG), Savemetadata.org, is a collaboration between leaders in the family history software industry. We are working together to improve the process of preserving critical information about family photos in order to make them more valuable to the family history community. The technical term for this critical information is metadata: the who, what, where, when and how about a photo. Together, our group has developed a set of recommendations for providers in the photo software industry. These recommendations improve how the most relevant family history metadata is captured, transferred, and preserved.
Our recommendations were carefully crafted in order to preserve and leverage what photo software is already designed to do. We make use of existing standards for how metadata is organized and formatted, but we make suggestions about what specific data should be selected for the purpose of family history. Most importantly, we are concerned with the preservation of this data. We believe that this information should be saved directly within the photo files themselves. This important step in the metadata journey is called embedding metadata.
Embedded metadata is the digital equivalent of a handwritten note on the back of a physical photo. Anyone who has ever attempted to research their family history knows the value of these notes. The idea that information about a digital photo would exist within the photo file itself may seem obvious on the face of it, but more often than not, this information is stored outside of the file and never embedded into it. The reason for this state of affairs in the photo software industry is a little complicated, so let’s travel the metadata journey together to unpack it.
The Metadata Journey
Digital photos have many origin stories – scans, screenshots, photoshops – so for simplicity’s sake, let’s start with a freshly taken, born-digital photo. A photo you just took with your smartphone is rich with metadata: the precise GPS coordinates of your device, the date and time the photo was taken, the model of phone used, and potentially even information about the people or objects in the photo, sourced from object recognition software. Let’s take a look at what happens when you upload this photo to the internet via a social media app, a cloud storage system, or some other photo management service.
- RETAIN: The first thing a software system must decide is whether to retain the metadata in your photo or strip it. There are many reasons for a business to choose to strip the metadata, from reducing file size, to protecting privacy, to claiming ownership.
- READ: Whether a platform has chosen to retain or strip the photo metadata, the next step is to read some or all of the metadata into a database. At this step, the metadata is no longer bound to the photo. Using our analogy from before, the notes have been copied off the back of the photo and into a separate notebook.
- PRESENT: The system can now present the metadata stored in the database to the user. The system can be designed to pick and choose what data to present and in what format. The metadata can also be used to create interesting photo presentations, whether sorted in albums, pinned to a map, or organized into a timeline, for example.
- EDIT: Some platforms allow users to make changes to the metadata; however, these changes are typically saved back to the database and not to the original photo file. This is a practical approach driven by factors such as system performance and computational cost.
- SEARCH: Some platforms allow users to search for photos based on fields like date, location, or people. Metadata-based search is an essential tool for family history research in digital platforms.
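To make the read, edit, search, and present steps concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, field names, and sample values are our own illustrative assumptions, not part of the FHMWG recommendations or any particular platform. Notice that the edit changes only the database row; the photo file never learns about the new caption.

```python
import sqlite3

# Hypothetical metadata already extracted from an uploaded photo.
# Field names and values are illustrative only.
uploaded = {
    "file_name": "IMG_0001.jpg",
    "date_taken": "2023-07-04T15:30:00",
    "latitude": 40.7608,
    "longitude": -111.8910,
    "caption": "",
}

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE photo_metadata ("
    "file_name TEXT PRIMARY KEY, date_taken TEXT, "
    "latitude REAL, longitude REAL, caption TEXT)"
)

# READ: copy the metadata into the database. From here on, the database row
# (the "notebook") and the photo file are separate things.
db.execute(
    "INSERT INTO photo_metadata VALUES "
    "(:file_name, :date_taken, :latitude, :longitude, :caption)",
    uploaded,
)

# EDIT: the user's change is saved to the database only; the file is untouched.
db.execute(
    "UPDATE photo_metadata SET caption = ? WHERE file_name = ?",
    ("Family reunion picnic", "IMG_0001.jpg"),
)


def search_by_year(conn, year):
    """SEARCH: return (file_name, caption) pairs for photos taken in a year."""
    rows = conn.execute(
        "SELECT file_name, caption FROM photo_metadata "
        "WHERE date_taken LIKE ? ORDER BY date_taken",
        (f"{year}-%",),
    )
    return rows.fetchall()


# PRESENT: show the search results to the user.
for file_name, caption in search_by_year(db, 2023):
    print(f"{file_name}: {caption}")
```

If the user downloaded IMG_0001.jpg at this point, the caption would stay behind in the database unless the platform also embeds it.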
The metadata journey up to this point largely determines how useful a photo platform might be for a typical consumer in the family history community. The features listed above have clear benefits most of us are quite familiar with. However, there are two critical steps left in this journey that are less common but essential for both long-term user benefit and digital preservation purposes.
- EXCHANGE: Some photo platforms allow users to import and/or export metadata in one of several typical file formats: xls, csv, json, etc. These files contain metadata that was saved to the database; they are linked to the photos but not embedded in them. Some platforms give the user the ability to authorize the release of metadata through a direct integration with another platform.
- EMBED: In this final step along the journey, a platform writes metadata from its database back into the photo file itself. Once the metadata is embedded, it travels with the photo whenever a user downloads it. This is the ultimate step for digital preservation.
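The difference between exchanging and embedding can be sketched in a few lines of Python. The example below is a simplified illustration under our own assumptions, not a reference implementation: export_sidecar and embed_xmp are hypothetical helper names, only a single caption field (dc:description) is written, and the XMP segment is inserted directly after the JPEG start-of-image marker. A production writer would place the segment after any existing JFIF or Exif segments and would replace an existing XMP packet instead of adding a second one; an established tool such as ExifTool handles those cases.

```python
import json
import struct
from xml.sax.saxutils import escape

# Identifier that marks a JPEG APP1 segment as carrying an XMP packet.
XMP_HEADER = b"http://ns.adobe.com/xap/1.0/\x00"


def export_sidecar(record):
    """EXCHANGE: serialize a database record as JSON, separate from the photo."""
    return json.dumps(record, indent=2, sort_keys=True)


def build_xmp_packet(description):
    """Return a minimal XMP packet carrying one dc:description caption."""
    xml = (
        '<?xpacket begin="\ufeff" id="W5M0MpCehiHzreSzNTczkc9d"?>'
        '<x:xmpmeta xmlns:x="adobe:ns:meta/">'
        '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
        '<rdf:Description rdf:about="" '
        'xmlns:dc="http://purl.org/dc/elements/1.1/">'
        "<dc:description><rdf:Alt>"
        f'<rdf:li xml:lang="x-default">{escape(description)}</rdf:li>'
        "</rdf:Alt></dc:description>"
        "</rdf:Description></rdf:RDF></x:xmpmeta>"
        '<?xpacket end="w"?>'
    )
    return xml.encode("utf-8")


def embed_xmp(jpeg_bytes, description):
    """EMBED: return a copy of the JPEG with an XMP APP1 segment added.

    Simplification: the segment goes right after the SOI marker, and any
    XMP segment already in the file is left in place.
    """
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    payload = XMP_HEADER + build_xmp_packet(description)
    length = len(payload) + 2  # the length field counts its own two bytes
    if length > 0xFFFF:
        raise ValueError("XMP packet too large for one APP1 segment")
    segment = b"\xff\xe1" + struct.pack(">H", length) + payload
    return jpeg_bytes[:2] + segment + jpeg_bytes[2:]
```

The sidecar produced by export_sidecar can be lost or separated from the photo. The bytes returned by embed_xmp carry the caption inside the file, like the note on the back of a print.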
The goal of the FHMWG is to grow the ecosystem of platforms that support embedded photo metadata.
Choose Your Own Adventure
If you are a vendor in the family history software industry, no matter how you choose to implement the FHMWG standards, you are helping to enhance the value of digital media for family history.
The simplest thing any vendor can do is retain metadata. This requires nothing more than keeping the original file metadata undisturbed after upload. Despite being the first and simplest step, it is absolutely critical to the preservation mission of the FHMWG and is the foundation for the rest of the journey.
The FHMWG recommendations for metadata fields become most meaningful in the read, present, edit, and search steps of the journey. The recommendations provide categories for grouping fields in existing standards like Dublin Core (DC), Adobe XMP (Extensible Metadata Platform), IPTC (International Press Telecommunications Council) and others. This mapping would be implemented in the following ways at each step:
- READ: the recommendations specify which fields should be read.
- PRESENT: whether your system is reading all the recommended fields or not, metadata can still be organized and presented to users under the categories specified in the recommendations.
- EDIT: the database schema can be organized using the recommended categories.
- SEARCH: search options and results can be organized using the recommended categories.
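As a rough illustration of how one category mapping could serve several of these steps, here is a small Python sketch. The category names below are borrowed from the "who, what, where, when" description of metadata earlier in this article and are placeholders, not the categories defined in the FHMWG recommendations. The property names are real Dublin Core, IPTC Extension, and XMP Exif properties, but their assignment to categories is our assumption for the sake of the example.

```python
# Placeholder categories mapped to XMP property names.
# The grouping is illustrative; consult the recommendations for the real one.
CATEGORY_FIELDS = {
    "who": ["Iptc4xmpExt:PersonInImage"],
    "what": ["dc:title", "dc:description"],
    "where": [
        "Iptc4xmpExt:LocationShown",
        "exif:GPSLatitude",
        "exif:GPSLongitude",
    ],
    "when": ["photoshop:DateCreated", "exif:DateTimeOriginal"],
}


def group_by_category(metadata):
    """Group a flat {property: value} dict under the categories above.

    Properties that are not in the mapping are left out, and categories
    with no values are omitted from the result.
    """
    grouped = {}
    for category, fields in CATEGORY_FIELDS.items():
        present = {name: metadata[name] for name in fields if name in metadata}
        if present:
            grouped[category] = present
    return grouped
```

The same mapping can decide which properties to read, label sections of a details panel, name groups of database columns, and define search filters, so a platform can adopt it one step at a time.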
The four steps above are independent: you do not need to implement all of them in order to implement any one. Providers hoping to implement the FHMWG recommendations might pick any of the steps as a starting point and expand from there.
By taking the first step you are joining a community dedicated to preserving family history. Our ultimate goal is to develop an ecosystem where metadata can be easily exchanged between photo platforms and embedded into photos as they journey to and from the internet.