Introducing Data Models for Human(itarian) Services

This was originally posted at Sarapis

Immediately after a disaster, information managers collect information about who is doing what, where, and turn it into “3W Reports.” While some groups have custom software for collecting this information, the most popular software tool for this work is the spreadsheet. Indeed, the spreadsheet is still the “lingua franca” of the humanitarian aid community, which is why UNOCHA’s Humanitarian Data Exchange project is designed to support people using this popular software tool.

After those critical first few days, nonprofits and government agencies often transition their efforts from ad hoc emergency relief and begin to provide more consistent “services” to the affected population.

The challenge of organizing this type of “humanitarian/human services” information is a bit different from the challenges associated with disaster-related 3W reports, and is similar to the work done by people who manage and maintain persistent nonprofit service directories. In the US, these types of providers are often called “211” because in many communities you can dial “211” to be connected to a call center with access to a directory of local nonprofit service information.

During the ongoing migrant crisis facing Europe, a number of volunteer technical communities (VTCs) in the Digital Humanitarian Network engaged in the work of managing data about these humanitarian services. They quickly realized they needed a shared template for this information so they could more easily merge data with their peers, and so that they wouldn’t have to reinvent the wheel during the next disaster.

Since spreadsheets are the most popular information management tool, the group decided to focus on creating a standard set of column headers for spreadsheets with the following criteria:

To create this shared data model, we analyzed a number of existing service data models, including:

  • Stand By Task Force’s services spreadsheet
  • Advisor.UNHCR services directory
  • Open Referral Human Service Data Standard (HSDS)

The first two data models came from the humanitarian sector and were relatively simple and easy to analyze. The third, Open Referral, comes from a US-based nonprofit service directory project that did not assume that spreadsheets would be an important medium for sharing and viewing data.

To effectively incorporate Open Referral into our analysis, we had to convert it into something that could be viewed in a single sheet of a spreadsheet (we call it “flat”). During the process we also made it compliant with the Humanitarian Exchange Language (HXL), which will enable Open Referral to collaborate more with the international humanitarian aid community on data standards work. Check out the Open Referral HSDS_flat sheet to see the work product.

We’re excited about the possibility that Open Referral will take this “flat” version under their wing and maintain it going forward.
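To illustrate what “flat” means here, the following is a minimal sketch, not the actual HSDS_flat conversion. It assumes a small nested structure (organizations containing services) and writes one row per service, with a row of HXL hashtags directly under the column headers, which is how HXL-tagged spreadsheets are laid out. The sample records, the column names, and every hashtag other than #service+type and #service+eligibility are illustrative assumptions rather than the tags used in the real sheet.

```python
import csv
import io

# Hypothetical nested records: each organization holds a list of services.
organizations = [
    {
        "name": "Example Aid Org",
        "services": [
            {"name": "Hot meals", "type": "food",
             "eligibility": "anyone", "location": "Central Station"},
            {"name": "Legal advice", "type": "legal",
             "eligibility": "asylum seekers", "location": "Community Center"},
        ],
    },
]

# (column header, HXL hashtag) pairs. Only #service+type and
# #service+eligibility come from the post; the rest are assumptions.
COLUMNS = [
    ("Organization", "#org+name"),
    ("Service", "#service+name"),
    ("Service type", "#service+type"),
    ("Eligibility", "#service+eligibility"),
    ("Location", "#loc+name"),
]


def flatten(orgs):
    """Return one flat row per service, repeating the organization name."""
    rows = []
    for org in orgs:
        for svc in org["services"]:
            rows.append([org["name"], svc["name"], svc["type"],
                         svc["eligibility"], svc["location"]])
    return rows


def to_flat_csv(orgs):
    """Write headers, then the HXL hashtag row, then the data rows."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow([header for header, _ in COLUMNS])
    writer.writerow([tag for _, tag in COLUMNS])
    writer.writerows(flatten(orgs))
    return buf.getvalue()


flat_csv = to_flat_csv(organizations)
print(flat_csv)
```

The trade-off of flattening is repetition: the organization name appears on every service row, but the whole dataset can be read and edited in a single sheet.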

Once we had a flat version of Open Referral, we could do some basic analysis of the three models to create a shared data model. You can learn about our process in our post “10 Steps to Create a Shared Data Model with Spreadsheets.”

The result of that work is what we’re calling the Humanitarian Service Data Model (HSDM). The following documents and resources (hopefully) make it useful to you and your organizations.

We hope the HSDM will be used by the various stakeholders who were involved in the process of making it, as well as other groups that routinely manage this type of data, such as:

  • member organizations of the Digital Humanitarian Network
  • grassroots groups that come together to collate information after disasters
  • big institutions like UNOCHA who maintain services datasets
  • software developers who make apps to organize and display service information

I hope that the community that came together to create the HSDM will continue to work together to create a taxonomy for #service+type (what the service does) and #service+eligibility (who the service is for). If and when that work is completed, digital humanitarians will be able to more easily create and share critical information about services available to people in need.

* Photo credits: John Englart (Takver)/Flickr CC-by-SA

Creating a Shared Data Model with a Spreadsheet

Over the last year, a number of clients have tasked me with bringing together datasets from many different sources. It seems many people and groups want to work more closely with their peers, not only to share and merge data resources, but also to arrive at a “shared data model” that they can all use to manage data in compatible ways going forward.

Since spreadsheets are, by far, the most popular data collection and management tool, using spreadsheets for this type of work is a no-brainer.

After doing this task a few times, I’ve gotten confident enough to document my process for taking a bunch of different spreadsheet data models and turning them into a single shared one.

Here is the 10-step process:

  1. Create a spreadsheet. The first column is for field labels. You can add additional columns for other information you’d like to analyze about each field, such as its data type, database name and/or reference taxonomy (e.g. HXL tag).
  2. Place the names of the data models you’ve selected to analyze in the column headers to the right of the field labels.
  3. List all the fields of the longest data model on the left side of the sheet under the “Field Label” heading.
  4. Place an “x” in every cell of that data model’s column to indicate that it contains all the fields listed in the left-hand column.

    This is a sheet comparing three different data models with a set of field labels and a “taxonomy convention”.
  5. Working left to right, move to the next data model and place an “x” in the row of each field it contains. If the data model has that field but uses a different label, place that label in the cell (4a). If it doesn’t have that field, leave the cell blank. Add any additional fields not in the first data model to the bottom of the Field Labels column (4b).

  6. Do the same thing for each of the remaining data models.
  7. Once you have all the data models documented in this way, then you can look and see what the most popular fields are by seeing which have the most “x”s. Drag those rows to the top, so the most popular fields are on the top, and the least popular fields are on the bottom. I like to color code them, so the most popular fields are one color (green), the moderately popular ones are another (yellow) and the least popular but still repeated fields are another (red).
  8. Once you have done all this, you should present it to your stakeholder community and ask them for feedback. Some good questions are: (a) If our data model were just the colored fields, would that be sufficient? Why or why not? What fields should we add or subtract? (b) Data model #1 uses label x for a field while data model #2 uses label y. What label should we use for this and why?

    Give people a “template” they can use to actually manage their data.
  9. Once people start engaging with these questions, lay out the emerging data model in a new sheet, horizontally in the first row. Call this sheet a “draft template”. Bring the color coding with it to make it easier for people to recognize that the models are the same. As people give feedback, make the changes to the “template” sheet while leaving the “comparison” sheet as a reference. Encourage people to make their comments directly in the cell they’re referencing.
  10. Once all comments have been addressed and everyone is feeling good about the template sheet, announce that the sheet is the “official proposal” for a shared data model/standard. Give people a deadline to make their comments and requests for changes. If no comments or changes are requested, congratulations: you have created a shared data model! Good luck getting people to use it. 😉
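For anyone who would rather script the comparison than drag rows by hand, steps 1 through 7 and the header row of the draft template from step 9 can be sketched roughly as below. This is only an illustration: the four models and their field labels are made up, it assumes equivalent fields have already been given the same label (the manual work of steps 5 and 6), and the color thresholds are one possible reading of “most popular”, “moderately popular” and “least popular but still repeated”.

```python
# Hypothetical data models: model name -> list of field labels.
models = {
    "Model A": ["name", "description", "address", "phone", "hours"],
    "Model B": ["name", "address", "phone", "languages"],
    "Model C": ["name", "description", "address"],
    "Model D": ["name", "address", "phone", "hours"],
}


def build_comparison(models):
    """Build the comparison sheet: field label -> {model name: "x" or ""}."""
    # Start with the longest model's fields, then append any fields
    # that only appear in the other models.
    by_length = sorted(models, key=lambda m: len(models[m]), reverse=True)
    fields = []
    for model in by_length:
        for field in models[model]:
            if field not in fields:
                fields.append(field)
    return {
        field: {m: ("x" if field in models[m] else "") for m in models}
        for field in fields
    }


def rank_fields(matrix):
    """Return (field, number of "x" marks) pairs, most popular first."""
    counts = [(f, sum(1 for v in row.values() if v == "x"))
              for f, row in matrix.items()]
    # sorted() is stable, so ties keep their original sheet order.
    return sorted(counts, key=lambda pair: -pair[1])


def color_for(count, total):
    """Assumed thresholds: every model = green, more than half = yellow,
    at least two = red, fields found in only one model stay uncolored."""
    if count == total:
        return "green"
    if count > total / 2:
        return "yellow"
    if count >= 2:
        return "red"
    return None


matrix = build_comparison(models)
ranked = rank_fields(matrix)
colors = {field: color_for(count, len(models)) for field, count in ranked}

# Header row of a draft template: the colored fields, most popular first.
template_header = [field for field, _ in ranked if colors[field] is not None]

for field, count in ranked:
    print(field, count, colors[field])
print(template_header)
```

The script only handles the counting and sorting. Deciding which labels mean the same thing, and which fields the community actually needs, is still the job of the feedback process in steps 8 through 10.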

Do you find yourself creating shared data models? Do you have other processes for making them? Did you try out this process and have some feedback? Is this documentation clear? Tell me what you’re thinking in the comments below.