Introducing Data Models for Human(itarian) Services

This was originally posted at Sarapis

Immediately after a disaster, information managers collect information about who is doing what, where, and turn it into “3W Reports.” While some groups have custom software for collecting this information, the most popular software tool for this work is the spreadsheet. Indeed, the spreadsheet is still the “lingua franca” of the humanitarian aid community, which is why UNOCHA’s Humanitarian Data Exchange project is designed to support people using this popular software tool.

After those critical first few days, nonprofits and government agencies often transition their efforts from ad hoc emergency relief and begin to provide more consistent “services” to the affected population.

The challenge of organizing this type of “humanitarian/human services” information is a bit different from the challenges associated with disaster-related 3W reports, and similar to the work done by people who manage and maintain persistent nonprofit service directories. In the US, these providers are often called “211” because dialing 211 in many communities connects you to a call center with access to a directory of local nonprofit services.

During the ongoing migrant crisis facing Europe, a number of volunteer technical communities (VTCs) in the Digital Humanitarian Network engaged in the work of managing data about these humanitarian services. They quickly realized they needed to come up with a shared template for this information so they could more easily merge data with their peers, and also so that during the next disaster, they didn’t have to reinvent the wheel all over again.

Since spreadsheets are the most popular information management tool, the group decided to focus on creating a standard set of column headers for spreadsheets.

To create this shared data model, we analyzed a number of existing service data models, including:

  • Stand By Task Force’s services spreadsheet
  • Advisor.UNHCR services directory
  • Open Referral Human Service Data Standard (HSDS)

The first two data models came from the humanitarian sector and were relatively simple and easy to analyze. The third, Open Referral, comes from a US-based nonprofit service directory project that did not assume that spreadsheets would be an important medium for sharing and viewing data.

To effectively incorporate Open Referral into our analysis, we had to convert it into something that could be viewed in a single sheet of a spreadsheet (we call it “flat”). During the process we also made it compliant with the Humanitarian Exchange Language (HXL), which will enable Open Referral to collaborate more with the international humanitarian aid community on data standards work. Check out the Open Referral HSDS_flat sheet to see the work product.
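To illustrate what “flattening” with HXL tags looks like, here is a minimal sketch: a nested service record is collapsed into a single spreadsheet row, with a row of HXL hashtags sitting directly beneath the human-readable headers. The headers, hashtags, and field names below are simplified examples for illustration, not the actual columns of the Open Referral HSDS_flat sheet.

```python
import csv
import io

# Illustrative headers and HXL hashtags -- not the real HSDS_flat columns.
headers = ["Organization", "Service name", "Description", "City"]
hxl_tags = ["#org+name", "#service+name", "#description", "#loc+city"]

# A nested, HSDS-style record: organization -> service -> location.
record = {
    "organization": {"name": "Example Relief Org"},
    "service": {"name": "Legal aid", "description": "Free legal advice"},
    "location": {"city": "Athens"},
}

def flatten(rec):
    """Collapse one nested record into a single spreadsheet row."""
    return [
        rec["organization"]["name"],
        rec["service"]["name"],
        rec["service"]["description"],
        rec["location"]["city"],
    ]

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(headers)    # human-readable header row
writer.writerow(hxl_tags)   # HXL hashtag row goes directly beneath it
writer.writerow(flatten(record))
print(buf.getvalue())
```

The key convention is that second row: HXL-aware tools skip the human-readable headers and read the hashtag row to understand what each column contains.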

We’re excited about the possibility that Open Referral will take this “flat” version under their wing and maintain it going forward.

Once we had a flat version of Open Referral, we could do some basic analysis of the three models to create a shared data model. You can learn about our process in our post “10 Steps to Create a Shared Data Model with Spreadsheets.”

The result of that work is what we’re calling the Humanitarian Service Data Model (HSDM). The following documents and resources (hopefully) make it useful to you and your organizations.

We hope the HSDM will be used by the various stakeholders who were involved in the process of making it, as well as other groups that routinely manage this type of data, such as:

  • member organizations of the Digital Humanitarian Network
  • grassroots groups that come together to collate information after disasters
  • big institutions like UNOCHA who maintain services datasets
  • software developers who make apps to organize and display service information

I hope that the community that came together to create the HSDM will continue to work together to create a taxonomy for #service+type (what the service does) and #service+eligibility (who the service is for). If and when that work is completed, digital humanitarians will be able to more easily create and share critical information about services available to people in need.

* Photo credits: John Englart (Takver)/Flickr CC-by-SA

Creating a Shared Data Model with a Spreadsheet

Over the last year, a number of clients have tasked me with bringing together datasets from many different sources. It seems many people and groups want to work more closely with their peers not only to share and merge data resources, but also to arrive at a “shared data model” they can all use to manage data in compatible ways going forward.

Since spreadsheets are, by far, the most popular data collection and management tool, using spreadsheets for this type of work is a no-brainer.

After doing this task a few times, I’ve gotten confident enough to document my process for taking a bunch of different spreadsheet data models and turning them into a single shared one.

Here is the 10-step process:

  1. Create a spreadsheet. The first column is for field labels. You can add additional columns for other information you’d like to analyze about each field, such as its data type, database name and/or reference taxonomies (e.g. HXL tag).
  2. Place the names of the data models you’ve selected to analyze in the column headers to the right of the field labels.
  3. List all the fields of the longest data model on the left side of the sheet under the “Field Label” heading.
  4. Place an “x” in each cell of that data model’s column to indicate it contains every field documented in the left-hand column.

    [Screenshot: a sheet comparing three different data models with a set of field labels and a “taxonomy convention”.]
  5. Working left to right, place an “x” to indicate when a data model contains a given field label. If the data model has that field but uses a different label, place that label in the cell instead. If it doesn’t have that field, leave the cell blank. Add any additional fields not in the first data model to the bottom of the Field Labels column.

  6. Do the same for the remaining data models.
  7. Once you have all the data models documented in this way, then you can look and see what the most popular fields are by seeing which have the most “x”s. Drag those rows to the top, so the most popular fields are on the top, and the least popular fields are on the bottom. I like to color code them, so the most popular fields are one color (green), the moderately popular ones are another (yellow) and the least popular but still repeated fields are another (red).
  8. Once you have done all this, you should present it to your stakeholder community and ask them for feedback. Some good questions are: (a) If our data model were just the colored fields, would that be sufficient? Why or why not? What fields should we add or subtract? (b) Data model #1 uses label x for a field while data model #2 uses label y. What label should we use for this and why?

    [Screenshot: give people a “template” they can use to actually manage their data.]
  9. Once people start engaging with these questions, lay out the emerging data model in a new sheet, horizontally in the first row. Call this sheet a “draft template”. Bring the color coding with it to make it easier for people to recognize that the models are the same. As people give feedback, make the changes to the “template” sheet while leaving the “comparison” sheet as a reference. Encourage people to make their comments directly in the cells they’re referencing.
  10. Once all comments have been addressed and everyone is feeling good about the template sheet, announce that it is the “official proposal” of a shared data model/standard. Give people a deadline to make their comments and requests for changes. If no comments/changes are requested – congratulations: you have created a shared data model! Good luck getting people to use it. 😉
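The core of steps 3 through 7 – building the comparison matrix and ranking fields by popularity – can be sketched in a few lines of code. The model names and field labels below are made up for illustration; they don’t come from the actual data models discussed above.

```python
# Toy data models: each maps a model name to its list of field labels.
models = {
    "Model A": ["name", "address", "phone", "services"],
    "Model B": ["name", "address", "email"],
    "Model C": ["name", "phone", "email", "hours"],
}

# Collect every field label, starting with the longest model and
# appending any extra fields to the bottom (steps 3-6).
field_labels = []
for fields in sorted(models.values(), key=len, reverse=True):
    for f in fields:
        if f not in field_labels:
            field_labels.append(f)

# Count the "x"s per row, then sort the most popular fields to the top (step 7).
popularity = {
    label: sum(label in fields for fields in models.values())
    for label in field_labels
}
ranked = sorted(field_labels, key=lambda label: -popularity[label])

# Print the comparison matrix, most popular fields first.
for label in ranked:
    marks = ["x" if label in models[m] else "" for m in models]
    print(f"{label:10} {marks} ({popularity[label]} of {len(models)})")
```

In a real exercise the spreadsheet itself is the right tool – the point of the sketch is just that the “shared core” of a data model falls out of a simple popularity count across the columns.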

Do you find yourself creating shared data models? Do you have other processes for making them? Did you try out this process and have some feedback? Is this documentation clear? Tell me what you’re thinking in the comments below.

DIY Databases are Coming

“The software revolution has given people access to countless specialized apps, but there’s one fundamental tool that almost all apps use that still remains out of reach of most non-programmers — the database.” AirTable.com on CrunchBase

Database technology is boring but immensely important. If you have ever been working on a spreadsheet and wanted to be able to click on the contents of a cell to get to another table of data (maybe the cell has a person’s name and you want to be able to click it to see their phone #, photo, email, etc), then you’ve wished for a DIY database.

I’ve been waiting for this technology for many years and am happy to report that it’s nearly arrived. Two startups are taking on the DIY database challenge from different sides:

[Screenshot: if you can make a spreadsheet you can make a map with Awesome-Table.]


Awesome-Table is a quick and easy tool for creating visualizations of data inside Google Sheets. It offers a variety of searchable, sortable, filterable views including tables, cards, maps and charts. They’re easy to embed, which makes them great for putting directory data onto websites. Here’s an awesome table visualization of worker coops in NYC.


AirTable is a quick and easy way to create tables that connect to and reference each other. This allows for multi-faceted systems you can travel through by clicking on entities. For example, you can define people in one table, organizations in another, and offices in a third, and then connect them all together so a user can browse a list of people, click on an individual’s organization, and then see all that organization’s information, including its many offices. Pretty useful!
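Under the hood, that click-through behavior is just linked tables with foreign keys. Here’s a minimal sketch of the people/organizations/offices example using Python’s built-in sqlite3 module; all the names and records are made up for illustration.

```python
import sqlite3

# Three linked tables: people and offices both reference organizations.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE organizations (
        id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE offices (
        id INTEGER PRIMARY KEY,
        org_id INTEGER REFERENCES organizations(id), city TEXT);
    CREATE TABLE people (
        id INTEGER PRIMARY KEY,
        org_id INTEGER REFERENCES organizations(id), name TEXT);
""")
db.execute("INSERT INTO organizations VALUES (1, 'Example Aid Org')")
db.executemany("INSERT INTO offices VALUES (?, ?, ?)",
               [(1, 1, "Berlin"), (2, 1, "Athens")])
db.execute("INSERT INTO people VALUES (1, 1, 'Jane Doe')")

# "Clicking" from a person to their organization and then to all of
# that organization's offices is a join across the linked tables.
rows = db.execute("""
    SELECT p.name, o.name, f.city
    FROM people p
    JOIN organizations o ON p.org_id = o.id
    JOIN offices f ON f.org_id = o.id
""").fetchall()
for person, org, city in rows:
    print(person, "->", org, "office in", city)
```

Tools like AirTable hide the SQL entirely – you define the link between tables by pointing and clicking – but the underlying relational structure is the same.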

The progress of these two startups leads me to believe we’re less than a year or two away from truly lightweight, easy to use, free of cost, DIY database building systems, and an open source one not too long after that.

The increasing accessibility of database technology has a lot of implications. The most obvious one is that it will enable people to build their own information management systems for common use cases like contact directories, CRM systems and other applications that just can’t be built with existing spreadsheet technology. This will make a wide variety of solutions more accessible to people – so if you want to start or run a business, manage a common information resource, or just organize personal information better, you’ll enjoy DIY databases very much.

More interesting to me is the implication that they can have for people trying to reform and democratize institutions.

If you spend time in the types of information management systems used by institutions big and small – whether government agencies like the sanitation department or educational ones like high schools and universities – you’ll quickly notice that many of their most useful and critical tools are nothing more than a set of data tables (directories) and visualizations of the data contained therein (searchable/filterable tables, cards and maps of that data).

[Screenshot: turn spreadsheets into searchable/filterable directories with Awesome-Table.]

These very rudimentary but widely used internal software systems not only define the information people within an institution can access and share, but also limit them to very specific workflows that are implicitly or explicitly defined in the software. Since workflows define the work people actually do, the people who control the workflows also control the workers.

If you want to change how an institution does things, you have to be able to change its information management systems. Since current database technology requires specialized software coding skills, changing these systems often turns into a bureaucratic nightmare filled with bottlenecks. First, a specific group of pre-approved people need to agree to design and fund a change, then another specific group of people need to program and implement the change, and yet another group is often tasked with training and supporting users who then have to use the updated system. That creates a lot of potential bottlenecks: executives who don’t know a change is needed or don’t care enough to fund the work; managers who don’t want to get innovated out of a job or don’t know how to design good software; technologists who don’t have the time to implement a change or don’t have the motivation to do the job right. With all those potential bottlenecks it’s easy to see why so many well funded institutions have such crappy software and archaic workflows.

When people try to improve institutions, they are often trying to improve workflows so more can get done with less time and resources. Unfortunately, the people who actually know what changes need to be made are rarely in a position to control the architecture of the databases they use to get things done.

With DIY databases, people within institutions can circumvent all these bottlenecks simply by making superior systems themselves. This can change a lot more than the type of information people have access to – it allows them to explore new ways of being productive. What they’ll inevitably discover, particularly if they’re in an institution that spends a lot of time managing information, is that they can do a better job of managing information than many of their bosses.

DIY databases are enabling the type of horizontal, bottom-up innovation essential not just for better-functioning institutions, but also for more democratic ones. Databases are the “means of production” for many information workers. When workers can build and own these tools themselves, they’ll have more ownership of their own work and take another big step toward being able to manage themselves.

Of course, as technology improves and creating your own databases becomes easy, the hard part will certainly become getting peers to use them. That’s a topic for another day.