How to Measure the Health of Your Data

February 18, 2020 by Melissa Crowe


Photo credit: Adrian Clark/Flickr

What two words scare any data analyst? Dirty data.

As governments ramp up data programs, they realize that managing their data is just as important as how they use it. After all, open data programs, performance management programs, finance dashboards, and other data uses all depend on the reliability of the data behind them.

Improving the health of an organization’s data comes down to two factors: metadata and data quality. Check out the methods government organizations across the U.S. have developed to help them monitor what’s going on with their data.

Data leaders in Chattanooga recognize that their open data program and their performance management measures depend on the reliability of their data. That’s why they created a dashboard to track the quality of datasets. This public-facing site helps them monitor workflows, data updates, and engagement.

The state of Maryland takes a similar approach with its Data Freshness Dashboard, which shows daily whether each dataset has been updated on its expected cycle. It gives users a quick way to see how current the data is.
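The core of a freshness check like Maryland's can be sketched in a few lines. This is a minimal illustration, not the dashboard's actual implementation; the dataset names, cycle lengths, and field names are all made up for the example.

```python
from datetime import date

# Illustrative update cycles: maximum days allowed between refreshes.
EXPECTED_CYCLES = {"daily": 1, "weekly": 7, "monthly": 31}

def is_fresh(last_updated, cycle, today=None):
    """Return True if the dataset was refreshed within its expected cycle."""
    today = today or date.today()
    return (today - last_updated).days <= EXPECTED_CYCLES[cycle]

# Hypothetical catalog entries with their last refresh dates.
datasets = [
    {"name": "crime-incidents", "last_updated": date(2020, 2, 17), "cycle": "daily"},
    {"name": "budget-summary", "last_updated": date(2019, 12, 1), "cycle": "monthly"},
]

for ds in datasets:
    status = "fresh" if is_fresh(ds["last_updated"], ds["cycle"], today=date(2020, 2, 18)) else "stale"
    print(f"{ds['name']}: {status}")
```

Running a check like this on a schedule and publishing the results is essentially what a freshness dashboard automates.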

Both approaches come down to smart use of metadata. Metadata is only useful if the software application and the people who use it can understand it. As an alternative to an ad hoc convention for describing data, using a metadata standard can enable a dataset to be organized with others and ensure data owners have a complete, standard set of information about each part of their data.
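A metadata standard becomes enforceable when it can be checked automatically. The sketch below shows one way to validate a record against a minimal standard; the required field names here are illustrative, not any particular government's actual schema.

```python
# Hypothetical minimal metadata standard: these field names are
# assumptions for illustration, not a real published schema.
REQUIRED_FIELDS = {"title", "description", "publisher", "contact_email", "update_frequency"}

def validate_metadata(record):
    """Return the set of required fields that are missing or empty."""
    present = {key for key, value in record.items() if value}
    return REQUIRED_FIELDS - present

record = {
    "title": "Building Permits",
    "description": "Permits issued by the city since 2010.",
    "publisher": "Department of Buildings",
    "contact_email": "",  # empty values count as missing
}

missing = validate_metadata(record)
print("Missing fields:", sorted(missing))
```

A data portal can run a check like this on every submission, rejecting datasets until their metadata is complete.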

How San Francisco Built Its Metadata Standards

The Wharf San Francisco 2014
Photo credit: Mobilus in Mobili/Flickr

San Francisco’s open data policy requires a metadata standard. That standard provides the information that should be included with published data to help users find data and understand it. The city went through an extensive process to research and develop its metadata standard.

One way San Francisco bridges the gap between helping users understand data and helping them use it is to require contact information for every dataset. Clear and specific standards ensure the city’s data can be effectively used and shared.

The city publishes dataset alerts that include, among other items, issues and discontinuation notices. It also publishes daily analytics on its data portal to show the top datasets, referrers, search terms, and embeds.

How New York State Built Its Metadata Standards

Photo credit: Yoann Jezequel/Flickr

The metadata elements for the state of New York’s Open Data Handbook align with the Dublin Core and also include some additional elements.

Data.NY.com requires agencies to submit metadata and supplemental documentation with each dataset — for example, data dictionaries and overview documents — to ensure data are fully described. In addition to filling in the fields, agencies are instructed to develop a one-page document with a full explanation of the data collection process, and any limitations.
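Aligning with Dublin Core means describing each dataset with a shared set of elements. The record below is a rough illustration: the element names follow the Dublin Core Metadata Element Set, but the values are invented for this example and the structure is not New York's actual submission format.

```python
# Illustrative Dublin Core-style metadata record. Element names come
# from the Dublin Core standard; all values are made up.
dublin_core_record = {
    "dc:title": "Statewide Traffic Counts",
    "dc:description": "Annual average daily traffic by road segment.",
    "dc:creator": "Department of Transportation",
    "dc:date": "2020-02-18",
    "dc:subject": "Transportation",
    "dc:rights": "Public domain",
}

for element, value in dublin_core_record.items():
    print(f"{element}: {value}")
```

Because the element names are shared across catalogs, a record like this can be indexed and searched alongside datasets from other agencies.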

Open New York uses a common metadata taxonomy not only to maximize the public’s understanding and interpretation of the data, but also to increase the discoverability of high-value datasets.
