Data as hero

The hype over data has become deafening. Small non-profits are obsessed with leveraging big data, and foundations are on the hunt for the one evaluative metric to rule them all.

We need to take a breath.

Data offers some exciting opportunities in our sector. Predictive algorithms are remarkably good at figuring out when to sell me things I do not need. We can put the same predictive analysis to work for social good, like estimating the likelihood that a low-income teen will become homeless.
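To make that concrete, here is a minimal sketch of what such a risk model might look like. Everything here is invented for illustration: the indicators, the data, and the model choice (a plain logistic regression) are assumptions, not a description of any real program.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical training data: each row is a teen, each column an
# indicator an agency might already track (all values are synthetic).
rng = np.random.default_rng(0)
X = rng.random((500, 3))  # e.g., school absences, housing instability, income (scaled)
y = (X @ np.array([1.5, 2.0, -1.0]) + rng.normal(0, 0.5, 500) > 1.2).astype(int)

model = LogisticRegression().fit(X, y)

# The model outputs a probability, not a decision. A caseworker, not
# the machine, decides what to do with a high-risk flag.
new_teen = np.array([[0.8, 0.9, 0.2]])
print(f"Estimated risk: {model.predict_proba(new_teen)[0, 1]:.0%}")
```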

But data is not the be-all and end-all. It is not our savior, and machines do not (and should not) make decisions.

Data, regression, machine learning, etc. are all tools that we can (and should) be using. And while these tools can help illuminate the path forward, they are not the path forward.

The debate around data has become unnecessarily dichotomous and fractious. It should not be people versus data, man versus machine. There is no Skynet.

Those who argue that data is all that matters are just as wrong as those who say it is all about the people. In the social sector, it ought to be all about achieving social outcomes, and toward that end you need good people and good data.

Importantly, we need to do a better job of teaching our people the promise and pitfalls of data. As we look to incorporate more data into social sector decision making, our organizations need to be better integrated than having the so-called data nerds in one room and the social sector specialists in another.

If we are to become savvy users of data, it is essential that we raise the level of data literacy for all organizational decision makers.

As things stand now, with so much data-hype, it is far too easy for anyone with a spreadsheet to win an argument, even if they are wrong. Indeed, I often worry that my customers do not question my own analysis enough, or that they assume causation where there is none.

Data is no hero. But people can be. We are more likely to be heroes if we work with the data, but data alone will not save us from anything. Used incorrectly, it has the potential to make things worse.

The trouble with benchmarks

I just got back from the annual Independent Sector conference, which brings together non-profits, foundations, and self-promoting consultants (sorry about that) to discuss the direction of philanthropy.

The theme of the conference echoed that of the online philanthro-sphere, namely, most every session and discussion had something to do with data. I had some nice chats with folks about the emerging Markets for Good platform, attended a session on using communal indicators to drive collective impact, and heard one too many pitches about how this or that consulting firm had the evaluation game pretty much locked up.

What many of these conversations had in common was a focus on setting benchmarks to compare progress against. Benchmarking is a quick and dirty tool for trying to estimate an effect over time, but it can be misleading, a fact that was not discussed in any of the sessions I attended.

Benchmarking essentially requires one to measure an indicator at an initial point in time and to use that initial measure as the baseline against which future measurements are compared.

For example, a workforce development program might measure the percentage of people in its programs who found work in the last year, using that percentage as a baseline to compare future employment rates against. A year later, the program would look at this year’s employment rate and compare it against last year’s benchmark.

In this simplistic scenario, one might assume that if the employment rate is better this year than last year’s baseline, then the program is doing better, and if this year’s employment rate is below the baseline then it is doing worse. But, as the title of this post gives away, there are some things to consider when using baselines.
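In code, the whole exercise reduces to a subtraction. The numbers below are invented purely to illustrate the mechanics:

```python
# Hypothetical employment rates for a workforce development program.
baseline_rate = 0.62  # year 1: share of participants who found work
current_rate = 0.58   # year 2: same measure, one year later

change = current_rate - baseline_rate
verdict = "better" if change > 0 else "worse"
print(f"Change vs. baseline: {change:+.0%} -> program looks {verdict}")
# Change vs. baseline: -4% -> program looks worse
```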

As I have written in the past, the social sciences are particularly complex because so many external factors outside our program interventions affect the lives of those we aim to serve. In the employment baseline example, a worsening economy is likely to have a larger effect than the employment services themselves, all but assuring that next year’s employment rate will fall below the baseline, even if the program was more effective in its second year.

Under the collective impact benchmarking model, we would collectively flog ourselves for results outside of our control. We can also see the opposite effect, celebrating better outcomes against a previous benchmark when the upward swing is not attributable to our own efforts.

So, is benchmarking useless? No, but it should not be confused with impact. A mantra I preach to my customers, and one I repeated in many conversations at the Independent Sector conference, is that it is just as important to understand what your data does not say as what it does.

The simple difference between two outcomes from time A to time B is not necessarily program impact, and cannot automatically be attributed to our awesomeness.

Benchmarking tells us whether an outcome is higher or lower than it was in the previous period. But subtraction is not really analysis. The analysis is in teasing out the “why”. Why did an outcome go up or down? Was it the result of something we did, or of external factors? If the change is attributable to external factors, what are those factors?
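To see why the subtraction alone misleads, extend the hypothetical numbers from the sketch above and suppose the regional employment rate fell sharply in year two. All figures are invented, and the crude trend adjustment is only meant to show how the same raw difference can tell two different stories:

```python
# Same hypothetical program as above, now set against a regional trend.
baseline_rate = 0.62     # program employment rate, year 1
current_rate = 0.58      # program employment rate, year 2
regional_change = -0.09  # regional employment fell 9 points over the same year

naive_change = current_rate - baseline_rate       # all the benchmark sees
adjusted_change = naive_change - regional_change  # crude adjustment for the trend

print(f"Benchmark says: {naive_change:+.0%}")                  # -4%: looks worse
print(f"Against the regional trend: {adjusted_change:+.0%}")   # +5%: likely did better
# Even this adjustment is not a real impact estimate; it just shows that
# the raw difference cannot answer the "why" on its own.
```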

In short, benchmarks can help you figure out which questions to ask, but benchmarking itself does not provide many answers.

I’m encouraged by the buzz around using data and analytics, but I remain cautious: data is kind of like fire, and it’s important that we know what we are doing with it, lest we set ourselves ablaze.