Too many indicators means a whole lot of nothing

Organizations have a tendency to want to collect every data point under the sun. I cannot tell you how many agencies I have contracted with that aim to collect things like social security numbers and criminal histories when these data points carry no decision relevance, and don’t factor anywhere into the services they offer.

Even if organization executives are not concerned with putting those they serve through exhaustive questionnaires, they should be concerned about how overburdening front-line staff with administering lengthy intakes decreases data integrity. I have long advised my customers to keep their survey instruments short and to the point. The shorter your intake, the more likely you are to have every question answered. And if you are only asking a limited number of questions, every question should have been well thought out and be clearly relevant to decision making.

I’m in the process of working with some organizations to redesign their intake forms. One organization I’m working with was attempting to track over 300 indicators. Back in the original intake design phase, the thinking was (as is common) that the more data you have, the better. In hindsight, my customer realized that trying to collect so many indicators overlooked the implementation reality: it’s a lot easier to say what you want than to go out and get it.

The following histogram shows, for this particular organization over one year, the number of questions (y-axis) by the number of times each question was answered (x-axis). Half of the questions were answered about ten times, and one-third of the questions were never used at all.

To be clear, this is not a case where the front-line staff was not collecting any data at all. There were a handful of questions with around 3,000 answers, and a reasonable number with between 500 and 1,500 answers. The questions with the most answers were indicators that every front-line staffer found important, such as race and sex. The number of answers per question varies so greatly because, with so many questions to answer, no staffer was going to answer them all. Instead, each staff person used her or his own judgment as to which questions were important to answer.
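For readers who want to run the same check on their own intake data, here is a minimal sketch in Python of the tally behind a histogram like this one. It assumes each intake is stored as a dictionary mapping a question name to its answer. The function names, the sample records, and the `low_threshold` cutoff are all hypothetical and are not taken from this organization's system.

```python
def answer_counts(records, questions):
    """Count how many intake records hold a non-empty answer for each question."""
    counts = {q: 0 for q in questions}
    for record in records:
        for q in questions:
            value = record.get(q)
            # Treat missing keys, None, and blank strings as "not answered".
            if value is not None and str(value).strip() != "":
                counts[q] += 1
    return counts


def summarize(counts, low_threshold=10):
    """Flag questions that were never answered or answered only rarely."""
    total = len(counts)
    never = sorted(q for q, n in counts.items() if n == 0)
    rarely = sorted(q for q, n in counts.items() if 0 < n <= low_threshold)
    return {
        "never_answered": never,
        "rarely_answered": rarely,
        "share_never_answered": len(never) / total if total else 0.0,
    }


# Hypothetical sample data, for illustration only.
records = [
    {"race": "White", "sex": "F", "ssn": ""},
    {"race": "Black", "sex": "M"},
    {"race": "Asian", "sex": None, "criminal_history": "none"},
]
questions = ["race", "sex", "ssn", "criminal_history"]

counts = answer_counts(records, questions)
summary = summarize(counts, low_threshold=1)
print(counts)
print(summary)
```

Questions that land in the never-answered or rarely-answered lists are natural candidates to cut or rework when the intake form is redesigned.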

With so many holes in this data set, it’s hard to draw much insight. To avoid running into this problem, organizations should do two things. First, tie each question directly to an outcome in their impact theories. This discipline helps prevent “question-creep”, where new questions are asked out of curiosity rather than because the answers can inform action. Second, get front-line staff involved in the intake design process to ensure that all the data they need is being collected and that the questions, as worded, are practical to collect.