When I last visited the topic of Big Data (BD) and Analytics I proposed that Big Data could easily become a
wasteland for health providers and the next EHR boondoggle that could generate wads of cash for system
vendors. I noted that a large investment in Big Data could easily go for naught if we do not pay attention to at
least two key issues: employing bad data as a foundation, and blindly accepting analytics or
mathematical models that do not correctly represent your world.
I received several responses to that piece, some stating that I was opposed to Big Data and Analytics. Not
true. As a onetime practitioner of analytics, back when it was called operations research in commercial
industry, I saw firsthand the value of BD, but also the very large expense and pitfalls. At the close of my
first writing I promised to follow up with a list of safety checks you should employ to avoid drowning in the
big data ocean. Here they are.
1. Bad data. Big data and bad data do not mix. Before you jump in, you should get clear answers to
these questions. Do you thoroughly understand what is in your data? How old is it? Where and how
was it originally generated? What coding structures were used? How have the coding structures
changed over time? How many system conversions and mutations has the data gone through? What
is the consistency and integrity of your data?
Scrubbing your data, particularly if it goes back several years or spans different
information systems, is critical. A recent HISTalk piece written by Dan Raskin, MD covered this topic
well. If you can’t answer these questions before you apply analytics, then all the conclusions you
draw from your sophisticated analytics will rest on a foundation of quicksand. And be aware,
scrubbing historical data can be very time consuming and costly, which leads us to the next safety
check.
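Several of the questions in this first check can be asked of the data directly. The following is a minimal sketch, not a definitive scrubbing routine. The field names (patient_id, admit_date, dx_code, source_system), the sample records, and the idea of reducing each code to a "shape" are assumptions made for illustration only. It reports how old the data is, which fields are missing, which systems the records came from, and how the coding pattern shifts from year to year.

```python
from collections import Counter, defaultdict
from datetime import date


def code_shape(code):
    """Reduce a code to its pattern: digits become 9, letters become A."""
    return "".join(
        "9" if ch.isdigit() else "A" if ch.isalpha() else ch for ch in code
    )


def profile_records(records, required_fields):
    """Summarize age, completeness, origin, and coding patterns of a record set."""
    dates = [r["admit_date"] for r in records if r.get("admit_date")]
    missing = Counter()
    shapes_by_year = defaultdict(Counter)
    for r in records:
        # Count every required field that is absent or blank.
        for field in required_fields:
            if r.get(field) in (None, ""):
                missing[field] += 1
        # Track which code patterns appear in which year.
        if r.get("admit_date") and r.get("dx_code"):
            year = r["admit_date"].year
            shapes_by_year[year][code_shape(r["dx_code"])] += 1
    return {
        "oldest": min(dates) if dates else None,
        "newest": max(dates) if dates else None,
        "missing": dict(missing),
        "source_systems": sorted(
            {r["source_system"] for r in records if r.get("source_system")}
        ),
        "code_shapes_by_year": {
            y: dict(c) for y, c in sorted(shapes_by_year.items())
        },
    }


# Hypothetical sample records, invented for this sketch.
records = [
    {"patient_id": "p1", "admit_date": date(2009, 3, 2),
     "dx_code": "250.00", "source_system": "legacy_his"},
    {"patient_id": "p2", "admit_date": date(2009, 7, 9),
     "dx_code": "", "source_system": "legacy_his"},
    {"patient_id": "p3", "admit_date": date(2013, 1, 15),
     "dx_code": "E11.9", "source_system": "new_ehr"},
]
report = profile_records(
    records, ["patient_id", "admit_date", "dx_code", "source_system"]
)
```

A code shape that changes between years, or a source system you did not expect to see, is a hint that a conversion happened somewhere along the way and that the data needs a closer look before any model is built on it.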
2. Focus. Keep your focus as narrow as possible. When you jump in the BD ocean keep your eyes
on that floating life preserver. If you do not, you’ll get overwhelmed and sink fast. Most big data
projects will fail because they try to do too much, or are too broad in their goals, which leads to
loss of control, missed target dates and budget overruns. It’s very easy
to fall into this riptide. For example, with a sea of data at our disposal, we
surely should be able to predict census or institution-wide patient volumes
for the next five or ten years. The complexity of such an analytical model
could easily overwhelm. As an alternative, try something more restricted and
focused. For example, maybe just trying to predict volumes of a narrow
specialty practice, or identifying the three primary causes of re-admits. With
a narrow focus the probability of your model being useful will be far greater,
which takes us to our next safety check.
3. Validate your model. Run simulations against past time periods with known outcomes. Did you
get the answer you expected? If not, revise or replace the algorithm(s). Smaller models are easier to
validate. Apply basic common sense to any prediction. Remember, the end user, usually an
executive or physician group, must buy in to the model logic and have full trust in the data before
they can accept any predictions. If they do not understand it, they will not trust the forecasts and
the model will never be used. Once smaller models are validated, you can link multiple ones together
to create larger organization-wide models.
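Running a model against past periods with known outcomes can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a recommended forecasting method. The moving-average forecaster is only a placeholder for whatever model you actually build, and the monthly visit counts are invented. The sketch walks forward through history, forecasts each period using only the data that came before it, and reports the average percentage miss.

```python
def moving_average_forecast(history, window=3):
    """Placeholder model: predict the next value as the mean of the last few."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def backtest(series, forecast_fn, min_history=3):
    """Forecast each past period from earlier data only and average the error.

    Returns the mean absolute percentage error as a fraction (0.10 means 10%).
    Assumes every actual value is non-zero.
    """
    errors = []
    for t in range(min_history, len(series)):
        predicted = forecast_fn(series[:t])  # the model never sees period t
        actual = series[t]
        errors.append(abs(predicted - actual) / actual)
    return sum(errors) / len(errors)


# Hypothetical monthly visit counts for a narrow specialty practice.
monthly_visits = [100, 110, 105, 120, 115, 125]
mape = backtest(monthly_visits, moving_average_forecast)
```

If the average miss is larger than the end users can live with, that is the signal to revise or replace the algorithm before anyone is asked to trust its forecasts.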
4. Change can sink your analytics. One of the primary reasons to apply models to big data is to
predict change, then use that new knowledge to deal with the change before it becomes a problem.
Unfortunately, there are some changes that your historical big data can’t predict. You need to
understand them and factor them into any decisions you make. For example, can your model
anticipate changes within the practice of medicine? Medical protocols change almost every month
due to new research and new technologies. Hardly a week goes by without news of a new
protocol for medications, diagnostic testing, or chronic disease management. Your ocean of big
data cannot predict these changes, and yet if you are planning a new medical service you need to
somehow factor in these elements.
Another very unpredictable element is government regulations. A good
deal of industry change will be driven by which party wins each election.
Today it’s MU, ACOs, P4P, value-based purchasing, and many other
regulations that did not exist five years ago. Tomorrow it will be something
else. If you can predict those changes you probably would do better
in another profession. The analytics and models you build will only
reflect past practices and governmental policies and, like they say on
Wall Street, past performance may not be indicative of future results.
In model building these are known as ad-hoc or exogenous variables.
You take the model’s output then make a one-time ‘swag’ adjustment to reflect your best guess for
5. Pick the low hanging fruit first. There are two major kinds of analytics: strategic models and
operational models. Strategic analytics try to predict enterprise-wide outcomes and volumes five to
ten years out. They focus on questions such as: What are the population trends in our market?
What patient programs should we be moving towards? Can they be financially viable? Where should
they be located? What are the competitive factors?
Operational models deal with more immediate issues, such as: How can we handle higher patient
volumes using fewer resources? What can we do to reduce re-admits? What is the ROI on a large
capital investment? They are by nature near term and usually address efficiency questions.
Due to their complexity and time horizon, strategic analytics are tough to measure in terms of
efficacy. Operational models are far easier to measure, while strategic models are more ‘sexy’ and
costlier to build. Until you have had repeated good results with operational models, you should stay
away from strategic models. The low hanging fruit is in operational analytics. Moreover, there are
myriad operational models that could quickly generate real ROI and may only require ‘little data’.
6. Paralysis by analysis. You could spend a long time drifting in the big data ocean, and paralysis
by analysis could easily set in. Remember, there will always be flaws in your historical data, and no
model can be perfect, so do not let perfection become the enemy of good. This is not an academic
exercise and you do not have an unlimited budget. All analytics need to be improved, so do it
incrementally. Lastly, if after many iterations and revisions, and based on your real-life experience,
the model still does not make sense to you, toss it out and move on.
7. Educate and understand. What problems are you really trying to solve? Many organizations
waste time and money building models for problems they really do not have or understand. Due to
‘hype’, department managers come to believe the model will fix operational problems. Department
managers need to be trained in how to use and interpret these powerful tools. Understand what the
tool can and can’t do, and what the real limitations of the model are. This step must come first or
analytics projects can easily run amok.
If you use outside resources, make sure they understand the health care industry and your particular
venue. Being expert in quantitative tools is not enough; having a sound footing in the complex
relationships that drive the delivery of patient care is critical to the success of employing analytical
The annual budget is an excellent example of an operational model. Before you jump into BD, take this
test. How effective is your organization at budgeting? How close do you routinely come to hitting budget
targets? Have you used variable budgeting successfully? If you can’t answer these questions positively,
you are not ready to swim in the BD ocean. Big data and analytics can be powerful tools when used with
foresight and care. Applying BD without clearly identifying your objectives, knowing the
weaknesses of your data, and understanding the limits of mathematical modeling or analytical tools will
be a costly and fruitless exercise.
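The budgeting test above can be made concrete with a little arithmetic. What follows is a rough sketch only. The budget figures are invented, and the 5% tolerance is an assumption chosen for illustration, not a standard. It computes how far each period's actual landed from budget and what share of periods came in within the tolerance.

```python
def budget_variances(budgeted, actual):
    """Return (actual - budget) / budget for each period, as a fraction."""
    return [(a - b) / b for b, a in zip(budgeted, actual)]


def share_within_tolerance(variances, tolerance=0.05):
    """Share of periods whose variance is within +/- tolerance of budget."""
    hits = sum(1 for v in variances if abs(v) <= tolerance)
    return hits / len(variances)


# Hypothetical quarterly figures, invented for this sketch.
budgeted = [1000.0, 1000.0, 1200.0, 1100.0]
actual = [1020.0, 1150.0, 1180.0, 1090.0]

variances = budget_variances(budgeted, actual)
hit_rate = share_within_tolerance(variances)  # assumed 5% tolerance
```

If an organization cannot track a simple measure like this from period to period, that is a reasonable sign it is not yet ready for larger models.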
This article first published in HISTalk 11/21/2013
The Kelzon Group
All rights reserved
Seven Safety Checks before You Dive into the Big Data Ocean
Meet the author, Mr. Poggio at HIMSS2015 in Chicago and hear his presentation:
"Seven Safety Checks before You Dive into the Big Data Ocean"
To be presented April 15, 2015 - 2:30PM HIMSS Education Session #181