Big Data –

                      The Next HIT/EMR Boondoggle?

Here we are on the back side of the HITECH wave. EMR vendors can see that the government-sponsored
manna will soon end, so IT marketers have been prospecting for the next gold mine.
They found it and it’s called “Big Data and Analytics”. It really makes perfect sense. After they
install the deca-million dollar EMR systems that capture and track mountains of health care
operational data and send it to the government, what else can they do with it? Analyze it!
Analysis for clinical and administrative purposes, analysis for planning, analysis for diagnosis, for
prognosis, for best practices, for financial management, growth strategies, market penetration,
and more. To paraphrase an old cliché, “there’s got to be a pony in all that data somewhere”.

It is often repeated, and rarely challenged, that health care providers are a decade behind
commercial industry when it comes to IT tools and implementation. That clearly is the case when
we look at Big Data (BD). But before healthcare jumps into this numerical ocean, maybe we can
learn something from commercial industry and bypass many of the hurdles and errors private
industry hit during its initial foray into the world of analytics.  

First, a little history. Today it’s called Big Data and Analytics, but in the 1960s it was called
Operations Research, Management Science or Quantitative Analysis. Operations research was
actually an outgrowth of WWII. The U.S. military asked mathematicians to identify more
effective and efficient methods for bombing the enemy. The British used modeling and data
analysis to improve anti-submarine warfare.

After the war, these sophisticated mathematical tools were applied to volumes of operational data
captured by many business transaction systems of the 1970s. The focus was on optimizing
production and improving forecasting in order to reduce the risk embedded in strategic decision
making. The former used mathematical models such as linear programming and queuing theory
aimed at maximizing throughput and minimizing costs. The latter was typically done with
regression analysis, probabilistic models, and Monte Carlo simulation to assess and minimize risk.
In the 1980s and ’90s more sophisticated tools such as random walks, chaos theory and fuzzy
logic were developed and applied to financial and other business problems.

Today the thinking in health care is that, with our ever-expanding sea of Big Data, we should start
applying these same tools to help address the health care cost crisis. Not a bad idea. But
before we spend billions searching for our ‘pony’, we should at least learn something from the
sins of our brothers in the commercial world. During the ’70s and ’80s, commercial industry spent
billions trying to apply these concepts with only marginal benefit. It has only been in the last ten or
fifteen years that analytics in commercial industry has really paid off with leaps in improved
logistics and productivity, while the jury is still out on management, strategic and predictive
applications. It took decades for commercial industry to see measurable benefits from BD. Here
are two of the reasons why, and their implications for health care.

Bad or insufficient data.  Thirty years ago when commercial firms crunched big wads of financial
data they found that there were significant problems correlating econometric data with accounting
data and more so with tax/government data. Earlier in my career I worked for GE in one of their
OR groups and we found that merging or correlating the data originally captured for the different
audiences produced unusable results. Much time and effort had to be spent reclassifying financial
data to make it sync properly with econometric and government data. In addition, we came to
realize that volume and statistical data not captured at the source was fraught with errors and
misclassifications. Thousands of hours were spent normalizing, scrubbing and disaggregating data
before we could make reliable correlations for decision making.
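
To make the reclassification problem concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the account codes, the category names and the amounts are hypothetical, and a real crosswalk would run to thousands of entries. The idea is simply that two sources captured for different audiences have to be mapped onto one shared scheme before they can be compared, and that anything without a mapping has to be set aside for a person to review.

```python
from collections import defaultdict

# Hypothetical crosswalks: each source system's own code -> a shared category.
# The codes and categories are invented for illustration only.
ACCOUNTING_TO_COMMON = {"4100": "labor", "4200": "labor", "5100": "supplies"}
ECONOMETRIC_TO_COMMON = {"WAGES": "labor", "MATERIALS": "supplies"}


def normalize(records, crosswalk):
    """Sum amounts by shared category and collect records with no mapping."""
    totals = defaultdict(float)
    unmapped = []
    for code, amount in records:
        common = crosswalk.get(code)
        if common is None:
            # No known mapping: set the record aside instead of guessing.
            unmapped.append((code, amount))
        else:
            totals[common] += amount
    return dict(totals), unmapped


# Invented sample data from two systems built for different audiences.
accounting = [("4100", 120.0), ("4200", 30.0), ("5100", 45.0), ("9999", 10.0)]
econometric = [("WAGES", 150.0), ("MATERIALS", 50.0)]

acct_totals, acct_unmapped = normalize(accounting, ACCOUNTING_TO_COMMON)
econ_totals, econ_unmapped = normalize(econometric, ECONOMETRIC_TO_COMMON)

# Only after both sources share one scheme can they be compared line by line.
differences = {
    category: acct_totals.get(category, 0.0) - econ_totals.get(category, 0.0)
    for category in sorted(set(acct_totals) | set(econ_totals))
}

print("differences by category:", differences)
print("accounting records needing review:", acct_unmapped)
```

The code is the easy part. The thousands of hours go into building and maintaining the crosswalk tables and chasing down the unmapped records.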

Healthcare has some very similar challenges. The issue with accounting data versus econometric
data is the same, but the disparities between reimbursement data (tax) and business operation or
econometric data are far greater. As an example, commercial industry had to invest billions in
sophisticated product/service costing systems, while today in healthcare many institutions still
rely on Medicare cost analyses, which any financial manager would classify as nothing better than
gross approximations. Many of the BD analytics will incorporate and be driven by cost
comparisons. Medicare cost analysis is a long way from a true product/service cost accounting
system.

Merging clinical data and financial data is currently the rage, but another big ‘hole’ will be using
billing documents, charges or RVUs as a basis for analysis. Provider charges are not related to
service cost because they have been warped by decades of government policy and payment
nuances. They are as far from financial reality as we are from the Sun. In addition, the coding and
classifications embedded in billing documents have been twisted to meet the objectives of payors
and payment rules. Everybody agrees that ICD-9 coding is inadequate, if not inaccurate, yet no
doubt it will be a core element in many of the BD analytics clinical/financial models.

Reality versus the ‘model’. After several decades commercial firms came to realize that many of
the mathematical models they employed only loosely fit the real world. Models are far simpler
representations of the real world and typically model builders fill in the blanks and more complex
parts with assumptions.

The real world keeps changing. Yet many of the predictive tools we use such as regression
analysis are based heavily on past performance and have limited ability to reflect change.
Medicine is in constant change. Hardly a week goes by without a new research report that retires
an old protocol and replaces it with a new one, while new drugs, modalities and technologies are
introduced almost every day.
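
A toy example shows how a trend line fitted to the past misses a change. The sketch below is plain Python with invented numbers: a measure that rose steadily for six periods and then, after some change in practice, started to fall. The series, the timing of the change and the variable names are all assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented history: the measure grows by 2 units per period.
past_periods = [0, 1, 2, 3, 4, 5]
past_values = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
slope, intercept = fit_line(past_periods, past_values)

# Invented future: after period 5 something changes and the measure declines.
future_periods = [6, 7, 8]
actual_values = [19.0, 18.0, 17.0]

# The fitted line knows nothing about the change, so it keeps climbing.
predicted_values = [intercept + slope * x for x in future_periods]
errors = [p - a for p, a in zip(predicted_values, actual_values)]

for period, predicted, actual, error in zip(
    future_periods, predicted_values, actual_values, errors
):
    print(f"period {period}: predicted {predicted:.1f}, actual {actual:.1f}, "
          f"error {error:.1f}")
```

The fit to the history is perfect, yet the error grows every period after the change. A better fit to the past does nothing to fix that.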

The practice of medicine is both science and art. It is difficult to properly model the science part,
let alone the art component. The same can be said for management: is it science or art? It took
decades and millions of dollars before commercial industry realized the limitations of many of the
predictive models they applied and how sensitive the predictions were to the underlying
assumptions. Correctly modeling the subjective judgment component of management and medical
decision making will be a very expensive task.
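
The sensitivity to assumptions is easy to demonstrate. The Python sketch below projects an invented starting cost of 100 forward ten years under three assumed growth rates. The rates, the horizon and the compounding formula are illustrative assumptions, not figures from any real model.

```python
def project(base, annual_growth, years):
    """Compound a starting value forward at a fixed, assumed growth rate."""
    return base * (1.0 + annual_growth) ** years


base_cost = 100.0                   # invented starting value
horizon_years = 10                  # invented planning horizon
assumed_rates = [0.03, 0.04, 0.05]  # three plausible-looking assumptions

projections = {
    rate: project(base_cost, rate, horizon_years) for rate in assumed_rates
}

# Gap between the highest and lowest projection, driven only by the assumption.
spread = projections[0.05] - projections[0.03]

for rate in assumed_rates:
    print(f"assumed growth {rate:.0%}: projected cost {projections[rate]:.1f}")
print(f"spread between 3% and 5% assumptions: {spread:.1f}")
```

A two-point difference in one assumed rate moves the ten-year answer by more than a quarter of the starting value, and a real model stacks dozens of such assumptions on top of one another.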

Clearly the old GIGO rule applies to Big Data as much as it applies to our day-to-day EMR
transaction systems. The significant difference will be in the investment needed for BD just to get
past level one GIGO. When we implement a transaction system we can see if it works effectively
or has bugs in a matter of days. With Big Data and Analytics, measuring the efficacy and impact
can take years, be very expensive, and turn into a financial boondoggle for vendors.

Next Up: Seven Safety Checks before diving into the Big Data ocean.

September 10, 2013
Frank Poggio
The Kelzon Group
Copyright 2013, All Rights Reserved
Originally Published in HISTalk 09/10/13
Meet the author, Mr. Poggio, at HIMSS2015 in Chicago and hear his presentation:
            "Seven Safety Checks before You Dive into the Big Data Ocean"

            To be presented April 15, 2015 - 2:30PM                      HIMSS Education Session #181