Data development and analytics within commercial life sciences enterprises are moving from a dictatorship to a democracy, a shift with serious implications for enterprise data management. How does a life sciences CIO make sure that democratizing data will improve on the system it is replacing? You can’t stop it from happening, so how do you manage it?
Looking back at the data revolution can give us some useful perspective on how to handle the data democracy.
Agile Analytics Overturns Data Warehouses
The late ’90s through 2010 can be called the “data warehouse” era. Commercial systems and syndicated third-party data were integrated through tightly controlled technical processes to deliver bullet-proof information on results from sales reporting to distributor inventory levels. The data warehouse was predicated on starting with a known question, creating a data model for that question, and locking down the information once it was developed. The data warehouse was centrally controlled by IT and required substantial resources to modify.
The data warehouse delivered information that was revolutionary and new to the enterprise. But it could only take people so far. Sales, Marketing, and Operations started asking what was driving these results. What is causing sales to spike in this territory? How are patient treatment protocols affecting the use of my brand? What is causing inventory levels to rise? Even more important than understanding the drivers after the fact, executives wanted to get ahead of the curve and start proactively addressing positive and negative conditions before they affected operations.
Enter the era of agile analytics. Agile analytics starts with the data. What can it tell me? What are all the potential drivers? Which is the primary cause? What do I think will happen based on historical and current conditions? Agile analytics requires all of the data from all of the relevant data sources to create the best answer. Meanwhile, analysts across the enterprise are constantly generating new insights. They are creating their own applications, blurring the traditional lines between the technical expertise of IT and the business. And they are finding that agile analytics results do not always need to be 100% correct to be useful, because new data is constantly being added.
The data warehouse was not built for agile analytics. As people started asking for more data and access, the data warehouse groups pushed back. According to them, the required data had to be built into the data model and fully tested before it could be used. This meant that analysts would have to wait 9-12 months and cough up $1M+ before they could even start exploring their hypotheses. We can’t blame the analysts for starting to search for alternative means to reach the data, and more importantly, the answers they were looking for. Especially because they succeeded.
Data Democracy in Action
One life sciences marketing analytics VP ran into that situation. He could choose to work within the data warehouse dictatorship or seek alternatives. What did he have to lose? His executives were asking for information that he could not even begin to look at for twelve months. Facing a data warehouse group and IT department that grabbed everything having to do with a relational database, he turned to big data to investigate the truth behind the hype.
While it would take some specialized expertise that he would have to purchase through consulting, Hadoop had the capabilities he was looking for at a minimal cost, and it could be stood up in a rented, or hosted, environment. The total cost with consulting was one-fifth that of the initial data warehouse estimate, none of which was a capital expenditure. All of it could be funded from the department’s current year budget. If it failed, he could shut everything down and move on. And in 2-3 months, he could have an analytics data environment that could handle all the data he ever wanted, fully under his control, with a very low risk profile if it failed. This scenario is starting to play out in many departments across commercial life sciences organizations. Some call it Big Data.
Benefits and Dangers of Data Democratization
As with any change, there is the potential for good and bad outcomes. The good outcomes generally take care of themselves, including: new and better answers to business questions based on complete data; faster response times; and predictive analytics. The bad outcomes are where organizations really need to focus: large, redundant data repositories and analytic applications sprouting up across the organization and outside the company data center; inconsistent data quality; and many versions of the truth.
The data warehouse dictatorship, for all its shortcomings, runs a tight ship. No one can access data unless it is in the warehouse and that data has been fully vetted. Under a data democracy, everyone has the opportunity to create their own data repository. These are not your father’s “spreadmarts” that the data warehouse was designed to address. They are multi-terabyte data stores that can house hundreds of data sources and hundreds of thousands of data elements.
Some analysts will be diligent, but left to their own devices, most won’t. Data quality will vary and each analyst will have their own definition of the source data that they need. But this brings us back to why these data repositories are being created in the first place — there were overriding business needs not being addressed. As we learned, the data warehouse’s limited scope and flexibility cannot meet all of those demands. For life sciences, the time for Data Democracy is now.
Rich Sokolosky is Partner, Life Sciences Practice Leader for NewVantage Partners, a consulting firm that provides expertise and guidance to Fortune 1000 business and technology executives who are seeking to leverage data and analytics to gain business insights and derive business value.