Big Data is the Future of Healthcare



Cognizant 20-20 Insights | September 2012

With big data poised to change the healthcare ecosystem, organizations need to devote time and resources to understanding this phenomenon and realizing the envisioned benefits.

Executive Summary

Big data is already changing the way business decisions are made, and it's still early in the game. However, because big data exceeds the capacity and capabilities of conventional storage, reporting and analytics systems, it demands new problem-solving approaches. With the convergence of powerful computing, advanced database technologies, wireless data, mobility and social networking, it is now possible to bring together and process big data in many profitable ways.

Big data solutions attempt to cost-effectively solve the challenges of large and fast-growing data volumes and realize their potential analytical value. For instance, trend analytics allow you to figure out what happened, while root cause and predictive analytics enable understanding of why it happened and what is likely to happen in the future. Meanwhile, opportunity and innovative analytics can be applied to identifying opportunities and improving the future.

All healthcare constituents (members, payers, providers, groups, researchers, governments, etc.) will be impacted by big data, which can predict how these players are likely to behave, encourage desirable behavior and minimize less desirable behavior. These applications of big data can be tested, refined and optimized quickly and inexpensively and will radically change healthcare delivery and research. Leveraging big data will certainly be part of the solution to controlling spiraling healthcare costs.

Simply by witnessing how big data has transformed consumer IT, it is clear that the promise of big data in healthcare is immense (think Google, Facebook and Apple's Siri, which all rely on processing and transmitting massive amounts of data). While its potential in healthcare has not been fulfilled, the question is not if, but when.

This white paper will define big data, explore the opportunities and challenges it poses for healthcare, and recommend solutions and technologies that will help the healthcare industry take full advantage of this burgeoning trend.

What Is Big Data?

A large amount of data becomes "big data" when it meets three criteria: volume, variety and velocity (see Figure 1). Here is a look at all three:

• Volume: Big data means there is a lot of data: terabytes or even petabytes (1,000 terabytes). This is perhaps the most immediate challenge of big data, as it requires scalable storage and support for complex, distributed queries across multiple data sources. While many organizations already have the basic capacity to store large volumes of data, the challenge is being able to identify, locate, analyze and aggregate specific pieces of data in a vast, partially structured data set.

• Variety: Big data is an aggregation of many types of data, both structured and unstructured, including multimedia, social media, blogs, Web server logs, financial transactions, GPS and RFID tracking information, audio/video streams and Web content. While standard techniques and technologies exist to deal with large volumes of structured data, it becomes a significant challenge to analyze and process a large amount of highly variable data and turn it into actionable information. But this is also where the potential of big data lies, as effective analytics allow you to make better decisions and realize opportunities that would not otherwise exist.

Figure 1: What Big Data Looks Like

The world's information is doubling every two years, with a colossal 1.8 zettabytes to be created and replicated in 2011. New information being created in 2011 also includes replicated information such as shared documents or duplicated DVDs. In terms of sheer volume, 1.8 ZB of data is equivalent to:

• Every person in the United States tweeting 3 tweets per minute (4,320 tweets per day per person) for 26,976 years non-stop.
• Over 200 billion HD movies, each 120 minutes long. It would take one person 47 million years of 24/7 viewing to watch every movie.

Storing 1.8 ZB of information would take 57.5 billion 32 GB Apple iPads. With that many iPads we could build a mountain of iPads that is 25 times higher than Mount Fuji (Mount Fuji: 3,776 meters; Mount iPad: 94,400 meters).

Source: "Extracting Value from Chaos," IDC Universe study, 2011.
• Velocity: While traditional data warehouse analytics tend to be based on periodic (daily, weekly or monthly) loads and updates of data, big data is processed and analyzed in real or near-real time. This is important in healthcare for areas such as clinical decision support, where access to up-to-date information is vital for correct and timely decision-making and elimination of errors. Current data is needed to support automated decision-making; after all, you can't use five-minute-old data to cross a busy street. Without current data, automated decisions cannot be trusted, forcing expensive and time-consuming manual reviews of each decision.

Big Data = Big Opportunities

Big data has many implications for patients, providers, researchers, payers and other healthcare constituents. It will impact how these players engage with the healthcare ecosystem, especially when external data, regionalization, globalization, mobility and social networking are involved (see Figure 2).

Figure 2: Sectors Positioned for Greater Gains from Big Data

[Bubble chart: historical productivity growth in the U.S., 2000-2008 (percent growth), plotted against a big data value potential index (low to high). Sectors shown: computer and electronic products; manufacturing; information services; administration, support and waste management; wholesale trade; transportation and warehousing; real estate and rental; professional services; finance and insurance; healthcare providers; utilities; government; retail trade; accommodation and food; arts and entertainment; management of companies; natural resources; other services; educational services; construction. Bubble sizes denote relative size of GDP. Clusters A through E reflect big data scale as measured by industry segment.]

Source: U.S. Bureau of Labor Statistics; McKinsey Global Institute

Bringing the Patient into the Loop

The healthcare model is undergoing an inversion. In the old model, facilities and other providers had an incentive to keep patients in treatment; that is, more inpatient days translated to more revenue. The trend with new models, including accountable care organizations (ACO), is to reward and compensate providers for keeping patients healthy.

At the same time, patients are increasingly demanding information about their healthcare options so that they understand their choices and can participate in decisions about their care. Patients are also an important element in keeping healthcare costs down and improving outcomes. Providing patients with accurate and up-to-date information and guidance rather than just data will help them make better decisions and better adhere to treatment programs.

In addition to data that is readily available, such as demographics and medical history, another data source is information that patients divulge about themselves. When combined with outcomes, high-quality data provided by patients can become a valuable source of information for researchers and others looking to reduce costs, boost outcomes and improve treatment. Several challenges exist with self-reported data:

• Accuracy: People tend to understate their weight and the degree to which they engage in negative behaviors such as smoking; meanwhile, they tend to overstate positive behaviors, such as exercise. These inaccuracies can be accounted for by adjusting for the known biases, and big data processing can improve accuracy over time.

• Privacy concerns: People are generally reluctant to divulge information about themselves because of privacy and other concerns. Creative ways need to be found to encourage and reward them for doing so without adversely impacting data quality. Effective mechanisms and assurances must be put into place to ensure the privacy of the data that patients submit, including de-identification prior to external access.

• Consistency: Standards need to be defined and implemented to promote consistency in self-reported data across the healthcare system to eliminate local discrepancies and increase the usefulness of data. Usage guidelines follow standards.

• Facility: Mechanisms based on e-health and m-health, such as mobility and social networking, need to be creatively employed to ease members' ability to self-report. Providing access to some de-identified data can simultaneously improve levels of self-reporting as a community develops among members.

Improving Quality with External Data

As progress is made toward initiatives such as electronic health records (EHR), more and more external data will become available, and this will become an integration challenge. External sources include the Nationwide Health Information Network (NHIN), health information exchanges (HIE), health information organizations (HIO) and regional health information organizations (RHIO). As sources and volume of information increase, so will expectations.

In addition to integrating data within the healthcare system, there are many potential benefits of integrating data from outside of the healthcare system. While integrating external data poses similar challenges to integrating internal data, there are also additional challenges, such as privacy, security and legal concerns, as well as questions about authenticity, accuracy and consistency.

As an example, external data about healthy people holds immense potential value for research and the future delivery of healthcare. Typical healthcare data includes only people visiting doctors and hospitals, which biases that data toward people seeking treatment. Adding anonymous data from large numbers of healthy people could help establish baselines, draw correlations and help with understanding the nature of illnesses. More data, effectively used, leads to better information and decisions, and more meaningful efforts.

Implications of Regionalization, Globalization

External data will come from different medical systems in various regions and countries. Effectively working across these disparate data repositories can help identify local knowledge and best practices and leverage them regionally and globally. Aggregating data regionally and globally also provides healthcare researchers with larger populations for clinical studies, trending and disease monitoring for epidemics, as well as early detection and the potential for improved results.

As data becomes less local and more regional and global, the quality of both data and metadata will improve over time as a result of increased data scrutiny and the efforts and contributions of big data innovators across the broader healthcare data ecosystem. At the same time, sharing data on a global basis will lead to security challenges, as well as issues resulting from different standards, terminology and language barriers.

Information Demands Drive Mobility

In many domains, mobility is a solution looking for a problem. Big data changes that. Demand for ubiquitous access to information mandates mobility and other technologies that provide access on demand. As data becomes more current, it will be necessary to get information into the hands of people with an immediate need for it, such as for clinical decision support. Users will also demand access to this data so they have precise and complete information to make the best possible healthcare decisions. Quality of care and improved outcomes will be the ultimate benefits.
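Sharing self-reported and external data across organizations and regions, as the sections above describe, depends on the de-identification step mentioned under privacy concerns. The Python sketch below is one illustrative way to approach that step, not a compliance-grade method: it drops direct identifiers, replaces the patient ID with a keyed hash, and coarsens the birth date and postal code. The field names (patient_id, name, birth_date, zip_code and so on), the deidentify function and the choice of HMAC-SHA256 are all assumptions made for this example; none of them comes from the white paper or from any particular standard.

```python
import hashlib
import hmac

# Hypothetical direct identifiers; a real schema would define its own list.
DIRECT_IDENTIFIERS = {"name", "email", "phone", "street_address"}


def deidentify(record: dict, secret_key: bytes) -> dict:
    """Return a de-identified copy of a patient record (illustrative only).

    Assumes the record has a "patient_id" field, an optional "birth_date"
    in YYYY-MM-DD form and an optional "zip_code" string.
    """
    # Drop fields that identify a person directly.
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

    # Replace the patient ID with a keyed hash so that records for the same
    # patient can still be linked without exposing the original ID.
    raw_id = str(record["patient_id"]).encode("utf-8")
    out["patient_id"] = hmac.new(secret_key, raw_id, hashlib.sha256).hexdigest()

    # Coarsen quasi-identifiers: keep only the birth year and a ZIP prefix.
    if "birth_date" in out:
        out["birth_year"] = int(str(out.pop("birth_date"))[:4])
    if "zip_code" in out:
        out["zip3"] = str(out.pop("zip_code"))[:3]

    return out


if __name__ == "__main__":
    sample = {
        "patient_id": 123,
        "name": "Jane Doe",
        "birth_date": "1970-05-17",
        "zip_code": "07666",
        "self_reported_weight_kg": 70,
    }
    print(deidentify(sample, secret_key=b"example-key-not-for-production"))
```

A keyed hash keeps the pseudonym stable across submissions, which preserves the ability to combine self-reported data with outcomes. Coarsened fields can still allow re-identification in small populations, so a real deployment would need review against the applicable privacy rules.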
Big Data, Social Media and Healthcare

Social media will increase communication between patients, providers and communities (e.g., patients with similar conditions and providers with similar specialties). This will not only work to globalize and democratize healthcare, but it is also a potentially important source of big data. Social networking data poses challenges such as volume, lack of structure and velocity, as well as challenges around integration and accuracy.

For example, if a group of patients is discussing a provider's quality of care, there will likely never be 100% consensus. Patient experiences will be different, and there will be biases based on accidents, misunderstandings and other factors. The challenge will be to create useful information out of this collection of data, such as provider ratings and improvement guidance.

Big Data = Big Challenges

The problem in healthcare isn't the lack of data but the lack of information that can be used to support decision-making, planning and strategy. As an example, a single patient stay generates thousands of data elements, including diagnoses, procedures, medications, medical supplies, lab results and billing. These need to be validated, processed and integrated into a large data source to enable meaningful analysis. Multiply this by all the patient stays across the system and combine with the large number of points where data is generated and stored, and the scope of the big data challenge begins to emerge. And this is only a small part of the healthcare data landscape.

Outlined below are some of the specific challenges of healthcare big data, including healthcare as a technology laggard, data fragmentation, security, standards and timeliness.

Healthcare as a Technology Laggard

Healthcare is notoriously slow to redefine and redesign processes and tends to be a laggard in adopting technology that impacts the healthcare system, outside of some specific areas such as care delivery and research. In addition, the healthcare technology landscape includes vast areas of legacy technology, causing further complications.

Fragmentation

In healthcare, big data challenges are compounded by the fragmentation and dispersion of data among the various stakeholders, including payers, providers, labs, ancillary vendors, data vendors, standards organizations, financial institutions and regulatory agencies. Solutions for big data will break the traditional model, in which all data is loaded into a warehouse. Data federation will emerge as a solution in which the big data architecture is based on a collection of nodes within and outside the enterprise, accessed through a layer that integrates the data.

The biggest obstacle to effective use of big data is the nature of healthcare information. Payers, providers, research centers and other constituents all have their own silos of data. These are fundamentally difficult to integrate because of concerns about privacy and propriety, the complex and fragmented nature of the data, the different schemas and standards underlying the data, and the lack of metadata within each silo. Even if everyone shared their data, there would be enough challenges integrating it within the silo, much less outside it.

Although groups such as HIE, RHIO and NHIN are working to facilitate the exchange of healthcare data, adoption has been slow, as they have faced numerous challenges.

Security

The entire healthcare system can realize benefits from democratizing big data access; for example, researchers can more easily collaborate, engage in peer review and eliminate duplication of effort. Researchers will also be able to more readily identify opportunities where they can contribute and collaborate.

The cloud makes exposing and sharing big data easy and relatively inexpensive. However, significant security and privacy concerns exist, including compliance with the Health Insurance Portability and Accountability Act (HIPAA). A credentialing process could facilitate and automate this access, but there are complexities and challenges. Since providers, patients and other interested parties such as researchers need secure access, data access should be controlled by group, role and function. Finally, the security of the data once it leaves the cloud also needs to be assured. Big data can itself be used to identify patterns and irregularities that indicate security threats and other types of fraud, and so help prevent them.

Standards

Dealing with the myriad of standards (and lack thereof) creates interoperability challenges, at least through the medium term. Big data solution architectures have to be flexible enough to cope with not only the additional sources but also the evolution of schemas and structures used for transporting and storing data. To ensure analytics are meaningful, accurate and suitable, metadata and semantic layers are needed that accurately define the data and provide business context and guidance, including appropriate and inappropriate uses of the data. This evolution of standards will eventually improve data quality.

Timeliness

Data timeliness is a challenge in various healthcare settings, such as clinical decision support, whether for making decisions or providing information that guides decisions. Big data can make decision support simpler, faster and ultimately more accurate because decisions are based on higher volumes of data that are more current and relevant. In some cases, there is a very limited window for clinical decision support, significantly smaller than the time it takes to run a report or analytic query. Careful attention to data and query structure, scope and execution is needed to ensure that the constraints of the processing windows are observed while still obtaining the best possible answer.

In other cases, streams of data containing complex and varied events without an overarching structure need to be mined. In this case, those events have to be turned into meaningful measures in real time that are, in turn, suitable for rapid analysis. In many cases, the only practical solution is to discard most of the data after analyzing it and selectively store the results. It's a tradeoff between the competitive advantage gained from the shorter feedback loop and the quality of the information that is being fed back. Capturing only processed data, streaming or otherwise, creates information at the cost of losing the underlying data. The underlying principle of big data is to keep everything, but in some cases that's just not practical or even useful; sometimes the hoarder reflex has to be checked and rational decisions made.

Big Data = Technology Choices

There are numerous technology solutions for dealing with big data, ranging from on-site to cloud and from open source to proprietary. On-site options that can tame big data include Teradata, Vertica (HP) and Netezza (IBM). All of these solutions tend to have low time to value and maintenance but relatively high total cost of ownership.

Cloud-hosted software as a service (SaaS) solutions can help reduce the barriers to participating in the big data arena. Google and Amazon implement MapReduce-based solutions to process huge datasets using a large number of computers (e.g., terabytes of data on thousands of computers). MapReduce algorithms take large problems and divide them into a set of discrete tasks that can then be distributed to a large number of computers for processing, with the results combined into a solution to the problem. Other cloud-based solutions include Tableau, which supports visualization.

Open-source Hadoop is a framework used by many companies as a high-performance, scalable and relatively low-cost option for dealing with big data. Training, professional services and support are needed to effectively deploy Hadoop solutions using the open source framework. Vendors such as Greenplum (a division of EMC), Microsoft, IBM and Oracle have commercialized Hadoop and aligned and integrated it with the rest of their database and analytic offerings.

SaaS is an important technology for democratizing the results of big data. SaaS-based solutions allow healthcare entities that control subsets of data to expose access through services that eliminate some of the aggregation and integration challenges. Additional services that facilitate analytics, both basic and advanced, can be made part of the overall offering.

Recommendations

To successfully identify and implement big data solutions and benefit from the value that big data can bring, healthcare organizations need to devote time and resources to visioning and planning. This will provide the foundation needed for strong execution. Without this preparation, organizations will not realize the envisioned benefits of big data and will risk being left behind by competitors.

Our recommendations for healthcare organizations looking to leverage big data include:

• Establish a business intelligence center of excellence with a focus on big data.
• Decide on an appropriate big data strategy based on the organization's current and target business and technological maturity and objectives.

• Assess the various big data initiatives that can be deployed to meet overall corporate objectives, focusing initially on quick wins.

• Work with a partner that understands the full range of big data technologies and implications, including trends, security, internal and external system integration, hosting and development platforms, and application and solution development.

About the Author

Bill Hamilton is a Principal with Cognizant Business Consulting's Healthcare Practice, with nearly 20 years of experience in management and IT consulting across various industries. Bill has extensive experience in health plan strategy, operations and program management in the areas of transformation, modernization, information management and regulatory compliance. Technical areas of expertise include enterprise system design and development, mobile and distributed computing, database and data warehouse design and development, service-oriented architecture, cloud computing and software development lifecycle management and governance. Bill has written seven books and published articles about software development and database technologies.

About Cognizant

Cognizant (NASDAQ: CTSH) is a leading provider of information technology, consulting, and business process outsourcing services, dedicated to helping the world's leading companies build stronger businesses. Headquartered in Teaneck, New Jersey (U.S.), Cognizant combines a passion for client satisfaction, technology innovation, deep industry and business process expertise, and a global, collaborative workforce that embodies the future of work. With over 50 delivery centers worldwide and approximately 145,200 employees as of June 30, 2012, Cognizant is a member of the NASDAQ-100, the S&P 500, the Forbes Global 2000, and the Fortune 500 and is ranked among the top performing and fastest growing companies in the world. Visit us online or follow us on Twitter: Cognizant.

World Headquarters
500 Frank W. Burr Blvd.
Teaneck, NJ 07666 USA
Phone: +1 201 801 0233
Fax: +1 201 801 0243
Toll Free: +1 888 937 3277

European Headquarters
1 Kingdom Street
Paddington Central
London W2 6BD
Phone: +44 (0) 20 7297 7600
Fax: +44 (0) 20 7121 0102

India Operations Headquarters
#5/535, Old Mahabalipuram Road
Okkiyam Pettai, Thoraipakkam
Chennai, 600 096 India
Phone: +91 (0) 44 4209 6000
Fax: +91 (0) 44 4209 6060

© Copyright 2012, Cognizant. All rights reserved. No part of this document may be reproduced, stored in a retrieval system, transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the express written permission from Cognizant. The information contained herein is subject to change without notice. All other trademarks mentioned herein are the property of their respective owners.