Big data: trends and opportunities - Speaker: Cezar Taurion



Rio Info 2013
Innovative Technologies
September 17, 2:00 pm to 6:00 pm

Published in: Technology
  • A global study carried out by IBM in 2008, the Global CEO Study, identified five pillars that will shape the enterprise of the future. The study was based on face-to-face interviews with 1,130 CEOs and presidents from 40 countries and 32 industries, and was designed to capture insights into how the challenges CEOs face today will affect the future of business. The study has a suggestive title: “The Enterprise of the Future”. I think the most intriguing finding is that CEOs show a surprising level of optimism, reporting change as an opportunity to gain competitive advantage. Overall, 83% of the CEOs interviewed expect substantial change ahead (68% of Brazilian executives share that expectation), a 28% increase over 2006. However, CEOs admit that their ability to manage change effectively is growing at a slower pace. In other words, change is both the greatest challenge and the greatest opportunity for business leaders. The discussions with CEOs about the enterprise of the future revealed that they are: Hungry for change - A larger number of CEOs foresee significant change and plan bold moves in response; Innovative beyond customer imagination - They are taking advantage of the demands of new consumers, who are better informed and more collaborative than before; Globally integrated - They are reconfiguring their companies to become globally integrated; Disruptive by nature - They are implementing new business models that are collaborative and disruptive; Genuine, not just generous - They pay closer attention to corporate social responsibility. Ultimately, better-performing companies are acting more boldly, creating organizations that are more global, collaborative and disruptive than their industry peers. The first four trends indicate that organizations will depend more and more on their ability to reinvent themselves and innovate. And that indicates that the success of the organizations of the future will depend entirely on their INTELLECTUAL CAPITAL.
  • So what makes today’s big data activities different? Some organizations have already been handling big data for years. A global telecommunications company, for example, collects billions of detailed call records per day from 120 different systems and stores each for at least nine months. An oil exploration company analyzes terabytes of geologic data, and stock exchanges process millions of transactions per minute. For these companies, the concept of big data is not new. However, two important trends make this era of big data quite different. First, the digitization of virtually “everything” now creates new types of large and real-time data across a broad range of industries. Much of this is non-standard data: for example, streaming, geospatial or sensor-generated data that does not fit neatly into traditional, structured, relational warehouses. Second, today’s advanced analytics technologies and techniques enable organizations to extract insights from data with previously unachievable levels of sophistication, speed and accuracy. Big data embodies new data characteristics created by this new digitized marketplace, and is most often defined by “the three V’s”: volume, variety and velocity. And while they cover the key attributes of the data itself, we believe organizations need to consider an important fourth dimension: veracity. Inclusion of veracity as the fourth big data attribute emphasizes the importance of addressing and managing for the uncertainty inherent within some types of data. The convergence of these four dimensions helps both to define and distinguish big data: Volume: The amount of data. Perhaps the characteristic most associated with big data, volume refers to the mass quantities of data that organizations are trying to harness to improve decision-making across the enterprise. Data volumes continue to increase at an unprecedented rate. 
However, what constitutes truly “high” volume varies by industry and even geography, and is smaller than the petabytes and zettabytes often referenced. Just over half of respondents consider datasets between one terabyte and one petabyte to be big data, while another 30 percent simply didn’t know how big “big” is for their organization. Still, all can agree that whatever is considered “high volume” today will be even higher tomorrow. Variety: Different types of data and data sources. Variety is about managing the complexity of multiple data types, including structured, semi-structured and unstructured data. Organizations need to integrate and analyze data from a complex array of both traditional and non-traditional information sources, from within and outside the enterprise. With the explosion of sensors, smart devices and social collaboration technologies, data is being generated in countless forms, including: text, web data, tweets, sensor data, audio, video, click streams, log files and more. Velocity: Data in motion. The speed at which data is created, processed and analyzed continues to accelerate. Contributing to higher velocity is the real-time nature of data creation, as well as the need to incorporate streaming data into business processes and decision making. Velocity impacts latency – the lag time between when data is created or captured, and when it is accessible. Today, data is continually being generated at a pace that is impossible for traditional systems to capture, store and analyze. For time-sensitive processes such as real-time fraud detection or multi-channel “instant” marketing, certain types of data must be analyzed in real time to be of value to the business. Veracity: Data uncertainty. Veracity refers to the level of reliability associated with certain types of data. 
Striving for high data quality is an important big data requirement and challenge, but even the best data cleansing methods cannot remove the inherent unpredictability of some data, like the weather, the economy, or a customer’s actual future buying decisions. The need to acknowledge and plan for uncertainty is a dimension of big data that has been introduced as executives seek to better understand the uncertain world around them. Ultimately, big data is an amalgam of these characteristics that creates an opportunity for organizations to gain competitive advantage in today’s digitized marketplace. It enables companies to transform the ways they interact with and serve their customers, and allows organizations – even entire industries – to transform themselves. Not every organization will take the same approach toward engaging and building its big data capabilities. But opportunities to utilize new big data technology and analytics to improve decision-making and performance exist in every industry.    
  • The term “big data” is pervasive, and yet still the notion engenders confusion. Big data has been used to convey all sorts of concepts, including: huge quantities of data, social media analytics, next generation data management capabilities, real-time data, and much more. Whatever the label, organizations are starting to understand and explore how to process and analyze a vast array of information in new ways. In doing so, a small, but growing group of pioneers is achieving breakthrough business outcomes. Many organizations have figured out how to tap into their great natural resource – data. Here are 6 examples (shown on the slide). -- A retailer reduced the time to run analytic queries by 80%. How did they do it? They moved from a general-purpose data warehouse to a purpose-built data warehouse appliance for deep analytics. They are running deep analytic queries on inventory levels and models which require heavy computations. -- A stock exchange company cut the time to run deep trading analytics from 26 hours to 2 minutes. They also moved from a general-purpose data warehouse to a purpose-built appliance. Again they were running deep analytic queries that required significant data access and computation. -- A telco cut the cost of hardware and storage by over 90% by moving to stream computing. By analyzing data as it streamed off the network, it was able to identify valuable data to be persisted and to persist only what is necessary. -- A government agency utilized stream computing to reduce the analysis of 250 TB of acoustic data from hours to 70 milliseconds. This resulted in significant cost savings, as well as the ability to react to potential threats quickly. -- A utility provider was able to predict and avoid power outages by analyzing up to 10 PB of data utilizing a combination of stream computing and a deep analytics data warehouse appliance. 
-- And a hospital was able to detect and intervene in potential life-threatening conditions up to 24 hours earlier, which makes a huge difference in the outcome of the patient. They did this by analyzing streaming data of various monitors and vitals indicators.
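The stream-computing examples above, the telco one in particular, rest on deciding record by record what is worth keeping while the data is still in motion. A minimal sketch of that idea in Python follows. The record fields (`duration_s`, `dropped`) and the selection rule are hypothetical illustrations, not the criteria any of the cited companies used.

```python
from typing import Iterable, Iterator


def select_for_storage(records: Iterable[dict],
                       min_duration_s: float = 1.0) -> Iterator[dict]:
    """Yield only the records worth persisting; the rest are dropped in flight."""
    for record in records:
        # Hypothetical rule: keep dropped calls and calls above a duration threshold.
        if record["dropped"] or record["duration_s"] >= min_duration_s:
            yield record


# A tiny in-memory stand-in for a live network feed.
stream = [
    {"id": 1, "duration_s": 0.2, "dropped": False},
    {"id": 2, "duration_s": 35.0, "dropped": False},
    {"id": 3, "duration_s": 0.4, "dropped": True},
]

kept = list(select_for_storage(stream))
print([r["id"] for r in kept])
```

Because the function is a generator, each record is examined once as it arrives and nothing is buffered, which is the property that lets storage shrink to the selected subset.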
  • The Hospital for Sick Children (SickKids) Business Need: The rapid advance of medical monitoring technology has done wonders to improve patient outcomes. Today, patients are routinely connected to equipment that continuously monitors vital signs, such as blood pressure, heart rate and temperature. This kind of equipment signals alerts when any vital sign goes out of the normal range, prompting hospital staff to take action immediately. When seconds count, that can save a life. But many life-threatening conditions don’t reach that critical level right away. Often, signs that something is wrong begin to appear long before the situation becomes serious. A skilled, experienced nurse or physician may be able to spot and interpret these trends in time to avoid serious complications. Unfortunately, the warning indicators are sometimes so hard to detect that it’s nearly impossible to identify and understand their implications until it’s too late, and caregivers are challenged to solve a new problem instead of acting proactively to avoid it. One example of this type of hard-to-detect problem is nosocomial infection—that is, an infection contracted while in the hospital that is especially life-threatening to fragile patients such as premature infants. According to physicians at the University of Virginia, an examination of retrospective data reveals that, starting 12 to 24 hours before there is any overt sign of trouble, almost undetectable changes in the vital signs of infants who have contracted such an infection begin to appear. In this case, the indication is a pulse that is within acceptable limits, but not varying as it should—heart rates normally rise and fall throughout the day. In a baby where infection has set in, this doesn’t happen as much and the heart rate is too regular over time. So, while the information needed to spot the infection is present, the indication is very subtle and can easily be missed by even the most experienced clinicians. 
The monitors continuously generate information that can give early warning of infection, but the data speeds by so quickly that nurses are forced to ignore most of it. Consequently, information that might prevent an infection from escalating to life-threatening status is often lost. To better detect subtle warning signs of complications, clinicians needed to gain greater insight into the moment-by-moment condition of patients. “The challenge we face is that there’s too much data,” says Dr. Andrew James, staff neonatologist at the Hospital for Sick Children in Toronto. “In the hectic environment of the neonatal intensive care unit, we simply don’t have the ability to absorb and reflect upon everything presented to us, so we may miss the significance of trends.”   Solution: A first-of-its-kind, stream-computing platform was developed to capture and analyze real-time data from medical monitors, alerting hospital staff to potential health problems before patients manifest clinical signs of infection or other issues. The significance of the data overload challenge was not lost on Dr. Carolyn McGregor, Canada research chair in health informatics at the University of Ontario Institute of Technology. “As someone who has been doing a lot of work with data analysis and data warehousing, I was immediately struck by the plethora of devices providing information at high speeds—information that went unused,” she says. “Information that’s being provided at up to 1,000 readings per second is summarized into one reading every 30 to 60 minutes, and it typically goes no further. It’s stored for up to 72 hours, and is then discarded. I could see that there were enormous opportunities to capture, store and utilize this data in real time to improve the quality of care for neonatal babies.” With a shared interest in providing better patient care, Drs. McGregor and James partnered to find a way to make better use of the information produced by monitoring devices. Dr. 
McGregor visited researchers at the IBM T.J. Watson Research Center’s Industry Solutions Lab (ISL), who were extending a new stream-computing platform to support healthcare analytics. A three-way collaboration was established, with each group bringing a unique perspective: the hospital’s focus on patient care, the university’s ideas for using the data stream, and IBM providing the advanced analysis software and information technology expertise needed to turn the vision into reality. The result was Project Artemis, part of IBM’s First-of-a-Kind (FOAK) program, which pairs IBM’s scientists with clients to explore how emerging technologies can solve real-world business problems. Project Artemis is a highly flexible platform that aims to help physicians make better, faster decisions regarding patient care for a wide range of conditions. In its earliest iteration, the project focused on real-time detection of the onset of nosocomial infection, effectively reproducing the nosocomial infection results identified retrospectively by researchers at the University of Virginia. Project Artemis is based on IBM InfoSphere™ Streams, a new information processing architecture that enables near-real-time decision support through the continuous analysis of streaming data using sophisticated, targeted algorithms. The IBM DB2® relational database provides the data management required to support future retrospective analysis of the collected data. Because the Hospital for Sick Children is a research institution, moving the project forward was not difficult. “The hospital sees itself as involved in the generation of new knowledge. There’s an expectation that we’ll do research. We have a research institute and a rigorous research ethics board, so the infrastructure was already there,” Dr. James notes. Nevertheless, Project Artemis was a very different kind of project for the hospital. 
“To gain its support, we needed to do our homework very carefully and show that all the bases were covered. The hospital was cautious, but from the beginning it wanted us to proceed.” Even with the support of the hospital, there were challenges to be overcome. Because Project Artemis is about information technology rather than more traditional clinical research, new issues needed to be considered. For example, the hospital CIO became involved, because the system had to be integrated into the existing network without impacting it in any way. Regulatory and ethical concerns such as privacy were also key issues: the data had to be more carefully safeguarded and restricted than usual because it was being transmitted to both the University of Ontario Institute of Technology and to the IBM T.J. Watson Research Center. Once the overarching concerns were dealt with, the initial tests could begin. Two infant beds were instrumented and connected to the system for data collection. To ensure safety and effectiveness, the project is being deployed slowly and carefully, notes Dr. James. “We have to be careful not to introduce new technologies just because they’re available, but because they really do add value,” says Dr. James. “It is a stepwise process that is still ongoing. It started with our best attempt at creating an algorithm. Now we’re looking at its performance, and using that information to fine-tune it. For example, we’re now looking at things that affect the signal, like the impact of moving the baby or changing its diaper. When we can quantify what those activities do to the data stream, we’ll be able to filter them out and get a better reading.” The ultimate goal is to create a robust, valid system fit to serve as the basis for a randomized clinical trial.
Solution Components:
Software: IBM InfoSphere™ Streams; IBM DB2®
Research: IBM T.J. Watson Research Center
Services: GBS BAO: Business Analytics and Optimization Strategy
Benefits of the Solution:
· Gives clinicians the unprecedented ability to interpret vast amounts of heterogeneous data in real time, enabling them to spot subtle trends
· Combines physician and nurse knowledge and experience with technology capabilities to yield more robust results than can be provided by monitoring devices alone
· Provides a flexible platform that can adapt to a wide variety of medical monitoring needs
Instrumented: Patient’s vital-sign data is captured by bedside monitoring devices up to 1,000 times per second.
Interconnected: Monitoring-device data and integrated clinician knowledge are brought together in real time for an automated analysis using a sophisticated, streamlined computing platform.
Intelligent: Detecting medically significant events even before patients exhibit symptoms will enable proactive treatment before the condition worsens, eventually increasing the success rate, potentially saving lives.
The initial test of the Project Artemis system captured the data stream from bedside monitors and processed it using algorithms designed to spot the telltale signs of nosocomial infection. Those algorithms are the essential difference between the Artemis system and the existing alarms built into bedside monitors. “What we’ve built is a set of rules that reflects our best understanding of the condition. We can change and update them as we learn more, or to account for variations in individual patients. With a piece of hardware like a monitor, it’s fixed and inflexible, which limits us. Artemis represents a whole new level of capability,” James notes. The truly significant aspect of the Project Artemis approach is how it brings human knowledge and expertise together with device-generated data to produce a better result. The system’s outputs are based on algorithms developed as a collaboration between the clinicians themselves and programmers. 
This inclusion of the human element is critical, because good patient care cannot be reduced to mere data points. A great deal of it has to do with medical knowledge, judgment, skill and experience. Artemis also holds the potential to become much more sophisticated. For example, it may eventually be able to integrate a variety of data inputs in addition to the streaming data from monitoring devices—from lab results to observational notes about the patient’s condition to the physician’s own methods for interpreting information. In this way, the knowledge, understanding and even intuition of physicians and nurses becomes the basis of a system that enables them to do much more than they could on their own. “In the early days, there was a lot of concern that computers would ‘replace’ health care providers,” Dr. James says. “A lot of us didn’t fully understand the applications of these sorts of technologies. But now there’s an understanding that human beings cannot do everything—it’s quite possible to develop tools that do not replace, but actually enhance and extend physicians’ and nurses’ capabilities. I look to a future where I’m going to receive an alert that provides me with a comprehensive, real-time view of the patient, allowing me to make better decisions on the spot.” The flexibility of the platform means that in the future, any condition that can be detected through subtle changes in the underlying data streams can be the target of the system’s early-warning capabilities. And, while its obvious application is in hospital settings where monitoring technology is already in widespread use—such as intensive care units—increasing remote “telehealth” capabilities point to a system that can benefit patients wherever they are. “I think the framework would also be applicable for any person who requires close monitoring; children with leukemia, for example,” says Dr. James. “These kids are at home, going to school, participating in sports—they’re mobile. 
It leads into the whole idea of sensors attached to or even implanted in the body and wireless connectivity. Theoretically, we could ultimately monitor these conditions from anywhere on the planet.” 
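The warning sign described in this case study is a heart rate that stays within acceptable limits but varies too little over time. A minimal sketch of how a streaming rule of that kind could look is given below. It is not the Project Artemis algorithm: the class name, the window size, the threshold and the use of the population standard deviation as the variability measure are all illustrative assumptions with no clinical validity.

```python
from collections import deque
from statistics import pstdev


class VariabilityMonitor:
    """Flag a stream of heart-rate readings whose recent variability is too low."""

    def __init__(self, window_size: int = 5, min_stdev: float = 1.5):
        # Both parameters are hypothetical; real values would come from clinicians.
        self.window = deque(maxlen=window_size)
        self.min_stdev = min_stdev

    def update(self, heart_rate: float) -> bool:
        """Add one reading; return True once the window is full and too regular."""
        self.window.append(heart_rate)
        if len(self.window) < self.window.maxlen:
            return False  # not enough history yet
        return pstdev(self.window) < self.min_stdev


monitor = VariabilityMonitor(window_size=5, min_stdev=1.5)
# Readings start with normal variation, then become suspiciously flat.
readings = [140, 152, 138, 149, 143, 145, 145, 146, 145, 145]
alerts = [i for i, hr in enumerate(readings) if monitor.update(hr)]
print(alerts)
```

Every reading in the example would pass a conventional range alarm, yet the rule fires once the last five readings flatten out. Keeping the rule in software also matches Dr. James's point that such rules can be changed as understanding improves, whereas a monitor's built-in alarm is fixed.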

    1. © 2013 IBM Corporation. The Role of Big Data in the Transformation of Society, Business and Government. Cezar Taurion, Chief Evangelist
    10. Thomas Kuhn in The Structure of Scientific Revolutions (1962): Think of a Paradigm Shift as a change from one way of thinking to another. It's a revolution, a transformation, a sort of metamorphosis. It just does not happen, but rather it is driven by agents of change. The paradigm shift is already happening… Thomas Samuel Kuhn (1922-1996)
    11. Law of Disruption: Social, political, and economic systems change incrementally, but technology changes exponentially. (Larry Downes)
    12. 2012: 2,800 billion GB!
    13. Processes more than 24 petabytes of data per day. Upload of 10 million new photos per hour.
    14. 2000: 75% / 25%. 2007: 7% / 93%. 2013: 2% / 98%. The digitization of data and information is growing at an accelerating pace!
    16. Characteristics of big data. Source: IBM methodology
    18. Big data is a business priority: it inspires new models and processes, and even makes it possible to create new industries.
    20. 'Big Data' is still at the edge of the radar screen of CIOs, CEOs and managers… Big data adoption:
        Ignorants: Only a minority has not looked/won't look into it
        Early Explorers: Some are just starting to explore 'Big Data'
        Heavy Explorers: Most are already debating/evaluating/considering 'Big Data'
        Planners: Several plan to implement within the near future
        Implementors: A few are already/still implementing 'Big Data'
    21. More and more transparency... conflicts and tensions! Information management tensions
    22. New ways of evaluating and analyzing data and information for decision making
        Traditional Approach / New Approach
        Instinct and intuition / Fact-driven
        Corrective / Directive
        Efficient / Optimized
        Years, months, weeks / Hours, minutes, seconds
        Human insight / Applied semantics
        Decision support / Action support
    23. Thank you for your attention. Cezar Taurion, @ctaurion, Facebook and LinkedIn