What is AI without Data?


Presented at InnoTech San Antonio 2018.

Published in: Technology
  1. What is AI without Data? The New Convergence of Data; the Next Strategic Business Advantage. David Smith, CEO. 4/16/2018. DATA is the central asset of your company. The growth of data has accelerated beyond even the optimistic forecasts of a few years ago. The new definition of convergence is very different from that of even a decade ago. The new trends of Big Data, Data Science, Cloud, AI, Mobility and IoT are changing how organizations use data. It is now a critical business asset. New business processes will revolve around the data, and it will soon become even more intensive through massive streaming data coming from ubiquitous sensors in the Internet of Things. Variety, not volume or velocity, will drive the investments. During this session you will see how data has become a strategic business advantage and why its value will only increase in the next decade. Copyright 2018 David Smith. All Rights Reserved. May not be distributed without permission.
  2. Why bother with the future? "If you think that you can run an organization in the next 10 years as you've run it in the past 10 years you're out of your mind." (CEO, Coca-Cola) The Age of Data: In the last two years we have generated more data than in all of prior human history. Data is expected to double in size every two years through 2020, exceeding 40 zettabytes (40 trillion gigabytes). The Economist: digital information increases 10 times every 5 years! Timeline labels: The Beginning – 2011; 2012 - 2014; 2016 - 2017; 2020.
  3. Forecast of Data Growth, in zettabytes (ZB); 1 ZB equals 1 billion terabytes (TB).
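The unit figures on the two slides above can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the original deck: it assumes decimal (SI) byte units, and the helper names `convert` and `annual_growth_rate` are hypothetical. It checks that 1 ZB is 1 billion TB and that 40 ZB is 40 trillion GB, and it turns "doubling every two years" and "10 times every 5 years" into implied yearly growth rates so the two claims can be compared.

```python
# Illustrative arithmetic only; assumes decimal (SI) byte units.
GB = 10**9    # bytes in a gigabyte
TB = 10**12   # bytes in a terabyte
ZB = 10**21   # bytes in a zettabyte


def convert(amount: int, from_unit: int, to_unit: int) -> int:
    """Convert a whole number of from_unit into to_unit (hypothetical helper)."""
    return amount * from_unit // to_unit


def annual_growth_rate(factor: float, years: float) -> float:
    """Yearly growth rate implied by growing `factor`-fold over `years` years."""
    return factor ** (1.0 / years) - 1.0


if __name__ == "__main__":
    print(convert(1, ZB, TB))    # terabytes in one zettabyte
    print(convert(40, ZB, GB))   # gigabytes in 40 zettabytes
    print(f"{annual_growth_rate(2, 2):.1%}")    # doubling every two years
    print(f"{annual_growth_rate(10, 5):.1%}")   # tenfold every five years
```

Under these assumptions, doubling every two years works out to roughly 41% growth per year, while tenfold growth over five years implies roughly 58% per year, so the two forecasts quoted in the deck describe different growth rates.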
  4. Business Problem: More than half of business and IT executives, 56 percent, report they feel overwhelmed by the amount of data their company manages. Many report they are often delayed in making important decisions as a result of too much information. Surprisingly, 62 percent of C-level respondents, whose time is considered the most valuable in most organizations, report being frequently interrupted by irrelevant incoming data.
  5. Entering the Age of Data. Data is THE central business asset: “Data are an organization’s sole, non-depletable, non-degrading, durable asset. Engineered right, data’s value increases over time because of the added dimensions of time, geography, and precision.” (Peter Aitken) Data generation has changed forever: instrumentation of all businesses, people, machines. Data is born digitally and flows constantly: “All things are flowing.” (Heraclitus, 500 BC)
  6. Types of Data
  7. Today most data is retrospective; there is a need for real-time and predictive data. Figure labels: Retrospective, Real-time, Predictive; Today's Cycle; Where is Real Time?
  8. Volume, Variety, Velocity ... Volume: Volume is increasing at incredible rates. More people than ever are using high-speed internet connections, and the growth of IoT and always-on devices is driving this tremendous increase in volume.
  9. Variety: The next concept in breaking data down into easily digestible, bite-size chunks is Variety. Take your personal experience and think about how much information you create and contribute in your daily routine: your voicemails, your e-mails, your file shares, your TV viewing habits, your Facebook updates, your LinkedIn activity, your credit card transactions, etc. Whether you consciously think about it or not, the variety of information you personally create on a daily basis, all of which is being collected and analyzed, is simply overwhelming. • Facebook generates 10 TB of data daily • Twitter generates 7 TB of data daily • IBM claims 90% of today’s stored data was generated in just the last two years.
  10. Variety: Big Data isn't just numbers, dates, and strings. Big Data is also geospatial data, 3D data, audio and video, and unstructured text, including log files and social media. Traditional database systems were designed to address smaller volumes of structured data, fewer updates, or a predictable, consistent data structure. Streaming data and real-time analysis include different types of data. Velocity: The speed at which data enters organizations these days is absolutely amazing. High internet bandwidth is now nearly commonplace and mobile devices have proliferated, which gives people more opportunity than ever to contribute content to storage systems.
  11. Velocity: • Clickstreams and ad impressions capture user behavior at millions of events per second • High-frequency stock trading algorithms reflect market changes within microseconds • Machine-to-machine processes exchange data between billions of devices • Infrastructure and sensors generate massive log data in real time • Online gaming systems support millions of concurrent users, each producing multiple inputs per second. But I Believe These are the Real Four
  12. The Structure of Data: • Structured: most traditional data sources • Semi-structured: many sources of big data • Unstructured: video data, audio data. Historical Development of Database Technology. Early Database Applications: The Hierarchical and Network Models were introduced in the mid-1960s and dominated during the seventies. A bulk of the worldwide database processing still occurs using these models. Relational Model based Systems: The model that was originally introduced in 1970 was heavily researched and experimented with at IBM and in the universities. Relational DBMS products emerged in the 1980s.
  13. Historical Development of Database Technology. Object-oriented applications: OODBMSs were introduced in the late 1980s and early 1990s to cater to the need for complex data processing in CAD and other applications. Data on the Web and E-commerce Applications: The Web contains data in HTML (Hypertext Markup Language) with links among pages. This has given rise to a new set of applications, and E-commerce is using new standards like XML (eXtensible Markup Language). Extending Database Capabilities: New functionality is being added to DBMSs in the following areas: Scientific Applications; Image Storage and Management; Audio and Video Data Management; Data Mining; Spatial Data Management; Time Series and Historical Data Management; IoT; Streaming. The above gives rise to new research and development in incorporating new data types, complex data structures, new operations, and storage and indexing schemes in database systems.
  14. Top 10 Time Series Databases: • DalmatinerDB • InfluxDB • Prometheus • Riak TS • OpenTSDB • KairosDB • Elasticsearch • Druid • Blueflood • Graphite (Whisper)
  16. The Intelligence is in the Connections. [Figure: web technologies plotted by connections between people and connections between information, across the PC Era, Web 1.0, Web 2.0, Web 3.0 and Web 4.0, with decade bands from 1980 - 1990 through 2020 - 2030. Source: Gartner, Cisco, DSmith] Big Challenge: 24/7 Streaming Data. It seems that everything in 2018 will have a sensor that sends information back to the mothership.
  17. The Ubiquity of Data Opportunities: With vast amounts of data now available, companies in almost every industry are focused on exploiting data for competitive advantage. In the past, firms could employ teams of statisticians, modelers, and analysts to explore datasets manually, but the volume and variety of data have far outstripped the capacity of manual analysis. At the same time, computers have become far more powerful, networking has become ubiquitous, and algorithms have been developed that can connect datasets to enable broader and deeper analyses than previously possible. The convergence of these phenomena has given rise to the increasingly widespread business application of data science principles and data mining techniques. Data Science as a strategic asset: “85% of eBay’s analytic workload is new and unknown. We are architected for the unknown.” (Oliver Ratzesberger, eBay) Data exploration, data as the new oil: • The exploration for data, rather than the exploration of data • Uncovering pockets of untapped data • Processing the whole data set, without sampling • eBay’s Singularity platform combines transactional data with behavioral data, enabling identification of top sellers and driving increased revenue from those sellers
  18. Data as a strategic asset: “Groupon will not be the first or last organization to compete and win on the power of data. It’s happening everywhere.” (Reid Hoffman and James Slavet, Greylock Partners) Data harnessing, data as renewable energy: • Harnessing naturally occurring data streams • Like harnessing raw energy to be converted into usable energy • Conversion of raw data into usable data. Emergence of a Fourth Research Paradigm: Data Science. • Thousand years ago: Experimental Science (description of natural phenomena) • Last few hundred years: Theoretical Science (Newton’s Laws, Maxwell’s Equations…) • Last few decades: Computational Science (simulation of complex phenomena) • Today: Data-Intensive Science (scientists overwhelmed with data!)
  19. Key to Creating Artificial Intelligence: Increasing Computational Power. Now: • Beating a mouse brain • About a thousandth of a human
  20. Information and Communication Trends: • Seamless Interoperability Between Heterogeneous Networks • Mobility for All; Devices for All Things • User Centered Content-Based Information Access • Agents Take Over Routine Work • “E”-Processes for Business and Private Life • Human Computer Interaction is Turning Into Human Computer Cooperation • Humans are not part of most computer and data interactions. The “Fat Pipe”
  21. What is the direction of DATA? • Walmart handles more than 1 million customer transactions every hour. • Facebook handles 40 billion photos from its user base. • Decoding the human genome originally took 10 years to process; now it can be achieved in one week.
  22. “The market for enterprise AI systems will increase from $202.5 million in 2015 to $11.1 billion by 2024.” (Tractica) Internet of Things: The Next Frontier
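The Tractica forecast quoted above can be restated as a compound annual growth rate (CAGR). The sketch below is illustrative only and is not part of the original deck: the function name `cagr` is hypothetical, and it assumes the quoted figures mark the start of 2015 and the end point in 2024, i.e. nine compounding years.

```python
# Illustrative arithmetic only: restates the quoted forecast as a yearly rate.
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate between two values (hypothetical helper)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    start = 202.5e6       # $202.5 million in 2015, as quoted on the slide
    end = 11.1e9          # $11.1 billion by 2024, as quoted on the slide
    years = 2024 - 2015   # assumption: nine compounding periods
    print(f"Implied CAGR: {cagr(start, end, years):.1%}")
```

Under that nine-year assumption, the forecast implies growth of roughly 56% per year.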
  23. Data available from “Internet of Things”
  24. IoT is generating massive volumes of structured and unstructured data, and an increasing share of this data is being deployed on cloud services. The data is often heterogeneous and lives across multiple relational and non-relational systems. When these smart devices are connected to intelligent applications such as Siri, Alexa, Cortana or Google Home, the possibilities become endless. Conversational AI will enable high-level conversations with these intelligent applications. These bots, per Microsoft CEO Satya Nadella, will be the next apps. 2018 will see the convergence of these intelligent applications with many IoT devices.
  25. As the world gets smarter, infrastructure demands will grow: smart traffic systems, smart water management, smart energy grids, smart healthcare, smart food systems, smart oil field technologies, smart regions, smart weather, smart countries, smart supply chains, smart cities, smart retail.
  27. Will technological breakthroughs be developed in time to boost economic productivity and solve the problems caused by a growing world population, rapid urbanization, and climate change? Game Changer: Impact of New Technologies. • The Internet of Things • Not just Big Data, but a zettaflood • Much D to D • Wisdom of the Data Science • The next 'Net' • Move from physical to virtual • The world gets Bio • Regenerative Medicine
  28. Conclusion: The Age of Data is here. Data is the central business asset. Data generation has changed forever: • The World is moving to Real Time • Data Science is the Key. Your legacy analytic software WILL fail in the Age of Data: a crisis of software that scales to meet demand; streaming data changes the concept of data. Think about where the data comes from: attempt to capture and analyze any data that might be relevant, regardless of where it resides. Data Science is changing how data is collected, discovered, analyzed, used, acted upon… In Parting: Be Paranoid. “Sooner or later, something fundamental in your business world will change.” (Andrew S. Grove, Founder, Intel, “Only the Paranoid Survive”)
  29. Thank You. David Smith. 9 global GIS data sets that you can download for free: 1 Natural Earth Data. 2 Esri Open Data. 3 USGS Earth Explorer. 4 OpenStreetMap. 5 NASA's Socioeconomic Data and Applications Center (SEDAC). 6 Open Topography. 7 UNEP Environmental Data Explorer. 9 NASA Earth Observations (NEO).