Business Intelligence Data Analytics June 28 2012 Icpas V4 Final 20120625 8am

  1. 1. BUSINESS INTELLIGENCE & ADVANCED ANALYTICS: The Search for Patterns, Waldo, and Black Swans. Barrett Peterson, C.P.A. ICPAS Fox River Trail Chapter, June 28, 2012
  2. 2. WHY BUSINESS INTELLIGENCE? Information, Good Data, Good Analysis
  3. 3. HISTORY AND BACKGROUND
  4. 4. A LITTLE BACKGROUND: History, a trip down memory lane • Computer-based business intelligence systems are a middle-aged idea, about 40 years old. Previously described as: – Decision support systems [DSS] – Executive information systems [EIS] – Management information systems [MIS]
  5. 5. A LITTLE BACKGROUND: History, Important Technology Inventions • Internet Development – ARPANET and others, 1960s – Internet Protocols, 1982, presumably by Al Gore • IBM researcher Edgar Codd is credited with the development of relational database theory in 1970. • IBM’s Donald Chamberlin and Raymond Boyce developed structured query language [SQL] in the early 1970s to manipulate and retrieve data from IBM’s early relational database management system. • The World Wide Web and the first web browser were invented by Tim Berners-Lee in 1990 by combining the internet, hypertext mark-up language, and the Uniform Resource Locator [URL] system. The browser was later renamed Nexus. • Mosaic, co-designed by Marc Andreessen, was the forerunner of the first commercial web browser [Netscape]. • Development of database designs and high-speed processing that enable big data during the last 15 years.
  6. 6. A LITTLE BACKGROUND: History, Drivers Enabling BI and Advanced Analytics • Development of the primary infrastructure – Database design – Processing and Storage Hardware – Server Development and Massively Parallel Processing • Improved telecommunications speed • Hardware miniaturization, capacity, and speed – Memory [RAM] capacity – Storage capacity and transfer speed – Bus speed – Video processing capacity and speed • Increased hardware speed and capacity • Digital formats for sensors, cameras, RFID, and other data collection sources • Mobile computing • “Cloud” capability exploits many of these developments
  7. 7. A LITTLE BACKGROUND: TERMINOLOGY, a consultant’s collection of confusing names, a sampler • Analytics • Business Intelligence • Knowledge Management • Content Management • Data Mining • Big Data • Data Integration • Gamification • Blob [Binary Large Object]
  8. 8. A LITTLE BACKGROUND: Drivers and Enablers of Big Data • CPU speed and power – Moore’s law – Multi-core chips – Solid State Memory • Storage improvement and cost reduction – Greatly increased capacity – Greatly increased access/transfer speed – Greatly reduced cost • Data collection from a wide range of devices • Data communications: speed and volume • Database management techniques and software • Application speed and power
  9. 9. BUSINESS INTELLIGENCE AND ADVANCED ANALYTICS DEFINED
  10. 10. TONIGHT’S CRITICAL DEFINITIONS: Business Intelligence. A system composed of “computer” hardware and software to: • Collect, “clean”, filter, and integrate data • Store data [hardware and software] • Provide knowledge management, analytical, and presentation tools to translate data into decision-useful information
  11. 11. TONIGHT’S CRITICAL DEFINITIONS: Business Intelligence Generations • Prehistoric: Mainframe Era – DSS, EIS, MIS – Hierarchical Master Data Files • The Current Era [Primarily]: Business Intelligence – Primarily “structured” data [data that can be represented in relational/dimensional tables or flat files], and BLOB [binary large object] formats – Analysis of “known” patterns – Presented in tables, simple charts, and dashboards • Emerging: Big Data and Advanced Analytics, to discover new, changing, or variable patterns – A wide variety of “unstructured” digital data formats added to “structured” data – Emerging storage structures – “Exploratory” analytics – Zoomable User Interfaces [ZUIs] – Solid State Memory and Solid-State Drives
  12. 12. THE HARDWARE AND SOFTWARE ELEMENTS OF BUSINESS INTELLIGENCE
  13. 13. BUSINESS INTELLIGENCE ELEMENTS: Principal Components for Maximum Application • Computer – CPU, Memory, and Operating System Software • Data Collection – Master Data Management – Collection Processes and Devices – Data Cleansing Processes and Software • Data Storage – Physical Devices and Storage Management Software – Data Management and Integration – Database Software Storage • Relational: Traditional ERP/Transaction systems • Dimensional: Traditional Data Warehouse, including associated BLOB • Distributed, Multiple Server, Storage Systems • NoSQL [Not Only SQL] Distributed Operational Stores • Hadoop for Highly Parallel Processing and Intensive Data Analytics Applications • Middleware Software • Business Intelligence Application Software – OLAP, Dashboard, and Chart Reports – Statistical Analysis and Presentation Tools
  14. 14. BUSINESS INTELLIGENCE ELEMENTS: DATA ISSUES, THE CORNERSTONE • Data Governance and Management – Uniform terminology – Uniform meaning – Uniform units of measure – Metadata • Data Structure and Attributes – Structured: Relational/Dimensional – Unstructured – Rate of change, context, and other attributes • Data Collection and Preparation – Filtering, particularly “Big Data” – Extract, Transform, Load [ETL] for “structured” data • Data Base File Systems • Data Storage and Retrieval – Capacity – Access/Retrieval speed
  15. 15. BUSINESS INTELLIGENCE ELEMENTS: MASTER DATA GOVERNANCE AND MANAGEMENT • Metadata management – Business definitions, rules, sources – Technical attributes, such as type, scale, transformation methods – Processing requirements: filtering, ETL, aggregation, summarization • Data definitions and data dictionaries – Name – Unit(s) of measure • Data collection and filtering or transforming requirements – Sources: internal and external – Context addition/filtering requirements • Data integration specifications – Multiple platforms and applications – Mapping to intermediate data marts • Privacy requirements – Personally identifying data – Laws: HIPAA, Privacy Act
  16. 16. BUSINESS INTELLIGENCE ELEMENTS: Data Structures and Attributes Are Critical Drivers • Data Structures – “Structured” Data, principally text and numbers capable of incorporation in relational or dimensional tables – “Unstructured” Data, not suitable for relational tables, many in newer data formats • Big Data Attributes – Both “structured” and “unstructured” – The four major “Vs” of big data • Volume: huge • Velocity: fast changing, unlike structured • Variety: format and content • Variability: lacks the consistency of structured data
  17. 17. BUSINESS INTELLIGENCE ELEMENTS: Data Structures, IT Lingo • Content Structure – Traditional Financial Data – Numerical – Sign/Debit or Credit – Text Descriptions • Database Management Structures – Legacy Systems: Hierarchical and Network – Transaction Systems: Relational • Relations [Tables], Attributes [Columns], Instances [Rows] • Rules: no duplicate rows; single value for attributes – Warehouse Systems: Dimensional • Facts [data items, usually a dollar amount or unit count] • Measures: dollar or count for facts • Dimensions: groups of hierarchies and descriptors of various aspects or context for the facts/measures • Microsoft Office and Similar File Formats • Photography and Art
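The relational rule (no duplicate rows) and the dimensional vocabulary (facts, measures, dimensions) from slide 17 can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table names, column names, and sample values are hypothetical and do not come from the presentation.

```python
import sqlite3

# In-memory database; every table, column, and value below is hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Relational rule "no duplicate rows": a primary key rejects a repeated row.
cur.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
cur.execute("INSERT INTO customer VALUES (1, 'Acme Supply')")
try:
    cur.execute("INSERT INTO customer VALUES (1, 'Acme Supply')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Dimensional structure: a fact table holds the measures (dollars, units);
# a dimension table supplies descriptive context for each fact.
cur.execute("CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT)")
cur.execute("CREATE TABLE fact_sales (product_id INTEGER, sale_month TEXT, amount REAL, units INTEGER)")
cur.executemany("INSERT INTO dim_product VALUES (?, ?)", [(10, "Frozen"), (11, "Fresh")])
cur.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
    [(10, "2012-05", 100.0, 4), (10, "2012-06", 50.0, 2), (11, "2012-06", 30.0, 3)],
)

# Roll the measures up along the product-category dimension.
totals_by_category = cur.execute(
    """
    SELECT d.category, SUM(f.amount), SUM(f.units)
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_id = f.product_id
    GROUP BY d.category
    ORDER BY d.category
    """
).fetchall()
conn.close()

print(duplicate_rejected)
print(totals_by_category)
```

A production data warehouse would typically add more dimensions (date, customer, location) and enforce foreign keys; the sketch keeps one dimension so the join and roll-up stay easy to read.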
  18. 18. BUSINESS INTELLIGENCE ELEMENTS: RELATIONAL TABLE ILLUSTRATION. “Tuple” is borrowed from mathematics and set theory and is used in database design to refer to the attributes of an “item” or “value” [row], the subject or title of the table. Value examples include customers, vendors, orders, and product SKUs.
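The tuple idea on slide 18 can be made concrete with a small Python sketch. It assumes a hypothetical customer table; the field names and values are invented for illustration. Each row is a tuple of attribute values, and the table is modeled as a set of such tuples, which is why an exact duplicate row collapses to one.

```python
from typing import NamedTuple


class Customer(NamedTuple):
    """One row (tuple) of a hypothetical customer relation."""
    customer_id: int
    name: str
    city: str


# A relation modeled as a set of tuples: duplicate rows cannot exist.
relation = {
    Customer(1, "Acme Supply", "Aurora"),
    Customer(2, "Fox Valley Tools", "Elgin"),
    Customer(1, "Acme Supply", "Aurora"),  # exact duplicate; the set keeps one copy
}

attribute_names = Customer._fields  # the columns of the table
row_count = len(relation)           # the number of distinct rows

print(attribute_names)
print(row_count)
```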
  19. 19. BUSINESS INTELLIGENCE ELEMENTS: MATH CAN BE COMPLICATED
  20. 20. BUSINESS INTELLIGENCE ELEMENTS: DATA FILE TYPE CATEGORIES, ALMOST ENGLISH • Numbers and words/letters – Relational/Dimensional – Spreadsheets – Word Processing documents • Sound and Music • Photo • Video • Video Game • CAD Design • Graphical – PDF – Raster, Vector Graphics – Statistical Visualization • Scientific • Signal • XML [Web based mark-up formats] • Geo-Location • Web Logs
  21. 21. BUSINESS INTELLIGENCE ELEMENTS: DATA COLLECTION AND PREPARATION • Collection – Company transaction/ERP systems – Purchased, such as Nielsen, IRI – Vendor supplied, such as bank transactions • Filtering – Adding context such as date or location – Eliminating “chatter” from high volume data – Error correction • Aggregation & Integration
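The preparation steps on slide 21 (filtering out chatter, correcting errors, adding context, then aggregating) might look like the following minimal Python sketch. The record layout, the negative-value sentinel, and the function names `prepare` and `aggregate` are all assumptions made for illustration, not part of the presentation.

```python
from collections import defaultdict
from datetime import date

# Hypothetical raw readings as (source, text) pairs.
raw = [
    ("store_a", "12.5"),
    ("store_a", "n/a"),    # unreadable value
    ("store_b", "7.0"),
    ("store_b", "-1"),     # assumed sentinel meaning "no reading"
    ("store_a", " 3.5 "),  # stray whitespace
]


def prepare(records, collected_on):
    """Correct, filter, and add date context to raw (source, text) records."""
    cleaned = []
    for source, text in records:
        try:
            value = float(text.strip())  # error correction: trim whitespace
        except ValueError:
            continue                     # filtering: drop unparseable records
        if value < 0:
            continue                     # filtering: drop sentinel "chatter"
        cleaned.append({"source": source, "value": value, "date": collected_on})
    return cleaned


def aggregate(records):
    """Sum the cleaned values per source."""
    totals = defaultdict(float)
    for record in records:
        totals[record["source"]] += record["value"]
    return dict(totals)


cleaned = prepare(raw, date(2012, 6, 28))
totals = aggregate(cleaned)
print(totals)
```

In practice the filtering rules come from the data governance definitions (valid ranges, required context), so they would be configured rather than hard-coded as they are here.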
  22. 22. DATA COLLECTION - RFID: RFID tag, RFID tag reader
  23. 23. DATA COLLECTION: Various sensors, Surveillance Camera
  24. 24. DATA FILTERING AND CLEANSING IS IMPORTANT
  25. 25. BUSINESS INTELLIGENCE ELEMENTS: DATA BASE FILE SYSTEMS • Relational: SQL • Dimensional: SQL, OLAP • Binary Large Object [BLOB]: binary data, most often photos, video, audio, or PDF files • Massively Parallel-Processing [MPP] • Apache Hadoop Distributed File System [HDFS] – Java – Google File System [GFS], used solely by Google – Google MapReduce • Amazon S3 filesystem [used by Amazon] • NoSQL • Resource Description Framework [RDF] Databases, like Big Data
  26. 26. BUSINESS INTELLIGENCE ELEMENTS: SELECT BIG DATA DATABASE MANAGEMENT SYSTEMS • Significant Originators – Google MapReduce – Google File System [GFS] – Amazon S3 filesystem • Continuing Developments – Apache Software Foundation • Apache Cassandra distributed database management system • Apache Hadoop software framework to support data-intensive distributed applications • Apache Hive, a data warehouse structure built on Hadoop • Pig, a high level programming language for creating MapReduce programs with Hadoop – Significant to Technology Development • Facebook • Yahoo • LinkedIn [Project Voldemort]
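Slide 26 names MapReduce as one of the originating big data techniques. The sketch below shows the map, shuffle, and reduce phases for a word count in plain, single-process Python. It is a teaching illustration only: it does not use Hadoop or any Google API, and the function names are hypothetical. In a real cluster the map and reduce calls would run in parallel across many servers.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce sequentially over a list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data gets bigger", "Big data analytics"])
print(counts)
```

The point of the pattern is that each map call sees only one document and each reduce call sees only one key, which is what allows the work to be spread over a distributed file system.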
  27. 27. BUSINESS INTELLIGENCE ELEMENTS: COMPUTER HARDWARE CONSIDERATIONS • Convergence aspect of mainframes and servers • Massively parallel, multiple server, distributed processing in multiple data centers [grid computing] • Multi-core, high capacity, lower power consumption CPUs • Memory servers for RAM employing DRAM composed of Fully Buffered Dual In-line Memory Modules [FB-DIMM] • Solid state flash drive storage • Greatly improved, and less costly, hard drive storage
  28. 28. BI CONFIGURATION SIZES: Small: BI, but not Big Data capable. Medium. Large: IBM Sequoia at Livermore Labs
  29. 29. BUSINESS INTELLIGENCE ELEMENTS: DATA STORAGE HARDWARE/SOFTWARE • Data Storage Terminology – Memory: directly connected to the CPU, often called RAM – Storage: not directly connected to the CPU • Data Storage Device Types – Memory • DRAM-based • Flash memory-based Solid-State Drives [SSDs] – Storage • Hard Disk Drives [HDD] • Optical Drives: CDs, DVDs • Data Storage Systems – Direct Attached – Network Attached Storage [NAS] – Storage Area Network [SAN] – pNFS: Parallel Network File System
  30. 30. BUSINESS INTELLIGENCE ELEMENTS: BI APPLICATION SOFTWARE • Traditional Reporting Systems – ERP systems, including extract and presentation tools – Downloads to Excel and similar programs for analysis using functions and pivot tables • Presentation Tools • Specialized Analytics – IBM InfoSphere BigInsights and InfoSphere Streams – IBM Netezza – ParAccel Analytic Database – EMC Greenplum – SAS High Performance Computing – Information Builders WebFOCUS • Exploratory Tools, like IBM SPSS [originally Statistical Package for the Social Sciences] – Data mining with specialized algorithms – Statistical analysis and related charting software
  31. 31. BUSINESS INTELLIGENCE ELEMENTS: ADVANCED ANALYTICS APPLICATION TYPES • BI Reporting • Predictive Analytics • Data Exploration: correlation • Data Visualization: graphical • Instrumentation Analytics • Content Analytics • Web Analytics • Functional Applications • Industry Applications
  32. 32. BUSINESS INTELLIGENCE ELEMENTS: USE STATISTICAL TECHNIQUES APPROPRIATELY
  33. 33. ALGORITHMS CAN BE TREACHEROUS: DATA MODELS HAVE LIMITS
  34. 34. BI AND ADVANCED ANALYTICS OUTPUT ILLUSTRATIONS
  35. 35. EXAMPLES OF USES
  36. 36. SELECTED EXAMPLES OF USES • Sales and Operations Planning • Financial Instruments Modeling • Production Control • Online Retail • Economics and Policy Development • Agriculture/Farming • Weather Analysis/Prediction • Environmental Impact Assessment • Healthcare Diagnosis and Records Management • Genomic Analytics and Pharmaceutical and Medical Research • Natural Resource Exploration • Research Physics • Road, Rail Traffic Management • Security Surveillance • Astronomy • Logistics Management, Including GPS Tracking • Electrical and Telecommunications Grids Management • Social Media: Facebook, LinkedIn, Google+, Twitter, YouTube, Pinterest • TV shows: Star Trek, Person of Interest
  37. 37. SELECTED USERS • Retail – Amazon – Dell – Delta Sonic Car Washes • Data Services – IBM – Google – Amazon • Financial Services • Manufacturing – McCain Foods [frozen foods] – Boeing • Transportation and Logistics – Logistics: UPS, FedEx – Rail: UP, CSX, TTX – Air: United, AMR, Southwest • Social Media – LinkedIn – Facebook • Medicine and Health – Centers for Disease Control and Prevention (CDC) – J. Craig Venter Institute • Science – Livermore Labs
  38. 38. SELECTED EXAMPLES OF USE: AMAZON • Technical Elements – Direct on-line access – Amazon specialized “Big Data” database – Distributed and extremely large data centers – Highly automated, high technology warehouses – High supplier [vendor] integration • User Benefits – Favorable prices – Suggested associated purchases – Individual interest advertising
  39. 39. SELECTED EXAMPLES OF USE: DELL • Technical Elements – Web driven order entry and custom purchase configuration – Tracking of sales correspondence with promotional offers – Supplier re-order integration • User Benefits – Ability to customize purchase – Reasonable cost – Prompt delivery
  40. 40. SELECTED EXAMPLES OF USE: BOEING • Technical components – Shared component and assembly designs – More detailed quality specifications and product tolerances – Control of assembly schedule – “Real time” exchange of technical information – Dissemination of best practices • Customer benefits – Faster deliveries – Increased product quality – Reduced defects
  41. 41. SELECTED EXAMPLES OF USE: NEW JERSEY DEPARTMENT OF TRANSPORTATION • Techniques employed – Collects data from cellphone and GPS signals, traffic cameras, and roadside sensors – Identifies accidents, traffic jams, and road damage – Dispatches emergency vehicles – Updates traffic websites – Sends messages to drivers’ GPS devices and cellphones – Uses supercomputers running the Inrix application • Benefits – Clears traffic congestion faster – More timely relief for accident victims – Facilitates road paving scheduling
  42. 42. SELECTED EXAMPLES OF USE: LINKEDIN • Technical Elements – General LinkedIn Structure • Personal Profile • Individual Connections • Groups • Company Searches • Questions and Answers – Attached application partners • Amazon [Reading List] • SlideShare • User Benefits – Networking with professional contacts – Personal branding capabilities – Business Development – Job Search enhancement
  43. 43. LINKEDIN PROFILE PAGE SAMPLE
  44. 44. Facebook Page Sample
  45. 45. TRENDS • More, bigger, faster: big data gets bigger • Cloud services continue to expand • Mobile computing expands • Hadoop becomes more common • Interactive data visualization will expand • Social media type platforms will increase their prominence • Demand for analytics skills will increase
  46. 46. RESOURCES • Books – Competing on Analytics, Davenport & Harris – Analytics at Work, Davenport, Harris, & Morison – The Data Asset, Fisher – Data Strategy, Adelman, Moss, Abai • Websites – The Data Warehousing Institute: tdwi.org – IBM data analytics: www.ibm.com, smarter planet
  47. 47. SUMMARY: WHY USE BI AND ADVANCED ANALYTICS? INSIGHT FROM DATA
