OBIEE, Endeca, Hadoop and ORE Development (on Exalytics) (ODTUG 2013)

A presentation from ODTUG 2013 on tools other than OBIEE for Exalytics, focusing on analysis of non-traditional data via Endeca, "big data" via Hadoop and statistical analysis / predictive modeling through Oracle R Enterprise, and the benefits of running these tools on Oracle Exalytics

Published in: Technology

  1. BI, Endeca, Hadoop and R Development (on Exalytics)
     Mark Rittman, Technical Director, Rittman Mead
     ODTUG KScope’13, New Orleans, June 2013
     Wednesday, 26 June 13
     T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com
  2. Mark Rittman
     • Mark Rittman, Co-Founder of Rittman Mead
     • Oracle ACE Director, specialising in Oracle BI&DW
     • 14 Years Experience with Oracle Technology
     • Regular columnist for Oracle Magazine
     • Author of two Oracle Press Oracle BI books
       ‣ Oracle Business Intelligence Developers Guide
       ‣ Oracle Exalytics Revealed
     • Writer for Rittman Mead Blog : http://www.rittmanmead.com/blog
     • Email : mark.rittman@rittmanmead.com
     • Twitter : @markrittman
  3. About Rittman Mead
     • Oracle BI and DW platinum partner
     • World leading specialist partner for technical excellence, solutions delivery and innovation in Oracle BI
     • Approximately 30 consultants worldwide
     • All expert in Oracle BI and DW
     • UK based
     • Offices in US, Europe (Belgium) and India
     • Skills in broad range of supporting Oracle tools:
       ‣ OBIEE
       ‣ OBIA
       ‣ ODIEE
       ‣ Essbase, Oracle OLAP
       ‣ GoldenGate
       ‣ Exadata
  5. Oracle’s Strategy for Business Analytics
     • Connect to all of your data, from all your sources,
     • Subject it to the full range of possible inquiry
     • Package solutions for known problems and fixed sources, and
     • Deploy to PCs and mobile devices, on premise or in the cloud
     [Diagram: On Premise, On Cloud, On Mobile; Any Data, Any Source; Full Range of Analytics; Integrated Analytic Apps]
  6. Full Range of Analytic Engines
     • Subject your data to the full range of possible enquiry
     • ROLAP for querying and reporting against relational data; MOLAP for multi-dimensional analysis; Search/Analytic database for guided navigation through diverse data
     [Diagram: Full Range of Analytics. Analytic Tools: Reporting & Analysis, Modeling & Planning, Unstructured Analytics, Predictive Analytics. Analytic Engines: ROLAP, MOLAP, Unstructured]
  7. Oracle Business Intelligence Enterprise Edition
     • OBIEE 11.1.1.7 release came out in April 2013 - many new features + updated L&F
     • Enterprise BI platform centered around the Common Enterprise Semantic Model (RPD)
     • Mobile BI apps, MS Office integration, ad-hoc, dashboard and reporting
     • Deployable on Windows, Unix, Linux
     • Accessing a range of data sources
       ‣ Oracle and other RDBMSs
       ‣ Essbase and other OLAP servers
       ‣ Files, XML, web services
       ‣ ADF and SOA sources
       ‣ TimesTen in-memory database
  8. Exalytics as the Exa-Machine for OBIEE
     • Runs the BI layer on a high-performance, multi-core, 1TB server
     • In-memory cache used to accelerate the BI part of the stack
     • If Exadata addresses 80% of the query performance, Exalytics addresses the remaining 20%
       ‣ Consistent response times for queries
       ‣ In-memory caching of aggregates
       ‣ 40 cores for high concurrency
       ‣ Re-engineered BI and OLAP software that assumes 40 cores and 1TB RAM
     [Diagram: ERP/Apps; DW; Oracle BI; In-Memory DB/Cache]
  9. Also Supports Essbase, and Endeca Information Discovery
     • In-Memory Essbase for planning, budgeting and sales analysis-style OLAP applications
     • Endeca Information Discovery for search/analytic applications against diverse data
     [Diagram: In-Memory Cache; Essbase Planning Engine; Smart Storage Manager; Lock Manager; Unified Indexing; Data Mashup; Text Analysis; Unified Search; Faceted Navigation; Interactive Exploration; Information Discovery; Oracle Exalytics In-Memory Machine]
  10. Oracle Endeca Information Discovery
     • Came through the Endeca acquisition, originally called Endeca Latitude
     • Adds unstructured data analysis, and “data discovery” to Oracle’s BI capabilities
     • Agile development using a schema-less database engine called Endeca Server
     • Complementary to OBIEE, typically used prior to a full OBIEE project, to work out what questions you want answered
  11. Information Discovery vs. Reporting & Analysis
     • Data volume, variety & growth provides challenges to answering business queries
       ‣ Unstructured data, social network data, call centre logs now available too
     • Datasets change, don’t always fit dimensional models, and arrive quickly
     • Users want self-service access to data with minimal setup time
     • Reporting and Analysis is great for accurate answers to known questions ...
     • ... Data discovery provides fast answers to new questions
     • Guiding principle : Quickly explore all relevant data
     [Diagram: Quickly Explore All Relevant Data]
       • Relationships undefined or unknown
       • No up-front data modelling
       • Rapid, iterative change
       • Advanced search
       • Contextual navigation
       • Analytics
       • Structured
       • Semi-structured
       • Unstructured
       • Even messy data is OK
       • Not in the data warehouse, yet
  12. “Search-First” Interface
     • Traditional BI tools (OBIEE, Discoverer etc) are focused on reporting and delivering
     • Oracle EID takes a “search first” approach, building on e-commerce and web search user experiences
     • Search + Contextual Navigation + Visual Analysis
     • Split-second response times to support and encourage data exploration
     [Screenshot callouts: Value Search across all attributes; Breadcrumb list showing all filters applied so far; Guided Navigation, free-form filtering across all attributes]
  13. Data Discovery Scenarios
     • Product Quality Reports / Warranty Analysis
     • Online TV Shopping Sales Sentiment Analysis
     • Police Crime Report & Evidence Analysis
     [Police Crime Report & Evidence Analysis]
       Type of Customer
         • Police forces, CID etc
         • Intelligence agencies (MI5 etc)
         • Private investigators, journalists
       Traditional BI Issues
         • By the time you’ve fitted it to a conformed dimensional model, you’ve missed the opportunity
         • Evidence spread across disparate documents, sources and formats
       Data Discovery Solution
         • Minimal up-front data modeling
         • No requirement for a common model
         • Support for unstructured, semi-structured and structured sources
         • Search is the primary user interface
     [Product Quality Reports / Warranty Analysis]
       Type of Customer
         • Manufacturers, retailers
         • Consumer bodies
         • Insurance companies
       Traditional BI Issues
         • Most useful information is in free-form documents and reports
         • Success comes from correlating information from many sources
         • Focus on reporting and numbers
       Data Discovery Solution
         • Easy linking of disparate sources that only share limited commonality
         • All data considered, with ability to parse and detect meaning in docs
         • Unlimited exploring across all attribs.
     [Online TV Shopping Sales Sentiment Analysis]
       Type of Customer
         • Online and TV-based retailers
         • E-commerce operations
         • B2C companies with vocal, online customer base
       Traditional BI Issues
         • Sales reporting only covers what you’ve sold, not why you’ve sold it
         • Consumer sentiment is found on blogs, Facebook and Twitter, not easily brought into BI datasets
       Data Discovery Solution
         • Combine unstructured social networking feeds with sales data
         • Content acquisition from non-traditional sources
         • Analyze consumer sentiment
  14. Oracle Advanced Analytics : In-Database Predictive Analytics
     • In-Database Predictive Analytics and Statistical Analysis
     • Massively-Scalable, able to analyze huge volumes of data
     • Exposed through SQL and R, enabling broad usage
     [Diagram: Predictive Analytics; Text Mining; Statistics; Data Mining]
     Comprehensive Predictive Analytic Platform Built-Inside the Database
       ‣ Data Mining, Text Mining
       ‣ Statistical Analysis (built on R)
       ‣ Built for Data Scientists/Analysts
     Scalable and Parallel
     Tightly-Integrated with SQL
     Works Inside Exadata and Big Data Appliance
  15. What is R, and Oracle R Enterprise?
     • R is a statistical language similar to Base SAS, or SPSS
     • Open-source, run by the R Project (http://www.r-project.org)
     • R environment is a suite of client/server products for statistical data manipulation and graphical analysis
     • Modeling and Analysis performed in-memory using “frames”
     • Enhanced by community-contributed packages
     • Oracle distributes the open-source version of R with Oracle Linux
     • Oracle R Enterprise extends R to allow analysis against frames stored in Oracle tables and views, and to embed R scripts in database PL/SQL packages
  16. Capabilities of R Compared to SQL (Built-In Stats Functions)
     • R provides a wide variety of statistical and graphical techniques
     • Linear and non-linear modeling, classical statistical tests, time-series analysis
     • Classification, clustering and other capabilities
     • Matrix arithmetic, with scalar, vector, matrices, list and data frame (aka table) structures
     • Extensible through community-contributed packages, and interacts with C++, Java etc
     • Available for Oracle Database 11gR2 through the Advanced Analytics Option
     • Extends the (free) SQL statistical capabilities provided by Oracle Database
       ‣ Ranking, Windowing, Reporting
       ‣ Lag/Lead, First/Last
       ‣ Linear Regression, Inverse Percentile etc
  17. Oracle R Enterprise
     • Regular R is constrained by only working with in-memory datasets (frames)
     • Data from tables and other database structures has to be loaded into memory
     • Oracle R Enterprise (ORE) removes this constraint by allowing frames to reside in DB
     • Automatically exploits database parallelism, plus Oracle scalability / resilience
     • ORE provides three key areas of functionality
       ‣ Embedded R
       ‣ In-Database Statistics Engine (R extensions for Oracle SQL)
       ‣ Transparency Layer (access RDBMS-based frames)
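The contrast on this slide, between loading rows into client memory and leaving the frame in the database, can be illustrated with a small sketch. This is an analogy only: it uses Python's standard-library `sqlite3` module rather than ORE or Oracle Database, and the `ontime` table and its values are made up for illustration.

```python
import sqlite3

# Hypothetical miniature of a flight-delay table; the values are invented.
ROWS = [("AA", 12.0), ("AA", 8.0), ("DL", 5.0), ("DL", 15.0), ("DL", 13.0)]

def build_db():
    """Create an in-memory database holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ontime (carrier TEXT, arrdelay REAL)")
    conn.executemany("INSERT INTO ontime VALUES (?, ?)", ROWS)
    return conn

def mean_delay_client_side(conn):
    """Pull every row to the client, then aggregate in local memory."""
    totals = {}
    for carrier, delay in conn.execute("SELECT carrier, arrdelay FROM ontime"):
        running_sum, count = totals.get(carrier, (0.0, 0))
        totals[carrier] = (running_sum + delay, count + 1)
    return {c: s / n for c, (s, n) in totals.items()}

def mean_delay_in_database(conn):
    """Let the database aggregate; only one row per carrier is returned."""
    sql = "SELECT carrier, AVG(arrdelay) FROM ontime GROUP BY carrier"
    return dict(conn.execute(sql).fetchall())

conn = build_db()
client = mean_delay_client_side(conn)
pushed = mean_delay_in_database(conn)
```

Both functions give the same answer; the difference is where the work happens and how much data has to move, which is the constraint the slide says ORE removes.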
  18. Typical R / ORE Topology
     • Analyst workstation contains Open Source R client tools
     • Models created in-memory on workstation
     • ORE provides capability to access datasets stored in Oracle RDBMS, transparently
     • Database has ORE embedded within it
     • ORE provides data for workstation, and can spawn its own R sessions for in-database R analysis
     • Enables lights-out R analysis, plus connectivity to Hadoop and Map/Reduce via Oracle R Connector for Hadoop
     [Diagram: Analyst Workstation / Laptop (2 core, 16GB RAM) with In-Memory R Engine; Oracle Database Server with ORE and In-Memory R Engines spawned by DB; Hadoop Server (Oracle Big Data Appliance); ORE Hadoop Connector]
  19. Big Data : New Datasets of Higher Variety, Volume and Velocity
     • Traditional BI datasets have been relatively small, and well structured
     • Financial data and other metrics, with attributes and hierarchies to slice-and-dice it
     • Big data is all about collecting and analyzing data sets of wider scope
       ‣ Volume - TBs of data collected from sensors, transactions and other low-granularity data sources
       ‣ Variety - unstructured, semi-structured as well as structured sources
       ‣ Velocity - data arriving in real-time, and analyzed in real-time
  20. Hadoop and MapReduce
     • Apache Hadoop is one of the most well-known Big Data technologies
     • Family of open-source products used to store, and analyze distributed datasets
     • Hadoop is the enabling framework, automatically parallelises and co-ordinates jobs
     • MapReduce is the programming framework for filtering, sorting and aggregating data
       ‣ Map : filter data and pass on to reducers
       ‣ Reduce : sort, group and return results
     • MapReduce jobs can be written in any language (Java etc), but it is complicated
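The map and reduce roles described on this slide can be sketched in a few lines of single-process Python. This is a toy illustration of the programming model only, not Hadoop's API; the log lines and the function names (`map_phase`, `shuffle`, `reduce_phase`, `status_mapper`) are hypothetical.

```python
from collections import defaultdict

def map_phase(records, mapper):
    """Apply the mapper to every input record, yielding (key, value) pairs."""
    for record in records:
        yield from mapper(record)

def shuffle(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups, reducer):
    """Apply the reducer to each key's values, returning results in key order."""
    return {key: reducer(values) for key, values in sorted(groups.items())}

# Hypothetical web-log lines in the form "<status> <path>".
LOG_LINES = ["200 /home", "404 /missing", "200 /about", "500 /api", "200 /home"]

def status_mapper(line):
    """Filter out successful requests and emit (status, 1) for the rest."""
    status, _path = line.split(" ", 1)
    if status != "200":
        yield status, 1

counts = reduce_phase(shuffle(map_phase(LOG_LINES, status_mapper)), sum)
```

In a real cluster the mappers and reducers run in parallel on different nodes and the shuffle moves data across the network; here everything runs in one process so the data flow is easy to follow.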
  21. New in OBIEE 11.1.1.7 : Hadoop Connectivity through Hive
     • MapReduce jobs are typically written in Java, but Hive can make this simpler
     • Hive is a query environment over Hadoop/MapReduce to support SQL-like queries
     • Hive server accepts HiveQL queries via HiveODBC or HiveJDBC, automatically creates MapReduce jobs against data previously loaded into the Hive HDFS tables
     • Approach used by ODI and OBIEE to gain access to Hadoop data
     • Allows Hadoop data to be accessed just like any other data source (sort of...)
  22. An example Hive Query Session: Connect and Display Table List

         [oracle@bigdatalite ~]$ hive
         Hive history file=/tmp/oracle/hive_job_log_oracle_201304170403_1991392312.txt
         hive> show tables;
         OK
         dwh_customer
         dwh_customer_tmp
         i_dwh_customer
         ratings
         src_customer
         src_sales_person
         weblog
         weblog_preprocessed
         weblog_sessionized
         Time taken: 2.925 seconds

     [Callout: Hive Server lists out all “tables” that have been defined within the Hive environment]
  23. An example Hive Query Session: Display Table Row Count

         hive> select count(*) from src_customer;
         Total MapReduce jobs = 1
         Launching Job 1 out of 1
         Number of reduce tasks determined at compile time: 1
         In order to change the average load for a reducer (in bytes):
           set hive.exec.reducers.bytes.per.reducer=
         In order to limit the maximum number of reducers:
           set hive.exec.reducers.max=
         In order to set a constant number of reducers:
           set mapred.reduce.tasks=
         Starting Job = job_201303171815_0003, Tracking URL = http://localhost.localdomain:50030/jobdetails.jsp?jobid=job_201303171815_0003
         Kill Command = /usr/lib/hadoop-0.20/bin/hadoop job -Dmapred.job.tracker=localhost.localdomain:8021 -kill job_201303171815_0003
         2013-04-17 04:06:59,867 Stage-1 map = 0%, reduce = 0%
         2013-04-17 04:07:03,926 Stage-1 map = 100%, reduce = 0%
         2013-04-17 04:07:14,040 Stage-1 map = 100%, reduce = 33%
         2013-04-17 04:07:15,049 Stage-1 map = 100%, reduce = 100%
         Ended Job = job_201303171815_0003
         OK
         25
         Time taken: 22.21 seconds

     [Callouts, in order:]
       • Request count(*) from table
       • Hive server generates MapReduce job to “map” table key/value pairs, and then reduce the results to table count
       • MapReduce job automatically run by Hive Server
       • Results returned to user
  24. Importing Hadoop/Hive Metadata into RPD
     • HiveODBC driver has to be installed into Windows environment, so that BI Administration tool can connect to Hive and return table metadata
     • Import as ODBC datasource, change physical DB type to Apache Hadoop afterwards
     • Note that OBIEE queries cannot span >1 Hive schema (no table prefixes)
     [Screenshots of import steps 1, 2 and 3]
  25. Set up ODBC Connection at the OBIEE Server (Linux Only)
     • OBIEE 11.1.1.7+ ships with HiveODBC drivers, need to use 7.x versions though
     • Configure the ODBC connection in odbc.ini, name needs to match RPD ODBC name
     • BI Server should then be able to connect to the Hive server, and Hadoop/MapReduce

         [ODBC Data Sources]
         AnalyticsWeb=Oracle BI Server
         Cluster=Oracle BI Server
         SSL_Sample=Oracle BI Server
         bigdatalite=Oracle 7.1 Apache Hive Wire Protocol

         [bigdatalite]
         Driver=/u01/app/Middleware/Oracle_BI1/common/ODBC/Merant/7.0.1/lib/ARhive27.so
         Description=Oracle 7.1 Apache Hive Wire Protocol
         ArraySize=16384
         Database=default
         DefaultLongDataBuffLen=1024
         EnableLongDataBuffLen=1024
         EnableDescribeParam=0
         Hostname=bigdatalite
         LoginTimeout=30
         MaxVarcharSize=2000
         PortNumber=10000
         RemoveColumnQualifiers=0
         StringDescribeType=12
         TransactionMode=0
         UseCurrentSchema=0
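Because the data source name in odbc.ini has to match the ODBC name used in the RPD, a quick consistency check can catch typos before the BI Server is restarted. The sketch below parses a trimmed copy of the slide's file with Python's standard `configparser`; the `check_dsn` helper and the list of required keys are assumptions made for illustration, not part of OBIEE, and the case-sensitive comparison is a deliberately strict choice rather than documented driver-manager behaviour.

```python
import configparser

# Trimmed copy of the odbc.ini shown on the slide.
ODBC_INI = """
[ODBC Data Sources]
AnalyticsWeb=Oracle BI Server
bigdatalite=Oracle 7.1 Apache Hive Wire Protocol

[bigdatalite]
Driver=/u01/app/Middleware/Oracle_BI1/common/ODBC/Merant/7.0.1/lib/ARhive27.so
Hostname=bigdatalite
PortNumber=10000
Database=default
"""

REQUIRED_KEYS = ("Driver", "Hostname", "PortNumber")  # assumed minimum, for illustration

def check_dsn(ini_text, dsn_name):
    """Return a list of problems for dsn_name; an empty list means it looks consistent."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # preserve key case so names are compared as written
    parser.read_string(ini_text)

    problems = []
    listed = parser.has_section("ODBC Data Sources") and \
        parser.has_option("ODBC Data Sources", dsn_name)
    if not listed:
        problems.append(f"{dsn_name} is not listed under [ODBC Data Sources]")
    if not parser.has_section(dsn_name):
        problems.append(f"no [{dsn_name}] section")
    else:
        for key in REQUIRED_KEYS:
            if not parser.has_option(dsn_name, key):
                problems.append(f"[{dsn_name}] is missing {key}")
    return problems

ok = check_dsn(ODBC_INI, "bigdatalite")
wrong_case = check_dsn(ODBC_INI, "BigDataLite")
no_section = check_dsn(ODBC_INI, "AnalyticsWeb")
```

Here `ok` is empty, while the other two calls report what is missing, which is the kind of mismatch the second bullet on the slide warns about.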
  26. OBIEE and Hadoop
     Demonstration
  27. Unstructured Data, Analytics & Big Data - and Oracle Exalytics
     • Exalytics, through in-memory aggregates and InfiniBand connection to Exadata, can analyze vast (structured) datasets held in relational and OLAP databases
     • Endeca Information Discovery can analyze unstructured and semi-structured sources
     • InfiniBand connector to Big Data Appliance + Hadoop connector in OBIEE supports analysis via Map/Reduce
     • Oracle R distribution + Oracle R Enterprise supports SAS-style statistical analysis of large data sets, as part of Oracle Advanced Analytics Option
     • OBIEE can access Hadoop datasource through Hive, and use its in-memory cache to speed up Hive queries
  28. Oracle Exalytics In-Memory Machine as a Platform for Endeca & R
     • Exalytics server acts as an analysis workstation “on steroids”
     • 1TB RAM + 40 cores for multiple R engines
     • InfiniBand connection to Exadata and ORE
     • Endeca Information Discovery uses 40 CPU cores for massively parallel indexing, analysis
     • More of Endeca Server datastore in RAM
     • OBIEE uses RAM and cores for TimesTen datamart, in-memory data source federation and Presentation Server caching, performance
     • All products benefit from high-end server features and fast connectivity to Exadata + BDA
  29. An Example OBIEE / Endeca / R on Exalytics Scenario
     • “Airline On-Time Performance and Causes of Flight Delays” dataset
     • Provided by Bureau of Transportation Statistics, Research and Innovative Technology Administration, United States Department of Transportation
     • Dataset containing 123M rows of non-stop US domestic flight legs
     • Source and destination airports, operator, aircraft type
     • Type and duration of delay, delay reason
     • Freely-available “big data” set
     • What can OBIEE, R and EID tell us?
       ‣ OBIEE - dashboard analysis + drilling
       ‣ EID - discovery, and analysis of supporting information describing delays, reasons etc
       ‣ ORE - deep insight into specific questions
     • Hadoop - came too late for this presentation, but could also play a role (ingestion of new flight data)
  30. Endeca Information Discovery Search/Analytic Dashboard
     • Search capabilities of EID help us explore messy and unfamiliar data
     • “McDonel” corrects to “McDonnell”
     • Matches to all variants of “McDonnell Douglas”
     • Dashboard tells us that Delta Airlines has suffered most delays using McDonnell Douglas aircraft, regardless of variations of spelling
     [Screenshot callout: Any variant of McDonnell Douglas is retrieved.]
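The "corrects to" and "matches all variants" behaviour can be approximated with a short sketch. This is not Endeca's search algorithm: it swaps in fuzzy string matching from Python's standard `difflib` module as a stand-in, and the manufacturer strings are hypothetical examples of messy source values.

```python
import difflib
import re

# Hypothetical manufacturer values as they might appear in messy source data.
MANUFACTURERS = [
    "McDonnell Douglas",
    "MCDONNELL DOUGLAS AIRCRAFT CO",
    "Mcdonnell-Douglas",
    "Boeing",
    "Airbus",
]

def tokens(text):
    """Lower-case a value and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def correct_term(term, values):
    """Return the closest known term to a possibly misspelled one, or None."""
    vocabulary = sorted({t for v in values for t in tokens(v)})
    matches = difflib.get_close_matches(term.lower(), vocabulary, n=1, cutoff=0.6)
    return matches[0] if matches else None

def search(term, values):
    """Correct the term, then return every value containing the corrected term."""
    corrected = correct_term(term, values)
    return corrected, [v for v in values if corrected in tokens(v)]

corrected, hits = search("McDonel", MANUFACTURERS)
```

Normalising case and punctuation before matching is what lets one corrected term retrieve all three spellings; a production search engine would use an index rather than scanning every value.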
  31. Useful information for when Traveling
     • Not that it worried me, honestly...
  32. Oracle Endeca Server : A Hybrid Search/Analytic Database
     • Key to these capabilities is the Oracle Endeca Server and its datastores (databases)
     • Proprietary database engine focused on search and analytics
     • Data organized as records, made up of attributes stored as key/value pairs
     • No over-arching schema, no tables, self-describing attributes
     • Every record can have its own unique set of attributes, with the overall data model emerging over time as data is loaded
     • Endeca Server hallmarks:
       ‣ Minimal upfront design
       ‣ Support for “jagged” data
       ‣ Administered via web service calls
       ‣ “No data left behind”
       ‣ “Load and Go”
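The record model described here, where each record carries its own attributes and the overall model emerges as data is loaded, can be sketched with plain dictionaries. The `RecordStore` class and the sample records are hypothetical and only illustrate the idea; they say nothing about how Endeca Server is implemented.

```python
class RecordStore:
    """Toy schema-less store: records are dicts, the attribute catalogue is derived."""

    def __init__(self):
        self.records = []
        self.attributes = {}  # attribute name -> set of value type names seen so far

    def load(self, record):
        """Keep the record as-is and fold its attributes into the emerging model."""
        self.records.append(dict(record))
        for name, value in record.items():
            self.attributes.setdefault(name, set()).add(type(value).__name__)

    def having(self, name):
        """Return the records that carry the given attribute at all."""
        return [r for r in self.records if name in r]

store = RecordStore()
# "Jagged" data: no two of these records share the same set of attributes.
store.load({"flight": "DL123", "carrier": "DL", "arr_delay_min": 42})
store.load({"flight": "AA7", "carrier": "AA", "delay_reason": "weather"})
store.load({"tail_number": "N901DL", "manufacturer": "McDonnell Douglas", "model": "MD-88"})
```

Nothing is declared up front: after three loads `store.attributes` lists seven attributes, and a fourth record with a brand-new attribute would simply extend the catalogue.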
  33. Oracle EID Integrator and Studio
     • Data is loaded into Oracle Endeca Server datastores using Oracle EID Integrator
       ‣ Data Integration (ETL) tool built on open-source CloverETL tool (Eclipse framework)
       ‣ Oracle EID functionality provided through components that call Endeca Server web services
     • User Interface created and delivered using Oracle EID Studio, 100% web-based
       ‣ Create dashboards made up of search, navigation and data analysis components
       ‣ Also provides Endeca Server / Studio admin features
  34. Discovery Environment Provides Many Ways to Explore Data
     [Screenshot callouts:]
     • “San Fran” provides completion suggestions across all attributes, allowing us to discover many representations of “San Fran” that may be present in the data
     • Similarly, with each step of the data exploration, all available records in current filter set are summarised by facets - prompts for every available attribute, populated by only the valid values that will lead to non-empty results sets, allowing users to uncover relationships and patterns in the data using attributes that they may not even be aware existed
     • Unlike search engines, EID has a rich, SQL-like language for aggregating and calculations, to populate graphs and visualizations in the dashboard
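The facet behaviour described in the second callout, where prompts show only values that lead to non-empty result sets, follows from computing the facets over the currently filtered records. A minimal sketch, using hypothetical flight records and helper names (`apply_filters`, `facets`) that are not part of EID:

```python
from collections import Counter, defaultdict

# Hypothetical jagged records; not every record carries every attribute.
RECORDS = [
    {"carrier": "DL", "origin": "SFO", "aircraft": "MD-88"},
    {"carrier": "DL", "origin": "ATL", "aircraft": "MD-88"},
    {"carrier": "AA", "origin": "SFO", "aircraft": "777"},
    {"carrier": "AA", "origin": "DFW"},
]

def apply_filters(records, filters):
    """Keep records whose attributes match every selected (attribute, value) pair."""
    return [r for r in records if all(r.get(a) == v for a, v in filters.items())]

def facets(records):
    """Count the values present for each attribute in the current record set."""
    result = defaultdict(Counter)
    for record in records:
        for attribute, value in record.items():
            result[attribute][value] += 1
    return {attribute: dict(counter) for attribute, counter in result.items()}

current = apply_filters(RECORDS, {"carrier": "DL"})
refinements = facets(current)
```

After filtering on carrier "DL", the aircraft facet offers only "MD-88"; "777" is never shown because selecting it would return no records.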
  35. Text Analytics through EID Lexical / Parsing / Enrichment Features
     • Flexible key/value data model and unstructured text enrichment capabilities of EID allow text analytics to be combined with data discovery and analytics
     [Screenshot callouts:]
     • Most common issue for older, MD-88 aircraft is “corroded or cracked skin on the fuselage”
     • Whilst for newer 777, most common issue is “inoperative emergency lights in the cabin”
  36. Flight Delays Analysis using Endeca Studio
     Demonstration
  37. Exalytics OBIEE Dashboard : “Speed of Thought” Analysis
     [Screenshot callouts:]
     • 123m rows of data, analyzed live with detail in Oracle DB, and aggregates in TimesTen
     • Interactive visuals, in the form of maps, graphs, tables, scorecards and KPIs
     • “Go-less” prompts and dashboard controls for instant response to filter changes
  38. Using R to Answer In-Depth, Statistical Questions
     • Are some airports more prone to delays than others?
     • Are some days of the week likely to see fewer delays than others?
       ‣ Are these differences significant?
     • How do arrival delay distributions differ for the best and worst 3 airlines compared to the industry?
       ‣ Are there significant differences among airlines?
     • For American Airlines, how has the distribution of delays for departures and arrivals evolved over time?
     • How do average annual arrival delays compare across select airlines?
       ‣ What is the underlying trend for each airline?
  39. Preparing the Dataset for R, and Running R Queries
     • Create R frames using data from Oracle RDBMS, using ORE transparency layer
     • Create R queries to manipulate flight delays data
     • Build regression models
     • Score and rank data
     • 40 cores and 1TB RAM in Exalytics allows multiple R engines to be spawned, processing larger datasets than a desktop workstation could support

         # Restrict the ONTIME_S frame to six carriers
         ontimeSubset <- subset(ONTIME_S, UNIQUECARRIER %in%
                                c("AA", "AS", "CO", "DL", "WN", "NW"))
         # Mean arrival delay per carrier (rows) and year (columns)
         res22 <- with(ontimeSubset, tapply(ARRDELAY,
                       list(UNIQUECARRIER, YEAR), mean, na.rm = TRUE))
         g_range <- range(0, res22, na.rm = TRUE)
         rindex <- seq_len(nrow(res22))
         cindex <- seq_len(ncol(res22))
         # One panel per carrier, laid out in a 2 x 3 grid
         par(mfrow = c(2,3))
         for(i in rindex) {
           temp <- data.frame(index = cindex, avg_delay = res22[i,])
           plot(avg_delay ~ index, data = temp, col = "black",
                axes = FALSE, ylim = g_range, xlab = "", ylab = "",
                main = attr(res22, "dimnames")[[1]][i])
           axis(1, at = cindex, labels = attr(res22, "dimnames")[[2]])
           axis(2, at = 0:ceiling(g_range[2]))
           # Linear trend (green) and lowess smoother (red)
           abline(lm(avg_delay ~ index, data = temp), col = "green")
           lines(lowess(temp$index, temp$avg_delay), col="red")
         }
  40. Integrating with OBIEE and Oracle BI Publisher
     • R scripts can be embedded in BI Publisher data models
     • Results returned as image vectors in XML, and rendered as BI Publisher output
     • R scripts can also be referenced in functions etc and included in OBIEE RPD
  41. R Analysis Output within the OBIEE Dashboard
     [Screenshot callouts:]
     • Display flight delay per airport for top N busiest airports with parameters that are passed to live R engines, using R script in BIP data model
     • Regression analysis used to predict average delay for a route, using ORE integration within OBIEE BI Repository
  42. Summary
     • Analytics options available with Oracle Database and Oracle Fusion Middleware support a wide range of analytic tools and engines
     • Oracle Exalytics is an excellent platform to run these on, based on RAM and CPU #
     • OBIEE, with TimesTen for Exalytics and the Summary Advisor, supports “speed of thought” analytics using a rich, interactive dashboard
     • Endeca Information Discovery provides a search / analytic interface to enable you to discover the questions that need answering
     • Oracle R Enterprise, part of the Advanced Analytics Option for Oracle Database, enables deep analysis and insights using the R statistical language and environment
     • OBIEE can now connect to Hadoop/MapReduce, through Hive
     • A combined, integrated analysis toolset based on an Oracle Engineered System
