SlideShare a Scribd company logo
info@rittmanmead.com www.rittmanmead.com @rittmanmead
Social Network Analysis using NoSQL and Hadoop

(and why I didn’t use Oracle 12c Spatial & Graph)
Mark Rittman, CTO, Rittman Mead
OGh SQL Celebration Day, Netherlands, June 2016
info@rittmanmead.com www.rittmanmead.com @rittmanmead 2
•Oracle Gold Partner with offices in the UK and USA (Atlanta)

•70+ staff delivering Oracle BI, DW, Big Data and Advanced Analytics projects

•Oracle ACE Director (Mark Rittman, CTO) + 2 Oracle ACEs

•Significant web presence with the 

Rittman Mead Blog (http://www.rittmanmead.com)

•Regular sers of social media 

(Facebook, Twitter, Slideshare etc)

•Regular column in Oracle Magazine 

and other publications

•Hadoop R&D lab for “dogfooding” 

solutions developed for customers
About Rittman Mead
a confession…
My original plan
bad move
bad move
(not as bad as that one though)
but this is a good example

of when NoSQL + Hadoop
is a better solution
info@rittmanmead.com www.rittmanmead.com @rittmanmead 10
Business Scenario
•Rittman Mead want to understand drivers and audience for their website
‣What is our most popular content? Who are the most in-demand blog authors?
‣Who are the influencers? What communities exist around our web presence?
•Three data sources in scope:
RM Website Logs Twitter Stream Website Posts, Comments etc
info@rittmanmead.com www.rittmanmead.com @rittmanmead 11
•ODI provides an excellent framework for running Hadoop ETL jobs

‣ELT approach pushes transformations down to Hadoop - leveraging power of cluster

•Hive, HBase, Sqoop and OLH/ODCH KMs provide native Hadoop loading / transformation

‣Whilst still preserving RDBMS push-down

‣Extensible to cover Pig, Spark etc

•Process orchestration

•Data quality / error handling

•Metadata and model-driven

•New in 12.1.3.0.1 - ability to generate

Pig and Spark jobs too
Real-Time & Batch Log & Event Ingestion : ODI12c
info@rittmanmead.com www.rittmanmead.com @rittmanmead 12
•Initial iteration of project focused on capturing and ingesting web + social media activity

•Apache Flume used for capturing website hits, page views

•Twitter Streaming API used to capture tweets referring to RM website or RM staff

•Activity landed into Hadoop (HDFS), processed and enriched and presented using Hive
Overall Project Architecture - Phase 1
info@rittmanmead.com www.rittmanmead.com @rittmanmead 13
•Provided real-time counts of page views, correlated with Twitter activity stored in Hive tables

•Accessed using Oracle Big Data SQL +

joined to Oracle RDBMS reference data

•Delivered using OBIEE reports and dashboards

•Data Warehousing, but cheaper + real-time

•Answered questions such as

‣What are our most popular site pages?

‣Which pages attracted the most

attention on Twitter, Facebook?

‣What topics are popular?
Real-Time Metrics around Site Activity - “What?”
Combine with Oracle Big Data SQL
for structured OBIEE dashboard analysis
What pages are people visiting?
Who is referring to us on Twitter?
What content has the most reach?
info@rittmanmead.com www.rittmanmead.com @rittmanmead 14
•Good question - especially when Oracle
Database 12c can natively store JSON
documents

•But the real question is : “why use Oracle
Database for ingesting this data?”

‣It’s much more expensive and over-
engineered for the job compared to Hadoop

‣Being ACID-compliant, it’s going to incur
more overhead = less ingest capacity/hour

‣All the community and vendor property graph
expertise based around Hadoop
Why Not Use Oracle Database for This?
info@rittmanmead.com www.rittmanmead.com @rittmanmead 15
•Oracle Big Data Discovery used to go back to the raw event data add more meaning

•Enrich data, extract nouns + terms, add reference data from file, RDBMS etc

•Understand sentiment + meaning of tweets, link disparate + loosely coupled events

•Faceted search dashboards
Oracle BDD for Data Wrangling + Data Enrichment
info@rittmanmead.com www.rittmanmead.com @rittmanmead 16
Answered the “What” and “Why” Questions…
•Counts of page views, tweets, mentions etc helped us understand what content was popular
•Analysis of tweet sentiment, meaning and correlation with content answered why
Combine with Oracle Big Data SQL
for structured OBIEE dashboard analysis
Combine with site content, semantics, text enrichment
Catalog and explore using Oracle Big Data Discovery
What pages are people visiting?
Who is referring to us on Twitter?
What content has the most reach?
Why is some content more popular?
Does sentiment affect viewership?
What content is popular, where?
info@rittmanmead.com www.rittmanmead.com @rittmanmead 17
•Previous counts assumed that all tweet references equally important

•But some Twitter users are far more influential than others

‣Sit at the centre of a community, have 1000’s of followers

‣A reference by them has massive impact on page views

‣Positive or negative comments from them drive perception

•Can we identify them?

‣Potentially “reach out” with analyst program

‣Study what website posts go “viral”

‣Understand out audience, and the conversation, better
But Who Are The Influencers In Our Community?
Influencer	Identification
Communication	
Stream	(e.g.	tweets)
Find	out	people	that	are	
central in	the	given	
network	– e.g.	influencer	
marketing
info@rittmanmead.com www.rittmanmead.com @rittmanmead 18
•Rittman Mead website features many types of content

‣Blogs on BI, data integration, big data, data warehousing

‣Op-Eds (“OBIEE12c - Three Months In, What’s the Verdict?”)

‣Articles on a theme, e.g. performance tuning

‣Details of new courses, new promotions

•Different communities likely to form around these content types

•Different influencers and patterns of recommendation, discovery

•Can we identify some of the communities, segment our audience?
What Communities and Networks Are Our Audience?
Community	Detection
Identify	group	of	people	
that	are	close	to	each	other	
– e.g.	target	group	
marketing
info@rittmanmead.com www.rittmanmead.com @rittmanmead 19
Tabular (SQL) Query Tools Aimed at Counts + Aggs
info@rittmanmead.com www.rittmanmead.com @rittmanmead 20
Graph Example : RM Blog Post Referenced on Twitter
Lifting the Lid on OBIEE Internals with 

Linux Diagnostics Tools http://t.co/gFcUPOm5pI
00 0 0 Page Views10 0 0 Page Views
Follows
20 0 0 Page Views
Follows
30 0 0 Page Views
info@rittmanmead.com www.rittmanmead.com @rittmanmead 21
Network Effect Magnified by Extent of Social Graph
Lifting the Lid on OBIEE Internals with 

Linux Diagnostics Tools http://t.co/gFcUPOm5pI
30 0 0 Page Views70 0 5 Page Views
Lifting the Lid on OBIEE Internals with 

Linux Diagnostics Tools http://t.co/gFcUPOm5pI
info@rittmanmead.com www.rittmanmead.com @rittmanmead 22
Retweets, Mentions and Replies Create Communities
Retweet
Reply
Mention
Reply
#bigdatasql
Reply
Mention
Mention
Mention
Mention
#thatswhatshesaid
info@rittmanmead.com www.rittmanmead.com @rittmanmead 23
This is What’s Termed a “Property Graph”
Lifting the Lid on OBIEE Internals with 

Linux Diagnostics Tools http://t.co/gFcUPOm5pI
Mentions
Lifting the Lid on OBIEE Internals with 

Linux Diagnostics Tools http://t.co/gFcUPOm5pI
Retweets
Node, or “Vertex”
Directed Connection, or “Edge”
Node, or “Vertex”
info@rittmanmead.com www.rittmanmead.com @rittmanmead 24
•Different types of Twitter interaction could imply more or less “influence”

‣Retweet of another user’s Tweet 

implies that person is worth quoting

or you endorse their opinion

‣Reply to another user’s tweet 

could be a weaker recognition of 

that person’s opinion or view

‣Mention of a user in a tweet is a 

weaker recognition that they are 

part of a community / debate
Determining Influencers - Factors to Consider
info@rittmanmead.com www.rittmanmead.com @rittmanmead 25
Relative Importance of Edge Types Added via Weights
Lifting the Lid on OBIEE Internals with 

Linux Diagnostics Tools http://t.co/gFcUPOm5pI
Mentions, Weight = 30
Lifting the Lid on OBIEE Internals with 

Linux Diagnostics Tools http://t.co/gFcUPOm5pI
Retweet, Weight = 100
Edge Property
Edge Property
info@rittmanmead.com www.rittmanmead.com @rittmanmead 26
•Graph, spatial and raster data processing for big data

‣Primarily documented + tested against Oracle BDA

‣Installable on commodity cluster using CDH

•Data stored in Apache HBase or Oracle NoSQL DB

‣Complements Spatial & Graph in Oracle Database

‣Designed for trillions of nodes, edges etc

•Out-of-the-box spatial enrichment services

•Over 35 of most popular graph analysis functions

‣Graph traversal, recommendations

‣Finding communities and influencers, 

‣Pattern matching
Oracle Big Data Spatial & Graph
info@rittmanmead.com www.rittmanmead.com @rittmanmead 27
Why Not Use Oracle 12c + Spatial & Graph Option?
A different type of graph
info@rittmanmead.com www.rittmanmead.com @rittmanmead
•Data loaded from files or through Java API into HBase 

•In-Memory Analytics layer runs common graph and spatial algorithms on data

•Visualised using R or other

graphics packaged
Oracle Big Data Graph and Spatial Architecture
Massively Scalable Graph Store
• Oracle NoSQL
• HBase
Lightning-Fast In-Memory Analytics
• YARN Container
• Standalone Server
• Embedded
info@rittmanmead.com www.rittmanmead.com @rittmanmead 30
The Property Graph Model
info@rittmanmead.com www.rittmanmead.com @rittmanmead 31
Graph Types in Oracle Database Spatial & Graph
info@rittmanmead.com www.rittmanmead.com @rittmanmead 32
•ODI12c used to prepare two files in Oracle Flat File Format

‣Extracted vertices and edges from existing data in Hive

‣Wrote vertices (Twitter users) to .opv file, 

edges (RTs, replies etc) to .ope file

•For exercise, only considered 2-3 days of tweets

‣Did not include follows (user A followed user B)

as not reported by Twitter Streaming API

‣Could approximate larger follower networks through

multiplying weight of edge by follower scale

-Useful for Page Rank, but does it skew 

actual detection of influencers in exercise?
Preparing Vertices and Edges for Ingestion
info@rittmanmead.com www.rittmanmead.com @rittmanmead 33
Oracle Flat File Format Vertices and Edge Files
• Unique ID for the vertex
• Property name (“name”)
• Property value datatype (1 = String)
• Property value (“markrittman”)
Vertex File (.opv)
• Unique ID for the edge
• Leading edge vertex ID
• Trailing edge vertex ID
• Edge Type (“mentions”)
• Edge Property (“weight”)
• Edge Property datatype and value
Edge File (.ope)
info@rittmanmead.com www.rittmanmead.com @rittmanmead 34
cfg = GraphConfigBuilder.forPropertyGraphHbase() 
.setName("connectionsHBase") 
.setZkQuorum("bigdatalite").setZkClientPort(2181) 
.setZkSessionTimeout(120000).setInitialEdgeNumRegions(3) 
.setInitialVertexNumRegions(3).setSplitsPerRegion(1) 
.addEdgeProperty("weight", PropertyType.DOUBLE, "1000000") 
.build();
opg = OraclePropertyGraph.getInstance(cfg);
opg.clearRepository();
vfile="../../data/biwa_connections.opv"
efile="../../data/biwa_connections.ope"
opgdl=OraclePropertyGraphDataLoader.getInstance();
opgdl.loadData(opg, vfile, efile, 2);
// read through the vertices
opg.getVertices();
// read through the edges
opg.getEdges();
Loading Edges and Vertices into HBase
Uses “Gremlin” Shell for HBase
• Creates connection to HBase
• Sets initial configuration for database
• Builds the database ready for load
• Defines location of Vertex and Edge files
• Creates instance of 

OraclePropertyGraphDataLoader
• Loads data from files
• Prepares the property graph for use
• Loads in Edges and Vertices
• Now ready for in-memory processing
info@rittmanmead.com www.rittmanmead.com @rittmanmead 35
Choice of Persistent Stores for Property Graph Model
info@rittmanmead.com www.rittmanmead.com @rittmanmead 36
Accompanied by PGX : Graph Analysis Framework
(so that’s why I used Hadoop)
info@rittmanmead.com www.rittmanmead.com @rittmanmead 38
Calculating Most Influential Tweeters Using Page Rank
vOutput="/tmp/mygraph.opv"
eOutput="/tmp/mygraph.ope"
OraclePropertyGraphUtils.exportFlatFiles(opg, vOutput, eOutput, 2,
false);
session = Pgx.createSession("session-id-1");
analyst = session.createAnalyst();
graph = session.readGraphWithProperties(opg.getConfig());
rank = analyst.pagerank(graph, 0.001, 0.85, 100);
rank.getTopKValues(5);
==>PgxVertex with ID 1=0.13885623487462861
==>PgxVertex with ID 3=0.08686102641801993
==>PgxVertex with ID 101=0.06757752513733056
==>PgxVertex with ID 6=0.06743774001139484
==>PgxVertex with ID 37=0.0481517609757462
==>PgxVertex with ID 17=0.042234536894569276
==>PgxVertex with ID 29=0.04109794527311113
==>PgxVertex with ID 65=0.032058649698044187
==>PgxVertex with ID 15=0.023075360575195276
==>PgxVertex with ID 93=0.019265959946506813
• Initiates an in-memory analytics session
• Runs Page Rank algorithm to determine influencers
• Outputs top ten vertices (users)
Top 10 vertices
info@rittmanmead.com www.rittmanmead.com @rittmanmead 39
Calculating Most Influential Tweeters Using Page Rank
v1=opg.getVertex(1l); v2=opg.getVertex(3l); v3=opg.getVertex(101l); 
v4=opg.getVertex(6l); v5=opg.getVertex(37l); v6=opg.getVertex(17l); 
v7=opg.getVertex(29l); v8=opg.getVertex(65l); v9=opg.getVertex(15l); 
v10=opg.getVertex(93l);
System.out.println("Top 10 influencers: n " + v1.getProperty("name") + 
"n " + v2.getProperty("name") + 
"n " + v3.getProperty("name") + 
"n " + v4.getProperty("name") + 
"n " + v5.getProperty("name") + 
"n " + v6.getProperty("name") + 
"n " + v7.getProperty("name") + 
"n " + v8.getProperty("name") + 
"n " + v9.getProperty("name") + 
"n " + v10.getProperty("name"));
Top 10 influencers:
markrittman
rmoff
rittmanmead
mRainey
JeromeFr
Nephentur
borkur
BIExperte
i_m_dave
dw_pete
Note :
Over a 3-day period in May 2015
Twitter users referencing RM website + staff accounts
info@rittmanmead.com www.rittmanmead.com @rittmanmead 40
•Open source graph analysis tool with Oracle
Big Data Graph and Spatial Plug-in

•Available shortly from Oracle, connects to
Oracle NoSQL or HBase and runs Page
Rank etc

•Alternative to command-line for In-Memory
Analytics once base graph created
Visualising Property Graphs with Cityscape
info@rittmanmead.com www.rittmanmead.com @rittmanmead 41
Calculating Top 10 Users using Page Rank Algorithm
info@rittmanmead.com www.rittmanmead.com @rittmanmead 42
Visualising the Social Graph Around Particular Users
info@rittmanmead.com www.rittmanmead.com @rittmanmead 43
Detecting Clusters (Communities)
info@rittmanmead.com www.rittmanmead.com @rittmanmead 44
Calculating Shortest Path Between Users
info@rittmanmead.com www.rittmanmead.com @rittmanmead 45
Conclusions, and Further Reading
•Tools such as OBIEE are great for understanding what (counts, page views, popular items)
•Oracle Big Data Discovery can be useful for understanding “why?” (sentiment, terms etc)
•Graph Analysis can help answer “who”?
•Who are our audience? What are our communities? Who are their important influencers?
•Oracle Big Data Graph and Spatial can answer these questions to “big data” scale
•Articles on the Rittman Mead Blog
‣http://www.rittmanmead.com/category/oracle-big-data-appliance/
‣http://www.rittmanmead.com/category/big-data/
‣http://www.rittmanmead.com/category/oracle-big-data-discovery/
•Rittman Mead offer consulting, training and managed services for Oracle Big Data
‣http://www.rittmanmead.com/bigdata
info@rittmanmead.com www.rittmanmead.com @rittmanmead
Oracle Big Data Spatial & Graph

Social Media Analysis - Case Study
Mark Rittman, CTO, Rittman Mead
OGh SQL Celebration Day, Netherlands, June 2016

More Related Content

What's hot

SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future?
SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future?SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future?
SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future?
Mark Rittman
 
Oracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business Analytics
Oracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business AnalyticsOracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business Analytics
Oracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business Analytics
Mark Rittman
 
Unlock the value in your big data reservoir using oracle big data discovery a...
Unlock the value in your big data reservoir using oracle big data discovery a...Unlock the value in your big data reservoir using oracle big data discovery a...
Unlock the value in your big data reservoir using oracle big data discovery a...
Mark Rittman
 
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case StudyOracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
Mark Rittman
 
How a Tweet Went Viral - BIWA Summit 2017
How a Tweet Went Viral - BIWA Summit 2017How a Tweet Went Viral - BIWA Summit 2017
How a Tweet Went Viral - BIWA Summit 2017
Rittman Analytics
 
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
Rittman Analytics
 
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
StampedeCon
 
OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...
OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...
OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...
Mark Rittman
 
Turn Data Into Actionable Insights - StampedeCon 2016
Turn Data Into Actionable Insights - StampedeCon 2016Turn Data Into Actionable Insights - StampedeCon 2016
Turn Data Into Actionable Insights - StampedeCon 2016
StampedeCon
 
Big Data Architecture
Big Data ArchitectureBig Data Architecture
Big Data Architecture
Guido Schmutz
 
Big Data 2.0: ETL & Analytics: Implementing a next generation platform
Big Data 2.0: ETL & Analytics: Implementing a next generation platformBig Data 2.0: ETL & Analytics: Implementing a next generation platform
Big Data 2.0: ETL & Analytics: Implementing a next generation platform
Caserta
 
Lambda architecture for real time big data
Lambda architecture for real time big dataLambda architecture for real time big data
Lambda architecture for real time big data
Trieu Nguyen
 
Architecture of Big Data Solutions
Architecture of Big Data SolutionsArchitecture of Big Data Solutions
Architecture of Big Data Solutions
Guido Schmutz
 
A Walk Through the Kimball ETL Subsystems with Oracle Data Integration - Coll...
A Walk Through the Kimball ETL Subsystems with Oracle Data Integration - Coll...A Walk Through the Kimball ETL Subsystems with Oracle Data Integration - Coll...
A Walk Through the Kimball ETL Subsystems with Oracle Data Integration - Coll...
Michael Rainey
 
The Hidden Value of Hadoop Migration
The Hidden Value of Hadoop MigrationThe Hidden Value of Hadoop Migration
The Hidden Value of Hadoop Migration
Databricks
 
Designing the Next Generation Data Lake
Designing the Next Generation Data LakeDesigning the Next Generation Data Lake
Designing the Next Generation Data Lake
Robert Chong
 
Incorporating the Data Lake into Your Analytic Architecture
Incorporating the Data Lake into Your Analytic ArchitectureIncorporating the Data Lake into Your Analytic Architecture
Incorporating the Data Lake into Your Analytic Architecture
Caserta
 
Big Data on azure
Big Data on azureBig Data on azure
Big Data on azure
David Giard
 
Free Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseFree Training: How to Build a Lakehouse
Free Training: How to Build a Lakehouse
Databricks
 
How to get started in Big Data without Big Costs - StampedeCon 2016
How to get started in Big Data without Big Costs - StampedeCon 2016How to get started in Big Data without Big Costs - StampedeCon 2016
How to get started in Big Data without Big Costs - StampedeCon 2016
StampedeCon
 

What's hot (20)

SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future?
SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future?SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future?
SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future?
 
Oracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business Analytics
Oracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business AnalyticsOracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business Analytics
Oracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business Analytics
 
Unlock the value in your big data reservoir using oracle big data discovery a...
Unlock the value in your big data reservoir using oracle big data discovery a...Unlock the value in your big data reservoir using oracle big data discovery a...
Unlock the value in your big data reservoir using oracle big data discovery a...
 
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case StudyOracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
 
How a Tweet Went Viral - BIWA Summit 2017
How a Tweet Went Viral - BIWA Summit 2017How a Tweet Went Viral - BIWA Summit 2017
How a Tweet Went Viral - BIWA Summit 2017
 
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
 
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
 
OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...
OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...
OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...
 
Turn Data Into Actionable Insights - StampedeCon 2016
Turn Data Into Actionable Insights - StampedeCon 2016Turn Data Into Actionable Insights - StampedeCon 2016
Turn Data Into Actionable Insights - StampedeCon 2016
 
Big Data Architecture
Big Data ArchitectureBig Data Architecture
Big Data Architecture
 
Big Data 2.0: ETL & Analytics: Implementing a next generation platform
Big Data 2.0: ETL & Analytics: Implementing a next generation platformBig Data 2.0: ETL & Analytics: Implementing a next generation platform
Big Data 2.0: ETL & Analytics: Implementing a next generation platform
 
Lambda architecture for real time big data
Lambda architecture for real time big dataLambda architecture for real time big data
Lambda architecture for real time big data
 
Architecture of Big Data Solutions
Architecture of Big Data SolutionsArchitecture of Big Data Solutions
Architecture of Big Data Solutions
 
A Walk Through the Kimball ETL Subsystems with Oracle Data Integration - Coll...
A Walk Through the Kimball ETL Subsystems with Oracle Data Integration - Coll...A Walk Through the Kimball ETL Subsystems with Oracle Data Integration - Coll...
A Walk Through the Kimball ETL Subsystems with Oracle Data Integration - Coll...
 
The Hidden Value of Hadoop Migration
The Hidden Value of Hadoop MigrationThe Hidden Value of Hadoop Migration
The Hidden Value of Hadoop Migration
 
Designing the Next Generation Data Lake
Designing the Next Generation Data LakeDesigning the Next Generation Data Lake
Designing the Next Generation Data Lake
 
Incorporating the Data Lake into Your Analytic Architecture
Incorporating the Data Lake into Your Analytic ArchitectureIncorporating the Data Lake into Your Analytic Architecture
Incorporating the Data Lake into Your Analytic Architecture
 
Big Data on azure
Big Data on azureBig Data on azure
Big Data on azure
 
Free Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseFree Training: How to Build a Lakehouse
Free Training: How to Build a Lakehouse
 
How to get started in Big Data without Big Costs - StampedeCon 2016
How to get started in Big Data without Big Costs - StampedeCon 2016How to get started in Big Data without Big Costs - StampedeCon 2016
How to get started in Big Data without Big Costs - StampedeCon 2016
 

Viewers also liked

The Future of Analytics, Data Integration and BI on Big Data Platforms
The Future of Analytics, Data Integration and BI on Big Data PlatformsThe Future of Analytics, Data Integration and BI on Big Data Platforms
The Future of Analytics, Data Integration and BI on Big Data Platforms
Mark Rittman
 
Where are the slides?
Where are the slides?Where are the slides?
Where are the slides?
Robin Moffatt
 
Big Data Ready Enterprise
Big Data Ready Enterprise Big Data Ready Enterprise
Big Data Ready Enterprise
DataWorks Summit/Hadoop Summit
 
HPE Keynote Hadoop Summit San Jose 2016
HPE Keynote Hadoop Summit San Jose 2016HPE Keynote Hadoop Summit San Jose 2016
HPE Keynote Hadoop Summit San Jose 2016
DataWorks Summit/Hadoop Summit
 
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
DataWorks Summit/Hadoop Summit
 
Amr Training Certificates - 2002-2005-2010
Amr Training Certificates - 2002-2005-2010Amr Training Certificates - 2002-2005-2010
Amr Training Certificates - 2002-2005-2010Amr Sakran
 
Barrow_Quarterly_1997_Physical_Aspects_of_Stx_Radiosurgery
Barrow_Quarterly_1997_Physical_Aspects_of_Stx_RadiosurgeryBarrow_Quarterly_1997_Physical_Aspects_of_Stx_Radiosurgery
Barrow_Quarterly_1997_Physical_Aspects_of_Stx_RadiosurgeryJeffrey A. Fiedler
 
강원도팬션 국제항공권할인
강원도팬션 국제항공권할인강원도팬션 국제항공권할인
강원도팬션 국제항공권할인
foskfs
 
기업들의 Sns 활동 한계에 봉착했나 이제 시작인가
기업들의 Sns 활동 한계에 봉착했나 이제 시작인가기업들의 Sns 활동 한계에 봉착했나 이제 시작인가
기업들의 Sns 활동 한계에 봉착했나 이제 시작인가June Kim
 
La sostenibiltà
La sostenibiltàLa sostenibiltà
La sostenibiltàvoltafano
 
Live proud
Live proudLive proud
Live proud
Daron O'Brien
 
วารสาร พ.ย. ธ.ค.59
วารสาร พ.ย. ธ.ค.59วารสาร พ.ย. ธ.ค.59
วารสาร พ.ย. ธ.ค.59
Yui Yuyee
 
Wastek Teknologi (1)
Wastek Teknologi (1)Wastek Teknologi (1)
Wastek Teknologi (1)
Tedi Eka
 
Health in America and the World
Health in America and the WorldHealth in America and the World
Health in America and the WorldJohn Grant
 
Tanggar amanat ( nanti printkan setiap helaian 36 keping )
Tanggar  amanat ( nanti printkan setiap helaian 36 keping )Tanggar  amanat ( nanti printkan setiap helaian 36 keping )
Tanggar amanat ( nanti printkan setiap helaian 36 keping )
Nur Fatehah
 
PwC's - Redefining finance's role in the digital-age
PwC's - Redefining finance's role in the digital-agePwC's - Redefining finance's role in the digital-age
PwC's - Redefining finance's role in the digital-age
Todd DeStefano
 
Thanksgiving day
Thanksgiving dayThanksgiving day
Thanksgiving day
beatrizteacherbreton
 

Viewers also liked (20)

The Future of Analytics, Data Integration and BI on Big Data Platforms
The Future of Analytics, Data Integration and BI on Big Data PlatformsThe Future of Analytics, Data Integration and BI on Big Data Platforms
The Future of Analytics, Data Integration and BI on Big Data Platforms
 
Where are the slides?
Where are the slides?Where are the slides?
Where are the slides?
 
Big Data Ready Enterprise
Big Data Ready Enterprise Big Data Ready Enterprise
Big Data Ready Enterprise
 
HPE Keynote Hadoop Summit San Jose 2016
HPE Keynote Hadoop Summit San Jose 2016HPE Keynote Hadoop Summit San Jose 2016
HPE Keynote Hadoop Summit San Jose 2016
 
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
 
Amr Training Certificates - 2002-2005-2010
Amr Training Certificates - 2002-2005-2010Amr Training Certificates - 2002-2005-2010
Amr Training Certificates - 2002-2005-2010
 
Barrow_Quarterly_1997_Physical_Aspects_of_Stx_Radiosurgery
Barrow_Quarterly_1997_Physical_Aspects_of_Stx_RadiosurgeryBarrow_Quarterly_1997_Physical_Aspects_of_Stx_Radiosurgery
Barrow_Quarterly_1997_Physical_Aspects_of_Stx_Radiosurgery
 
강원도팬션 국제항공권할인
강원도팬션 국제항공권할인강원도팬션 국제항공권할인
강원도팬션 국제항공권할인
 
Cara membahagi pusaka
Cara membahagi  pusakaCara membahagi  pusaka
Cara membahagi pusaka
 
Tm31
Tm31Tm31
Tm31
 
기업들의 Sns 활동 한계에 봉착했나 이제 시작인가
기업들의 Sns 활동 한계에 봉착했나 이제 시작인가기업들의 Sns 활동 한계에 봉착했나 이제 시작인가
기업들의 Sns 활동 한계에 봉착했나 이제 시작인가
 
Posting responses to discussion forums in moodle (for teachers)
Posting responses to discussion forums in moodle (for teachers)Posting responses to discussion forums in moodle (for teachers)
Posting responses to discussion forums in moodle (for teachers)
 
La sostenibiltà
La sostenibiltàLa sostenibiltà
La sostenibiltà
 
Live proud
Live proudLive proud
Live proud
 
วารสาร พ.ย. ธ.ค.59
วารสาร พ.ย. ธ.ค.59วารสาร พ.ย. ธ.ค.59
วารสาร พ.ย. ธ.ค.59
 
Wastek Teknologi (1)
Wastek Teknologi (1)Wastek Teknologi (1)
Wastek Teknologi (1)
 
Health in America and the World
Health in America and the WorldHealth in America and the World
Health in America and the World
 
Tanggar amanat ( nanti printkan setiap helaian 36 keping )
Tanggar  amanat ( nanti printkan setiap helaian 36 keping )Tanggar  amanat ( nanti printkan setiap helaian 36 keping )
Tanggar amanat ( nanti printkan setiap helaian 36 keping )
 
PwC's - Redefining finance's role in the digital-age
PwC's - Redefining finance's role in the digital-agePwC's - Redefining finance's role in the digital-age
PwC's - Redefining finance's role in the digital-age
 
Thanksgiving day
Thanksgiving dayThanksgiving day
Thanksgiving day
 

Similar to Social Network Analysis using Oracle Big Data Spatial & Graph (incl. why I didn't use Oracle Database 12c for this)

IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
Mark Rittman
 
OTN EMEA TOUR 2016 - OBIEE12c New Features for End-Users, Developers and Sys...
OTN EMEA TOUR 2016  - OBIEE12c New Features for End-Users, Developers and Sys...OTN EMEA TOUR 2016  - OBIEE12c New Features for End-Users, Developers and Sys...
OTN EMEA TOUR 2016 - OBIEE12c New Features for End-Users, Developers and Sys...
Mark Rittman
 
Building the Inform Semantic Publishing Ecosystem: from Author to Audience
Building the Inform Semantic Publishing Ecosystem: from Author to AudienceBuilding the Inform Semantic Publishing Ecosystem: from Author to Audience
Building the Inform Semantic Publishing Ecosystem: from Author to Audience
Vital.AI
 
The Maturity Model: Taking the Growing Pains Out of Hadoop
The Maturity Model: Taking the Growing Pains Out of HadoopThe Maturity Model: Taking the Growing Pains Out of Hadoop
The Maturity Model: Taking the Growing Pains Out of Hadoop
Inside Analysis
 
Semantics and Machine Learning
Semantics and Machine LearningSemantics and Machine Learning
Semantics and Machine Learning
Vladimir Alexiev, PhD, PMP
 
Practical Tips for Oracle Business Intelligence Applications 11g Implementations
Practical Tips for Oracle Business Intelligence Applications 11g ImplementationsPractical Tips for Oracle Business Intelligence Applications 11g Implementations
Practical Tips for Oracle Business Intelligence Applications 11g Implementations
Michael Rainey
 
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
Mark Rittman
 
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
Rittman Analytics
 
Social Media, Cloud Computing, Machine Learning, Open Source, and Big Data An...
Social Media, Cloud Computing, Machine Learning, Open Source, and Big Data An...Social Media, Cloud Computing, Machine Learning, Open Source, and Big Data An...
Social Media, Cloud Computing, Machine Learning, Open Source, and Big Data An...Open Analytics
 
Open Data Summit Presentation by Joe Olsen
Open Data Summit Presentation by Joe OlsenOpen Data Summit Presentation by Joe Olsen
Open Data Summit Presentation by Joe OlsenChristopher Whitaker
 
Using Schema-Less Data API for Building Self-Service Analytics
Using Schema-Less Data API for Building Self-Service AnalyticsUsing Schema-Less Data API for Building Self-Service Analytics
Using Schema-Less Data API for Building Self-Service Analytics
Dionizas Antipenkovas
 
ODI 11g in the Enterprise - BIWA 2013
ODI 11g in the Enterprise - BIWA 2013ODI 11g in the Enterprise - BIWA 2013
ODI 11g in the Enterprise - BIWA 2013
Mark Rittman
 
Building Better Analytics Workflows (Strata-Hadoop World 2013)
Building Better Analytics Workflows (Strata-Hadoop World 2013)Building Better Analytics Workflows (Strata-Hadoop World 2013)
Building Better Analytics Workflows (Strata-Hadoop World 2013)
Wes McKinney
 
Analyzing a Link with Google's Eyes by Matteo Monari
Analyzing a Link with Google's Eyes by Matteo MonariAnalyzing a Link with Google's Eyes by Matteo Monari
Analyzing a Link with Google's Eyes by Matteo MonariBizup
 
Bi on Big Data - Strata 2016 in London
Bi on Big Data - Strata 2016 in LondonBi on Big Data - Strata 2016 in London
Bi on Big Data - Strata 2016 in London
Dremio Corporation
 
Unlocking the Value of Your Data Lake
Unlocking the Value of Your Data LakeUnlocking the Value of Your Data Lake
Unlocking the Value of Your Data Lake
DATAVERSITY
 
OBIEE, Endeca, Hadoop and ORE Development (on Exalytics) (ODTUG 2013)
OBIEE, Endeca, Hadoop and ORE Development (on Exalytics) (ODTUG 2013)OBIEE, Endeca, Hadoop and ORE Development (on Exalytics) (ODTUG 2013)
OBIEE, Endeca, Hadoop and ORE Development (on Exalytics) (ODTUG 2013)
Mark Rittman
 
From BI Developer to Data Engineer with Oracle Analytics Cloud Data Lake Edition
From BI Developer to Data Engineer with Oracle Analytics Cloud Data Lake EditionFrom BI Developer to Data Engineer with Oracle Analytics Cloud Data Lake Edition
From BI Developer to Data Engineer with Oracle Analytics Cloud Data Lake Edition
Rittman Analytics
 
Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015
Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015
Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015
Mark Rittman
 
Architecting for Big Data: Trends, Tips, and Deployment Options
Architecting for Big Data: Trends, Tips, and Deployment OptionsArchitecting for Big Data: Trends, Tips, and Deployment Options
Architecting for Big Data: Trends, Tips, and Deployment Options
Caserta
 

Similar to Social Network Analysis using Oracle Big Data Spatial & Graph (incl. why I didn't use Oracle Database 12c for this) (20)

IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
 
OTN EMEA TOUR 2016 - OBIEE12c New Features for End-Users, Developers and Sys...
OTN EMEA TOUR 2016  - OBIEE12c New Features for End-Users, Developers and Sys...OTN EMEA TOUR 2016  - OBIEE12c New Features for End-Users, Developers and Sys...
OTN EMEA TOUR 2016 - OBIEE12c New Features for End-Users, Developers and Sys...
 
Building the Inform Semantic Publishing Ecosystem: from Author to Audience
Building the Inform Semantic Publishing Ecosystem: from Author to AudienceBuilding the Inform Semantic Publishing Ecosystem: from Author to Audience
Building the Inform Semantic Publishing Ecosystem: from Author to Audience
 
The Maturity Model: Taking the Growing Pains Out of Hadoop
The Maturity Model: Taking the Growing Pains Out of HadoopThe Maturity Model: Taking the Growing Pains Out of Hadoop
The Maturity Model: Taking the Growing Pains Out of Hadoop
 
Semantics and Machine Learning
Semantics and Machine LearningSemantics and Machine Learning
Semantics and Machine Learning
 
Practical Tips for Oracle Business Intelligence Applications 11g Implementations
Practical Tips for Oracle Business Intelligence Applications 11g ImplementationsPractical Tips for Oracle Business Intelligence Applications 11g Implementations
Practical Tips for Oracle Business Intelligence Applications 11g Implementations
 
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
 
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
Data Integration and Data Warehousing for Cloud, Big Data and IoT: 
What’s Ne...
 
Social Media, Cloud Computing, Machine Learning, Open Source, and Big Data An...
Social Media, Cloud Computing, Machine Learning, Open Source, and Big Data An...Social Media, Cloud Computing, Machine Learning, Open Source, and Big Data An...
Social Media, Cloud Computing, Machine Learning, Open Source, and Big Data An...
 
Open Data Summit Presentation by Joe Olsen
Open Data Summit Presentation by Joe OlsenOpen Data Summit Presentation by Joe Olsen
Open Data Summit Presentation by Joe Olsen
 
Using Schema-Less Data API for Building Self-Service Analytics
Using Schema-Less Data API for Building Self-Service AnalyticsUsing Schema-Less Data API for Building Self-Service Analytics
Using Schema-Less Data API for Building Self-Service Analytics
 
ODI 11g in the Enterprise - BIWA 2013
ODI 11g in the Enterprise - BIWA 2013ODI 11g in the Enterprise - BIWA 2013
ODI 11g in the Enterprise - BIWA 2013
 
Building Better Analytics Workflows (Strata-Hadoop World 2013)
Building Better Analytics Workflows (Strata-Hadoop World 2013)Building Better Analytics Workflows (Strata-Hadoop World 2013)
Building Better Analytics Workflows (Strata-Hadoop World 2013)
 
Analyzing a Link with Google's Eyes by Matteo Monari
Analyzing a Link with Google's Eyes by Matteo MonariAnalyzing a Link with Google's Eyes by Matteo Monari
Analyzing a Link with Google's Eyes by Matteo Monari
 
Bi on Big Data - Strata 2016 in London
Bi on Big Data - Strata 2016 in LondonBi on Big Data - Strata 2016 in London
Bi on Big Data - Strata 2016 in London
 
Unlocking the Value of Your Data Lake
Unlocking the Value of Your Data LakeUnlocking the Value of Your Data Lake
Unlocking the Value of Your Data Lake
 
OBIEE, Endeca, Hadoop and ORE Development (on Exalytics) (ODTUG 2013)
OBIEE, Endeca, Hadoop and ORE Development (on Exalytics) (ODTUG 2013)OBIEE, Endeca, Hadoop and ORE Development (on Exalytics) (ODTUG 2013)
OBIEE, Endeca, Hadoop and ORE Development (on Exalytics) (ODTUG 2013)
 
From BI Developer to Data Engineer with Oracle Analytics Cloud Data Lake Edition
From BI Developer to Data Engineer with Oracle Analytics Cloud Data Lake EditionFrom BI Developer to Data Engineer with Oracle Analytics Cloud Data Lake Edition
From BI Developer to Data Engineer with Oracle Analytics Cloud Data Lake Edition
 
Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015
Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015
Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015
 
Architecting for Big Data: Trends, Tips, and Deployment Options
Architecting for Big Data: Trends, Tips, and Deployment OptionsArchitecting for Big Data: Trends, Tips, and Deployment Options
Architecting for Big Data: Trends, Tips, and Deployment Options
 

More from Mark Rittman

Deploying Full BI Platforms to Oracle Cloud
Deploying Full BI Platforms to Oracle CloudDeploying Full BI Platforms to Oracle Cloud
Deploying Full BI Platforms to Oracle Cloud
Mark Rittman
 
Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree...
Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree...Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree...
Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree...
Mark Rittman
 
What is Big Data Discovery, and how it complements traditional business anal...
What is Big Data Discovery, and how it complements  traditional business anal...What is Big Data Discovery, and how it complements  traditional business anal...
What is Big Data Discovery, and how it complements traditional business anal...
Mark Rittman
 
Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar...
Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar...Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar...
Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar...
Mark Rittman
 
End to-end hadoop development using OBIEE, ODI, Oracle Big Data SQL and Oracl...
End to-end hadoop development using OBIEE, ODI, Oracle Big Data SQL and Oracl...End to-end hadoop development using OBIEE, ODI, Oracle Big Data SQL and Oracl...
End to-end hadoop development using OBIEE, ODI, Oracle Big Data SQL and Oracl...
Mark Rittman
 
OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015
OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015
OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015
Mark Rittman
 
BIWA2015 - Bringing Oracle Big Data SQL to OBIEE and ODI
BIWA2015 - Bringing Oracle Big Data SQL to OBIEE and ODIBIWA2015 - Bringing Oracle Big Data SQL to OBIEE and ODI
BIWA2015 - Bringing Oracle Big Data SQL to OBIEE and ODI
Mark Rittman
 
OGH 2015 - Hadoop (Oracle BDA) and Oracle Technologies on BI Projects
OGH 2015 - Hadoop (Oracle BDA) and Oracle Technologies on BI ProjectsOGH 2015 - Hadoop (Oracle BDA) and Oracle Technologies on BI Projects
OGH 2015 - Hadoop (Oracle BDA) and Oracle Technologies on BI Projects
Mark Rittman
 
UKOUG Tech'14 Super Sunday : Deep-Dive into Big Data ETL with ODI12c
UKOUG Tech'14 Super Sunday : Deep-Dive into Big Data ETL with ODI12cUKOUG Tech'14 Super Sunday : Deep-Dive into Big Data ETL with ODI12c
UKOUG Tech'14 Super Sunday : Deep-Dive into Big Data ETL with ODI12c
Mark Rittman
 
Part 1 - Introduction to Hadoop and Big Data Technologies for Oracle BI & DW ...
Part 1 - Introduction to Hadoop and Big Data Technologies for Oracle BI & DW ...Part 1 - Introduction to Hadoop and Big Data Technologies for Oracle BI & DW ...
Part 1 - Introduction to Hadoop and Big Data Technologies for Oracle BI & DW ...
Mark Rittman
 
Part 4 - Hadoop Data Output and Reporting using OBIEE11g
Part 4 - Hadoop Data Output and Reporting using OBIEE11gPart 4 - Hadoop Data Output and Reporting using OBIEE11g
Part 4 - Hadoop Data Output and Reporting using OBIEE11g
Mark Rittman
 
Part 2 - Hadoop Data Loading using Hadoop Tools and ODI12c
Part 2 - Hadoop Data Loading using Hadoop Tools and ODI12cPart 2 - Hadoop Data Loading using Hadoop Tools and ODI12c
Part 2 - Hadoop Data Loading using Hadoop Tools and ODI12c
Mark Rittman
 

More from Mark Rittman (12)

Deploying Full BI Platforms to Oracle Cloud
Deploying Full BI Platforms to Oracle CloudDeploying Full BI Platforms to Oracle Cloud
Deploying Full BI Platforms to Oracle Cloud
 
Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree...
Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree...Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree...
Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree...
 
What is Big Data Discovery, and how it complements traditional business anal...
What is Big Data Discovery, and how it complements  traditional business anal...What is Big Data Discovery, and how it complements  traditional business anal...
What is Big Data Discovery, and how it complements traditional business anal...
 
Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar...
Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar...Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar...
Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar...
 
End to-end hadoop development using OBIEE, ODI, Oracle Big Data SQL and Oracl...
End to-end hadoop development using OBIEE, ODI, Oracle Big Data SQL and Oracl...End to-end hadoop development using OBIEE, ODI, Oracle Big Data SQL and Oracl...
End to-end hadoop development using OBIEE, ODI, Oracle Big Data SQL and Oracl...
 
OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015
OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015
OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015
 
BIWA2015 - Bringing Oracle Big Data SQL to OBIEE and ODI
BIWA2015 - Bringing Oracle Big Data SQL to OBIEE and ODIBIWA2015 - Bringing Oracle Big Data SQL to OBIEE and ODI
BIWA2015 - Bringing Oracle Big Data SQL to OBIEE and ODI
 
OGH 2015 - Hadoop (Oracle BDA) and Oracle Technologies on BI Projects
OGH 2015 - Hadoop (Oracle BDA) and Oracle Technologies on BI ProjectsOGH 2015 - Hadoop (Oracle BDA) and Oracle Technologies on BI Projects
OGH 2015 - Hadoop (Oracle BDA) and Oracle Technologies on BI Projects
 
UKOUG Tech'14 Super Sunday : Deep-Dive into Big Data ETL with ODI12c
UKOUG Tech'14 Super Sunday : Deep-Dive into Big Data ETL with ODI12cUKOUG Tech'14 Super Sunday : Deep-Dive into Big Data ETL with ODI12c
UKOUG Tech'14 Super Sunday : Deep-Dive into Big Data ETL with ODI12c
 
Part 1 - Introduction to Hadoop and Big Data Technologies for Oracle BI & DW ...
Part 1 - Introduction to Hadoop and Big Data Technologies for Oracle BI & DW ...Part 1 - Introduction to Hadoop and Big Data Technologies for Oracle BI & DW ...
Part 1 - Introduction to Hadoop and Big Data Technologies for Oracle BI & DW ...
 
Part 4 - Hadoop Data Output and Reporting using OBIEE11g
Part 4 - Hadoop Data Output and Reporting using OBIEE11gPart 4 - Hadoop Data Output and Reporting using OBIEE11g
Part 4 - Hadoop Data Output and Reporting using OBIEE11g
 
Part 2 - Hadoop Data Loading using Hadoop Tools and ODI12c
Part 2 - Hadoop Data Loading using Hadoop Tools and ODI12cPart 2 - Hadoop Data Loading using Hadoop Tools and ODI12c
Part 2 - Hadoop Data Loading using Hadoop Tools and ODI12c
 

Recently uploaded

一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
slg6lamcq
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
ahzuo
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
oz8q3jxlp
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
jerlynmaetalle
 
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTESAdjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
Subhajit Sahu
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
g4dpvqap0
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
manishkhaire30
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptxData_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
AnirbanRoy608946
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
jerlynmaetalle
 
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdfUnleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Enterprise Wired
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
NABLAS株式会社
 
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdfEnhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
GetInData
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
Roger Valdez
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
javier ramirez
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
dwreak4tg
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
AbhimanyuSinha9
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
John Andrews
 

Recently uploaded (20)

一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTESAdjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptxData_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
 
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdfUnleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdfEnhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
 

Social Network Analysis using Oracle Big Data Spatial & Graph (incl. why I didn't use Oracle Database 12c for this)

  • 1. info@rittmanmead.com www.rittmanmead.com @rittmanmead Social Network Analysis using NoSQL and Hadoop
 (and why I didn’t use Oracle 12c Spatial & Graph) Mark Rittman, CTO, Rittman Mead OGh SQL Celebration Day, Netherlands, June 2016
  • 2. info@rittmanmead.com www.rittmanmead.com @rittmanmead 2 •Oracle Gold Partner with offices in the UK and USA (Atlanta) •70+ staff delivering Oracle BI, DW, Big Data and Advanced Analytics projects •Oracle ACE Director (Mark Rittman, CTO) + 2 Oracle ACEs •Significant web presence with the 
 Rittman Mead Blog (http://www.rittmanmead.com) •Regular sers of social media 
 (Facebook, Twitter, Slideshare etc) •Regular column in Oracle Magazine 
 and other publications •Hadoop R&D lab for “dogfooding” 
 solutions developed for customers About Rittman Mead
  • 5.
  • 8. (not as bad as that one though)
  • 9. but this is a good example
 of when NoSQL + Hadoop is a better solution
  • 10. info@rittmanmead.com www.rittmanmead.com @rittmanmead 10 Business Scenario •Rittman Mead want to understand drivers and audience for their website ‣What is our most popular content? Who are the most in-demand blog authors? ‣Who are the influencers? What communities exist around our web presence? •Three data sources in scope: RM Website Logs Twitter Stream Website Posts, Comments etc
  • 11. info@rittmanmead.com www.rittmanmead.com @rittmanmead 11 •ODI provides an excellent framework for running Hadoop ETL jobs ‣ELT approach pushes transformations down to Hadoop - leveraging power of cluster •Hive, HBase, Sqoop and OLH/ODCH KMs provide native Hadoop loading / transformation ‣Whilst still preserving RDBMS push-down ‣Extensible to cover Pig, Spark etc •Process orchestration •Data quality / error handling •Metadata and model-driven •New in 12.1.3.0.1 - ability to generate
 Pig and Spark jobs too Real-Time & Batch Log & Event Ingestion : ODI12c
  • 12. info@rittmanmead.com www.rittmanmead.com @rittmanmead 12 •Initial iteration of project focused on capturing and ingesting web + social media activity •Apache Flume used for capturing website hits, page views •Twitter Streaming API used to capture tweets referring to RM website or RM staff •Activity landed into Hadoop (HDFS), processed and enriched and presented using Hive Overall Project Architecture - Phase 1
  • 13. info@rittmanmead.com www.rittmanmead.com @rittmanmead 13 •Provided real-time counts of page views, correlated with Twitter activity stored in Hive tables •Accessed using Oracle Big Data SQL +
 joined to Oracle RDBMS reference data •Delivered using OBIEE reports and dashboards •Data Warehousing, but cheaper + real-time •Answered questions such as ‣What are our most popular site pages? ‣Which pages attracted the most
 attention on Twitter, Facebook? ‣What topics are popular? Real-Time Metrics around Site Activity - “What?” Combine with Oracle Big Data SQL for structured OBIEE dashboard analysis What pages are people visiting? Who is referring to us on Twitter? What content has the most reach?
  • 14. info@rittmanmead.com www.rittmanmead.com @rittmanmead 14 •Good question - especially when Oracle Database 12c can natively store JSON documents •But the real question is : “why use Oracle Database for ingesting this data?” ‣It’s much more expensive and over- engineered for the job compared to Hadoop ‣Being ACID-compliant, it’s going to incur more overhead = less ingest capacity/hour ‣All the community and vendor property graph expertise based around Hadoop Why Not Use Oracle Database for This?
  • 15. info@rittmanmead.com www.rittmanmead.com @rittmanmead 15 •Oracle Big Data Discovery used to go back to the raw event data add more meaning •Enrich data, extract nouns + terms, add reference data from file, RDBMS etc •Understand sentiment + meaning of tweets, link disparate + loosely coupled events •Faceted search dashboards Oracle BDD for Data Wrangling + Data Enrichment
  • 16. info@rittmanmead.com www.rittmanmead.com @rittmanmead 16 Answered the “What” and “Why” Questions… •Counts of page views, tweets, mentions etc helped us understand what content was popular •Analysis of tweet sentiment, meaning and correlation with content answered why Combine with Oracle Big Data SQL for structured OBIEE dashboard analysis Combine with site content, semantics, text enrichment Catalog and explore using Oracle Big Data Discovery What pages are people visiting? Who is referring to us on Twitter? What content has the most reach? Why is some content more popular? Does sentiment affect viewership? What content is popular, where?
  • 17. info@rittmanmead.com www.rittmanmead.com @rittmanmead 17 •Previous counts assumed that all tweet references equally important •But some Twitter users are far more influential than others ‣Sit at the centre of a community, have 1000’s of followers ‣A reference by them has massive impact on page views ‣Positive or negative comments from them drive perception •Can we identify them? ‣Potentially “reach out” with analyst program ‣Study what website posts go “viral” ‣Understand out audience, and the conversation, better But Who Are The Influencers In Our Community? Influencer Identification Communication Stream (e.g. tweets) Find out people that are central in the given network – e.g. influencer marketing
  • 18. info@rittmanmead.com www.rittmanmead.com @rittmanmead 18 •Rittman Mead website features many types of content ‣Blogs on BI, data integration, big data, data warehousing ‣Op-Eds (“OBIEE12c - Three Months In, What’s the Verdict?”) ‣Articles on a theme, e.g. performance tuning ‣Details of new courses, new promotions •Different communities likely to form around these content types •Different influencers and patterns of recommendation, discovery •Can we identify some of the communities, segment our audience? What Communities and Networks Are Our Audience? Community Detection Identify group of people that are close to each other – e.g. target group marketing
  • 19. info@rittmanmead.com www.rittmanmead.com @rittmanmead 19 Tabular (SQL) Query Tools Aimed at Counts + Aggs
  • 20. info@rittmanmead.com www.rittmanmead.com @rittmanmead 20 Graph Example : RM Blog Post Referenced on Twitter Lifting the Lid on OBIEE Internals with 
 Linux Diagnostics Tools http://t.co/gFcUPOm5pI 00 0 0 Page Views10 0 0 Page Views Follows 20 0 0 Page Views Follows 30 0 0 Page Views
  • 21. info@rittmanmead.com www.rittmanmead.com @rittmanmead 21 Network Effect Magnified by Extent of Social Graph Lifting the Lid on OBIEE Internals with 
 Linux Diagnostics Tools http://t.co/gFcUPOm5pI 30 0 0 Page Views70 0 5 Page Views Lifting the Lid on OBIEE Internals with 
 Linux Diagnostics Tools http://t.co/gFcUPOm5pI
  • 22. info@rittmanmead.com www.rittmanmead.com @rittmanmead 22 Retweets, Mentions and Replies Create Communities Retweet Reply Mention Reply #bigdatasql Reply Mention Mention Mention Mention #thatswhatshesaid
  • 23. info@rittmanmead.com www.rittmanmead.com @rittmanmead 23 This is What’s Termed a “Property Graph” Lifting the Lid on OBIEE Internals with 
 Linux Diagnostics Tools http://t.co/gFcUPOm5pI Mentions Lifting the Lid on OBIEE Internals with 
 Linux Diagnostics Tools http://t.co/gFcUPOm5pI Retweets Node, or “Vertex” Directed Connection, or “Edge” Node, or “Vertex”
  • 24. info@rittmanmead.com www.rittmanmead.com @rittmanmead 24 •Different types of Twitter interaction could imply more or less “influence”
 ‣Retweet of another user’s Tweet 
 implies that person is worth quoting
 or you endorse their opinion
 ‣Reply to another user’s tweet 
 could be a weaker recognition of 
 that person’s opinion or view
 ‣Mention of a user in a tweet is a 
 weaker recognition that they are 
 part of a community / debate Determining Influencers - Factors to Consider
  • 25. info@rittmanmead.com www.rittmanmead.com @rittmanmead 25 Relative Importance of Edge Types Added via Weights Lifting the Lid on OBIEE Internals with 
 Linux Diagnostics Tools http://t.co/gFcUPOm5pI Mentions, Weight = 30 Lifting the Lid on OBIEE Internals with 
 Linux Diagnostics Tools http://t.co/gFcUPOm5pI Retweet, Weight = 100 Edge Property Edge Property
  • 26. info@rittmanmead.com www.rittmanmead.com @rittmanmead 26 •Graph, spatial and raster data processing for big data ‣Primarily documented + tested against Oracle BDA ‣Installable on commodity cluster using CDH •Data stored in Apache HBase or Oracle NoSQL DB ‣Complements Spatial & Graph in Oracle Database ‣Designed for trillions of nodes, edges etc •Out-of-the-box spatial enrichment services •Over 35 of most popular graph analysis functions ‣Graph traversal, recommendations ‣Finding communities and influencers, ‣Pattern matching Oracle Big Data Spatial & Graph
  • 27. info@rittmanmead.com www.rittmanmead.com @rittmanmead 27 Why Not Use Oracle 12c + Spatial & Graph Option?
  • 28. A different type of graph
  • 29. info@rittmanmead.com www.rittmanmead.com @rittmanmead •Data loaded from files or through Java API into HBase •In-Memory Analytics layer runs common graph and spatial algorithms on data •Visualised using R or other
 graphics packaged Oracle Big Data Graph and Spatial Architecture Massively Scalable Graph Store • Oracle NoSQL • HBase Lightning-Fast In-Memory Analytics • YARN Container • Standalone Server • Embedded
  • 31. info@rittmanmead.com www.rittmanmead.com @rittmanmead 31 Graph Types in Oracle Database Spatial & Graph
  • 32. info@rittmanmead.com www.rittmanmead.com @rittmanmead 32 •ODI12c used to prepare two files in Oracle Flat File Format ‣Extracted vertices and edges from existing data in Hive ‣Wrote vertices (Twitter users) to .opv file, 
 edges (RTs, replies etc) to .ope file •For exercise, only considered 2-3 days of tweets ‣Did not include follows (user A followed user B)
 as not reported by Twitter Streaming API ‣Could approximate larger follower networks through
 multiplying weight of edge by follower scale -Useful for Page Rank, but does it skew 
 actual detection of influencers in exercise? Preparing Vertices and Edges for Ingestion
  • 33. info@rittmanmead.com www.rittmanmead.com @rittmanmead 33 Oracle Flat File Format Vertices and Edge Files • Unique ID for the vertex • Property name (“name”) • Property value datatype (1 = String) • Property value (“markrittman”) Vertex File (.opv) • Unique ID for the edge • Leading edge vertex ID • Trailing edge vertex ID • Edge Type (“mentions”) • Edge Property (“weight”) • Edge Property datatype and value Edge File (.ope)
  • 34. info@rittmanmead.com www.rittmanmead.com @rittmanmead 34 cfg = GraphConfigBuilder.forPropertyGraphHbase() .setName("connectionsHBase") .setZkQuorum("bigdatalite").setZkClientPort(2181) .setZkSessionTimeout(120000).setInitialEdgeNumRegions(3) .setInitialVertexNumRegions(3).setSplitsPerRegion(1) .addEdgeProperty("weight", PropertyType.DOUBLE, "1000000") .build(); opg = OraclePropertyGraph.getInstance(cfg); opg.clearRepository(); vfile="../../data/biwa_connections.opv" efile="../../data/biwa_connections.ope" opgdl=OraclePropertyGraphDataLoader.getInstance(); opgdl.loadData(opg, vfile, efile, 2); // read through the vertices opg.getVertices(); // read through the edges opg.getEdges(); Loading Edges and Vertices into HBase Uses “Gremlin” Shell for HBase • Creates connection to HBase • Sets initial configuration for database • Builds the database ready for load • Defines location of Vertex and Edge files • Creates instance of 
 OraclePropertyGraphDataLoader • Loads data from files • Prepares the property graph for use • Loads in Edges and Vertices • Now ready for in-memory processing
  • 35. info@rittmanmead.com www.rittmanmead.com @rittmanmead 35 Choice of Persistent Stores for Property Graph Model
  • 36. info@rittmanmead.com www.rittmanmead.com @rittmanmead 36 Accompanied by PGX : Graph Analysis Framework
  • 37. (so that’s why I used Hadoop)
  • 38. info@rittmanmead.com www.rittmanmead.com @rittmanmead 38 Calculating Most Influential Tweeters Using Page Rank vOutput="/tmp/mygraph.opv" eOutput="/tmp/mygraph.ope" OraclePropertyGraphUtils.exportFlatFiles(opg, vOutput, eOutput, 2, false); session = Pgx.createSession("session-id-1"); analyst = session.createAnalyst(); graph = session.readGraphWithProperties(opg.getConfig()); rank = analyst.pagerank(graph, 0.001, 0.85, 100); rank.getTopKValues(5); ==>PgxVertex with ID 1=0.13885623487462861 ==>PgxVertex with ID 3=0.08686102641801993 ==>PgxVertex with ID 101=0.06757752513733056 ==>PgxVertex with ID 6=0.06743774001139484 ==>PgxVertex with ID 37=0.0481517609757462 ==>PgxVertex with ID 17=0.042234536894569276 ==>PgxVertex with ID 29=0.04109794527311113 ==>PgxVertex with ID 65=0.032058649698044187 ==>PgxVertex with ID 15=0.023075360575195276 ==>PgxVertex with ID 93=0.019265959946506813 • Initiates an in-memory analytics session • Runs Page Rank algorithm to determine influencers • Outputs top ten vertices (users) Top 10 vertices
  • 39. info@rittmanmead.com www.rittmanmead.com @rittmanmead 39 Calculating Most Influential Tweeters Using Page Rank v1=opg.getVertex(1l); v2=opg.getVertex(3l); v3=opg.getVertex(101l); v4=opg.getVertex(6l); v5=opg.getVertex(37l); v6=opg.getVertex(17l); v7=opg.getVertex(29l); v8=opg.getVertex(65l); v9=opg.getVertex(15l); v10=opg.getVertex(93l); System.out.println("Top 10 influencers: n " + v1.getProperty("name") + "n " + v2.getProperty("name") + "n " + v3.getProperty("name") + "n " + v4.getProperty("name") + "n " + v5.getProperty("name") + "n " + v6.getProperty("name") + "n " + v7.getProperty("name") + "n " + v8.getProperty("name") + "n " + v9.getProperty("name") + "n " + v10.getProperty("name")); Top 10 influencers: markrittman rmoff rittmanmead mRainey JeromeFr Nephentur borkur BIExperte i_m_dave dw_pete Note : Over a 3-day period in May 2015 Twitter users referencing RM website + staff accounts
  • 40. info@rittmanmead.com www.rittmanmead.com @rittmanmead 40 •Open source graph analysis tool with Oracle Big Data Graph and Spatial Plug-in •Available shortly from Oracle, connects to Oracle NoSQL or HBase and runs Page Rank etc •Alternative to command-line for In-Memory Analytics once base graph created Visualising Property Graphs with Cityscape
  • 41. info@rittmanmead.com www.rittmanmead.com @rittmanmead 41 Calculating Top 10 Users using Page Rank Algorithm
  • 42. info@rittmanmead.com www.rittmanmead.com @rittmanmead 42 Visualising the Social Graph Around Particular Users
  • 43. info@rittmanmead.com www.rittmanmead.com @rittmanmead 43 Detecting Clusters (Communities)
  • 44. info@rittmanmead.com www.rittmanmead.com @rittmanmead 44 Calculating Shortest Path Between Users
  • 45. info@rittmanmead.com www.rittmanmead.com @rittmanmead 45 Conclusions, and Further Reading •Tools such as OBIEE are great for understanding what (counts, page views, popular items) •Oracle Big Data Discovery can be useful for understanding “why?” (sentiment, terms etc) •Graph Analysis can help answer “who”? •Who are our audience? What are our communities? Who are their important influencers? •Oracle Big Data Graph and Spatial can answer these questions to “big data” scale •Articles on the Rittman Mead Blog ‣http://www.rittmanmead.com/category/oracle-big-data-appliance/ ‣http://www.rittmanmead.com/category/big-data/ ‣http://www.rittmanmead.com/category/oracle-big-data-discovery/ •Rittman Mead offer consulting, training and managed services for Oracle Big Data ‣http://www.rittmanmead.com/bigdata
  • 46. info@rittmanmead.com www.rittmanmead.com @rittmanmead Oracle Big Data Spatial & Graph
 Social Media Analysis - Case Study Mark Rittman, CTO, Rittman Mead OGh SQL Celebration Day, Netherlands, June 2016