How Concur Uses Big Data to 
Get You to Tableau Conference (TC) On Time 
Denny Lee 
Senior Director, Data Sciences Engineering
About Concur 
What do we do? 
• Leading provider of spend management solutions and 
services (Travel, Invoice, TripIt, etc.) in the world 
• Global customer base of 20,000 clients and 25 million 
users 
• Processing more than $50 Billion in Travel & Expense 
(T&E) spend each year
About the Speaker 
Who Am I? 
• Long time SQL Server BI 
guy (24TB Yahoo! Cube) 
• Project Isotope (Hadoop 
on Windows and Azure) 
• At Concur, helping with 
Big Data and Data 
Sciences
Is Big Data …. 
The most overused buzzword today? 
An actual useful framework? 
Yes!
Consolidate Visualize Insight Recommend 
TechBar 
Themes
Consolidate
Data sources: BTS, Invoice, Web Analytics, Expense, Travel, Weather
A long time ago… 
• We started using Hadoop because 
• It was free 
• i.e. Didn’t want to pay for a big data warehouse 
• Could slowly extract from hundreds of relational data 
sources, consolidate it, and query it 
• We were not thinking about advanced analytics 
• We were thinking …. “cheaper reporting” 
• We have some hardware lying around … let’s cobble it 
together and now we have reports
But why Hadoop? 
• Even with primarily relational systems, it involved 
hundreds of sources 
• Getting Tableau or any BI tool to connect to so many 
sources is … not fun 
• More times than not, we needed to understand a subset 
or aggregate of this data - not all of the data! 
• Can use Pig to process, extract, filter the data 
• Can use Hive - a SQL-like query language - to query my 
data
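The subset-then-aggregate idea above can be sketched in plain Python, standing in for what a Pig script or Hive query would do on the cluster. The record layout (`year`, `origin`), the two regional source lists, and the function name `itineraries_by_origin` are all hypothetical, chosen only to illustrate consolidating several sources, filtering, and aggregating before a BI tool ever sees the data.

```python
from collections import Counter
from itertools import chain

# Hypothetical rows from two separate relational sources (illustrative only).
travel_us = [
    {"year": 2013, "origin": "SEA"},
    {"year": 2012, "origin": "SFO"},
]
travel_eu = [
    {"year": 2013, "origin": "SEA"},
    {"year": 2013, "origin": "SJC"},
]

def itineraries_by_origin(sources, year):
    """Consolidate all sources, keep one year, and count rows per origin."""
    consolidated = chain.from_iterable(sources)               # consolidate
    subset = (row for row in consolidated if row["year"] == year)  # filter
    return Counter(row["origin"] for row in subset)           # aggregate

summary = itineraries_by_origin([travel_us, travel_eu], 2013)
print(summary.most_common())
```

Only the small `summary` result would need to reach the reporting tool, rather than every row from every source.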
Sources: Invoice, Expense, Travel
Visualize
demo 
Querying Hive via Hue and Tableau 
to understand Air Traffic patterns
Connecting to Hive using Hue - can query using HiveQL, a SQL-like query language
Install the Cloudera Hive driver, connect to Cloudera Hadoop, fill in the 
connection details, and you’re connected to Hive
Connecting Tableau to Hive may take a very long time in Live mode
Instead, choose Extract, which brings the data across from Hive so that your 
queries run within Tableau. Note, the extraction will take a long time too!
Now that the data is in Tableau, I can pivot, slice, and filter at the speed of thought!
Can quickly switch to map mode and determine where most itineraries are from in 2013
If you’re expecting Hadoop 
or Hive to be fast….
Evolution of Hive 
• Hive, built originally by Facebook, placed 
a SQL-like query language in front of 
Hadoop MapReduce. 
• Has its flexibility but also its overhead 
and complexity 
• Apache community working on Hive 
Stinger project to advance Hive 
including DAG scheduler, optimized 
columnar format, and improved engine 
semantics
Insight
demo 
Querying Impala via Hue and Tableau 
to understand Air Departure Delays
Query airport information using Impala, sort of looks like Hive so far…
But notice the query runs significantly faster in Impala!
Not just limit 10 types of queries but ones that involve more complicated 
where clauses
And quickly chart out the results - e.g. highest airport in Taiwan is 
Sun Moon Lake
Or even quickly map out the airport locations on a map to see that Sun Moon 
Lake Airport is in the center of Taiwan
And Impala is not just for Hue 
- it’s even better in Tableau
Now I can connect to my data live and have fast queries returned to Tableau
After quickly modifying the data within Tableau, can discover the number of flight 
delays to Seattle, and note that San Jose has the fewest delays
Why Impala? 
• Focus is to speed up BI queries 
• Analogous to relational BI tools except 
now I can do this against a distributed 
cluster 
• Similar to relational BI tools in that, being 
special purpose, it can do a lot of 
optimizations to improve speed 
• But note this demo ran against the 
same Hive table, with the data stored in 
Hadoop
demo 
Leveraging AtScale to build models on 
Impala and slicing them in Tableau
Using AtScale to build up a dimensional model based on the data that is 
stored within Impala / Hive
Slice and filter the Impala model using Tableau 
For more info, check out: http://atscale.com/
Data Extraction 
How to query multiple endpoints or multiple data sources? 
Set up a whole bunch of VMs and have someone connect to 
each one and execute get commands?
Optimizing Data Extraction 
Use Hadoop streaming to execute a Python script that performs the get 
Hadoop will generate a task for each API get call and then execute 
them across all the nodes in the cluster in parallel
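A minimal sketch of what such a streaming script might look like, assuming the job input is a text file with one endpoint URL per line. Hadoop streaming feeds input lines to the script on stdin and collects tab-separated key/value lines from stdout; the function names `fetch_size` and `run_mapper`, the choice to emit the response size, and the example URL are illustrative assumptions, not details from the talk.

```python
import sys
from urllib.request import urlopen

def fetch_size(url, timeout=10):
    """Perform an HTTP GET and return the response body length in bytes."""
    with urlopen(url, timeout=timeout) as response:
        return len(response.read())

def run_mapper(lines, fetch=fetch_size, out=sys.stdout):
    """Streaming-style mapper: read one URL per line, write 'url<TAB>value'."""
    for line in lines:
        url = line.strip()
        if not url:
            continue  # skip blank input lines
        try:
            out.write(f"{url}\t{fetch(url)}\n")
        except (OSError, ValueError) as err:
            # Network failures (URLError is an OSError) and malformed URLs
            # are reported per line instead of failing the whole task.
            out.write(f"{url}\tERROR {err}\n")

# Local dry run with a stub in place of the real GET (hypothetical URL).
# In an actual streaming job the script would instead call run_mapper(sys.stdin).
run_mapper(["https://api.example.com/a\n"], fetch=lambda url: 42)
```

Keeping the fetch function injectable makes the mapper easy to exercise locally before submitting it to the cluster.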
Recommend
TechBar 
Quick Primer on Apache Spark
What is Apache Spark? 
Fast and general cluster computing system 
interoperable with Hadoop 
Improves efficiency through: 
»In-memory computing primitives 
»General computation graphs 
Improves usability through: 
»Rich APIs in Scala, Java, Python 
»Interactive shell 
Up to 100× faster 
(2-10× on disk) 
2-5× less code
Project History 
Started in 2009, open sourced 2010 
30+ companies now contributing code 
»Databricks, Yahoo!, Intel, Adobe, Cloudera, Bizo, 
… 
One of the largest communities in big data
A General Stack 
Spark 
• Spark Streaming: real-time 
• Shark: SQL 
• GraphX: graph 
• MLlib: machine learning 
• …
demo 
Applying Spark for Recommendations
Starbucks Store #3313 
601 108th Ave NE 
Bellevue, WA (425) 646-9602 
------------------------------- 
Chk 713452 
05/14/2014 11:04 AM 
1961558 Drawer: 1 Reg: 1 
------------------------------- 
Bacon Art Brkfst 3.45 
Warmed 
T1 Latte 2.70 
Triple 1.50 
Soy 0.60 
Gr Vanilla Mac 4.15 
Reload Card 50.00 
AMEX $50.00 
XXXXXXXXXXXXXXXXXX1004 
SBUX Card $13.56 
SUBTOTAL $62.40 
New Caffe Espresso 
Frappuccino(R) Blended beverage 
Our Signature 
Frappuccino(R) roast coffee and 
fresh milk, blended with ice. 
Topped with our new espresso 
whipped cream and new 
Italian roast drizzle 
Expense Categorization 
One of my receipts that I had OCRed 
One of the issues we’re trying to solve 
is to auto-categorize this, so how 
can we do this? 
Below is a simplistic solution using 
WordCount 
Note, a real solution should involve 
machine learning algorithms
Spark assembly has been built with Hive, including Datanucleus jars on 
classpath 
Welcome to 
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.1.0
      /_/
Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_45) 
Type in expressions to have them evaluated. 
Type :help for more information. 
2014-09-07 22:31:21.064 java[1871:15527] Unable to load realm info from 
SCDynamicStore 
14/09/07 22:31:21 WARN NativeCodeLoader: Unable to load native-hadoop library 
for your platform... using builtin-java classes where applicable 
Spark context available as sc. 
scala> val receipt = sc.textFile("/usr/local/Cellar/workspace/data/receipt/receipt.txt") 
receipt: org.apache.spark.rdd.RDD[String] = 
/usr/local/Cellar/workspace/data/receipt/receipt.txt MappedRDD[1] at textFile 
at <console>:12 
scala> receipt.count 
res0: Long = 30
scala> val words = receipt.flatMap(_.split(" ")) 
words: org.apache.spark.rdd.RDD[String] = FlatMappedRDD[2] at flatMap at 
<console>:14 
scala> words.count 
res1: Long = 161 
scala> words.distinct.count 
res2: Long = 72 
scala> val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _).map{case(x,y) => (y,x)}.sortByKey(false).map{case(i,j) => (j, i)} 
wordCounts: org.apache.spark.rdd.RDD[(String, Int)] = MappedRDD[12] at map at 
<console>:16 
scala> wordCounts.take(12) 
res5: Array[(String, Int)] = Array(("",82), (with,2), 
(Card,2), (new,2), (------------------------------- 
,2), (Frappuccino(R),2), (roast,2), (1,2), (and,2), 
(New,1), (Topped,1), (Starbucks,1))
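The `("",82)` entry at the top of the output above most likely comes from splitting on a single space, which yields empty tokens for runs of spaces and blank lines; `new` and `New` are also counted separately. A plain-Python sketch (not the Spark API) of a tidier word count, followed by a naive keyword vote to pick a category, is shown below. The `CATEGORY_KEYWORDS` table and both function names are hypothetical, and as the slide notes, a real solution should involve machine learning algorithms.

```python
from collections import Counter

# Hypothetical keyword-to-category table, for illustration only.
CATEGORY_KEYWORDS = {
    "Meals": {"latte", "espresso", "coffee", "frappuccino(r)", "brkfst"},
    "Lodging": {"hotel", "room", "nights"},
}

def word_counts(lines):
    """Count words across lines, ignoring empty tokens and letter case."""
    counts = Counter()
    for line in lines:
        # str.split() with no argument splits on any run of whitespace,
        # so consecutive spaces and blank lines produce no empty tokens.
        counts.update(word.lower() for word in line.split())
    return counts

def categorize(lines):
    """Return the category whose keywords occur most often, or None."""
    counts = word_counts(lines)
    scores = {
        category: sum(counts[keyword] for keyword in keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

receipt_lines = [
    "T1 Latte 2.70",
    "New Caffe Espresso",
    "Frappuccino(R) roast coffee and",
    "",
]
print(categorize(receipt_lines))
```

A receipt with no matching keywords returns `None` rather than a guessed category.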
Still beta, but can connect from Tableau to SparkSQL using Shark driver
Can / will be able to connect to this SparkSQL live
Quick view of Android vs. iOS mobile sessions
SparkSQL - What’s Next? 
• Currently makes use of Hive code-base 
• Major focus for 1.2 
• Pluggable external datasources 
• Easier access through pure SQL 
interface 
• Access things like JSON tables 
through SQL?
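To make the "JSON tables through SQL" idea concrete, here is a small plain-Python sketch that loads JSON records into a table and queries them with SQL. It uses the standard-library `sqlite3` module purely as a stand-in SQL engine and is not SparkSQL's interface; the `mobile_sessions` table, its columns, and the sample records are hypothetical.

```python
import json
import sqlite3

# Hypothetical JSON-lines session log (illustrative data only).
raw = """
{"platform": "iOS", "sessions": 120}
{"platform": "Android", "sessions": 80}
{"platform": "iOS", "sessions": 30}
"""

# Parse one JSON object per non-empty line.
records = [json.loads(line) for line in raw.splitlines() if line.strip()]

# Expose the parsed records as a SQL table in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mobile_sessions (platform TEXT, sessions INTEGER)")
conn.executemany(
    "INSERT INTO mobile_sessions VALUES (:platform, :sessions)", records
)

# Plain SQL over what started out as JSON.
totals = conn.execute(
    "SELECT platform, SUM(sessions) FROM mobile_sessions "
    "GROUP BY platform ORDER BY platform"
).fetchall()
print(totals)
```

The appeal of a pluggable data source is that the load step disappears and the query runs directly against the JSON files.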
Consolidate Visualize Insight Recommend
Invite 
• Pacific Northwest Cloudera User Group 
• http://bit.ly/1uFD6vJ 
• Doug Cutting, Hadoop Co-Creator, will be speaking at 
Disney on 9/24 
• Seattle Spark Meetup 
• http://bit.ly/1q4Z0Ke 
• Next sessions: 
• Deep Dive into Spark and Mesos Internals 
• Unlocking your Hadoop data with Apache Spark 
and CDH5
Q&A
How Concur uses Big Data to get you to Tableau Conference On Time

More Related Content

What's hot

Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3
Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3
Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3
Databricks
 
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
Michael Rys
 
Overview of stinger interactive query for hive
Overview of stinger   interactive query for hiveOverview of stinger   interactive query for hive
Overview of stinger interactive query for hive
David Kaiser
 
SDM (Standardized Data Management) - A Dynamic Adaptive Ingestion Frameworks ...
SDM (Standardized Data Management) - A Dynamic Adaptive Ingestion Frameworks ...SDM (Standardized Data Management) - A Dynamic Adaptive Ingestion Frameworks ...
SDM (Standardized Data Management) - A Dynamic Adaptive Ingestion Frameworks ...
DataWorks Summit
 
Show me the Money! Cost & Resource Tracking for Hadoop and Storm
Show me the Money! Cost & Resource  Tracking for Hadoop and Storm Show me the Money! Cost & Resource  Tracking for Hadoop and Storm
Show me the Money! Cost & Resource Tracking for Hadoop and Storm
DataWorks Summit/Hadoop Summit
 
Spark Streaming | Twitter Sentiment Analysis Example | Apache Spark Training ...
Spark Streaming | Twitter Sentiment Analysis Example | Apache Spark Training ...Spark Streaming | Twitter Sentiment Analysis Example | Apache Spark Training ...
Spark Streaming | Twitter Sentiment Analysis Example | Apache Spark Training ...
Edureka!
 
AWS July Webinar Series: Amazon Redshift Reporting and Advanced Analytics
AWS July Webinar Series: Amazon Redshift Reporting and Advanced AnalyticsAWS July Webinar Series: Amazon Redshift Reporting and Advanced Analytics
AWS July Webinar Series: Amazon Redshift Reporting and Advanced Analytics
Amazon Web Services
 
B3 - Business intelligence apps on aws
B3 - Business intelligence apps on awsB3 - Business intelligence apps on aws
B3 - Business intelligence apps on aws
Amazon Web Services
 
Introduction to Hadoop, HBase, and NoSQL
Introduction to Hadoop, HBase, and NoSQLIntroduction to Hadoop, HBase, and NoSQL
Introduction to Hadoop, HBase, and NoSQLNick Dimiduk
 
Spark SQL Tutorial | Spark Tutorial for Beginners | Apache Spark Training | E...
Spark SQL Tutorial | Spark Tutorial for Beginners | Apache Spark Training | E...Spark SQL Tutorial | Spark Tutorial for Beginners | Apache Spark Training | E...
Spark SQL Tutorial | Spark Tutorial for Beginners | Apache Spark Training | E...
Edureka!
 
Spark Hadoop Tutorial | Spark Hadoop Example on NBA | Apache Spark Training |...
Spark Hadoop Tutorial | Spark Hadoop Example on NBA | Apache Spark Training |...Spark Hadoop Tutorial | Spark Hadoop Example on NBA | Apache Spark Training |...
Spark Hadoop Tutorial | Spark Hadoop Example on NBA | Apache Spark Training |...
Edureka!
 
Amazon Redshift in Action: Enterprise, Big Data, and SaaS Use Cases (DAT205) ...
Amazon Redshift in Action: Enterprise, Big Data, and SaaS Use Cases (DAT205) ...Amazon Redshift in Action: Enterprise, Big Data, and SaaS Use Cases (DAT205) ...
Amazon Redshift in Action: Enterprise, Big Data, and SaaS Use Cases (DAT205) ...
Amazon Web Services
 
Hands-On with U-SQL and Azure Data Lake Analytics (ADLA)
Hands-On with U-SQL and Azure Data Lake Analytics (ADLA)Hands-On with U-SQL and Azure Data Lake Analytics (ADLA)
Hands-On with U-SQL and Azure Data Lake Analytics (ADLA)
Jason L Brugger
 
Getting to 1.5M Ads/sec: How DataXu manages Big Data
Getting to 1.5M Ads/sec: How DataXu manages Big DataGetting to 1.5M Ads/sec: How DataXu manages Big Data
Getting to 1.5M Ads/sec: How DataXu manages Big Data
Qubole
 
Serverless Analytics with Amazon Redshift Spectrum, AWS Glue, and Amazon Quic...
Serverless Analytics with Amazon Redshift Spectrum, AWS Glue, and Amazon Quic...Serverless Analytics with Amazon Redshift Spectrum, AWS Glue, and Amazon Quic...
Serverless Analytics with Amazon Redshift Spectrum, AWS Glue, and Amazon Quic...
Amazon Web Services
 
Qubole hadoop-summit-2013-europe
Qubole hadoop-summit-2013-europeQubole hadoop-summit-2013-europe
Qubole hadoop-summit-2013-europeJoydeep Sen Sarma
 
Xldb2011 tue 0940_facebook_realtimeanalytics
Xldb2011 tue 0940_facebook_realtimeanalyticsXldb2011 tue 0940_facebook_realtimeanalytics
Xldb2011 tue 0940_facebook_realtimeanalyticsliqiang xu
 
Apache spark
Apache sparkApache spark
Apache spark
TEJPAL GAUTAM
 
High-Performance Advanced Analytics with Spark-Alchemy
High-Performance Advanced Analytics with Spark-AlchemyHigh-Performance Advanced Analytics with Spark-Alchemy
High-Performance Advanced Analytics with Spark-Alchemy
Databricks
 
Intro to Amazon Redshift Spectrum: Quickly Query Exabytes of Data in S3 - Jun...
Intro to Amazon Redshift Spectrum: Quickly Query Exabytes of Data in S3 - Jun...Intro to Amazon Redshift Spectrum: Quickly Query Exabytes of Data in S3 - Jun...
Intro to Amazon Redshift Spectrum: Quickly Query Exabytes of Data in S3 - Jun...
Amazon Web Services
 

What's hot (20)

Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3
Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3
Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3
 
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
 
Overview of stinger interactive query for hive
Overview of stinger   interactive query for hiveOverview of stinger   interactive query for hive
Overview of stinger interactive query for hive
 
SDM (Standardized Data Management) - A Dynamic Adaptive Ingestion Frameworks ...
SDM (Standardized Data Management) - A Dynamic Adaptive Ingestion Frameworks ...SDM (Standardized Data Management) - A Dynamic Adaptive Ingestion Frameworks ...
SDM (Standardized Data Management) - A Dynamic Adaptive Ingestion Frameworks ...
 
Show me the Money! Cost & Resource Tracking for Hadoop and Storm
Show me the Money! Cost & Resource  Tracking for Hadoop and Storm Show me the Money! Cost & Resource  Tracking for Hadoop and Storm
Show me the Money! Cost & Resource Tracking for Hadoop and Storm
 
Spark Streaming | Twitter Sentiment Analysis Example | Apache Spark Training ...
Spark Streaming | Twitter Sentiment Analysis Example | Apache Spark Training ...Spark Streaming | Twitter Sentiment Analysis Example | Apache Spark Training ...
Spark Streaming | Twitter Sentiment Analysis Example | Apache Spark Training ...
 
AWS July Webinar Series: Amazon Redshift Reporting and Advanced Analytics
AWS July Webinar Series: Amazon Redshift Reporting and Advanced AnalyticsAWS July Webinar Series: Amazon Redshift Reporting and Advanced Analytics
AWS July Webinar Series: Amazon Redshift Reporting and Advanced Analytics
 
B3 - Business intelligence apps on aws
B3 - Business intelligence apps on awsB3 - Business intelligence apps on aws
B3 - Business intelligence apps on aws
 
Introduction to Hadoop, HBase, and NoSQL
Introduction to Hadoop, HBase, and NoSQLIntroduction to Hadoop, HBase, and NoSQL
Introduction to Hadoop, HBase, and NoSQL
 
Spark SQL Tutorial | Spark Tutorial for Beginners | Apache Spark Training | E...
Spark SQL Tutorial | Spark Tutorial for Beginners | Apache Spark Training | E...Spark SQL Tutorial | Spark Tutorial for Beginners | Apache Spark Training | E...
Spark SQL Tutorial | Spark Tutorial for Beginners | Apache Spark Training | E...
 
Spark Hadoop Tutorial | Spark Hadoop Example on NBA | Apache Spark Training |...
Spark Hadoop Tutorial | Spark Hadoop Example on NBA | Apache Spark Training |...Spark Hadoop Tutorial | Spark Hadoop Example on NBA | Apache Spark Training |...
Spark Hadoop Tutorial | Spark Hadoop Example on NBA | Apache Spark Training |...
 
Amazon Redshift in Action: Enterprise, Big Data, and SaaS Use Cases (DAT205) ...
Amazon Redshift in Action: Enterprise, Big Data, and SaaS Use Cases (DAT205) ...Amazon Redshift in Action: Enterprise, Big Data, and SaaS Use Cases (DAT205) ...
Amazon Redshift in Action: Enterprise, Big Data, and SaaS Use Cases (DAT205) ...
 
Hands-On with U-SQL and Azure Data Lake Analytics (ADLA)
Hands-On with U-SQL and Azure Data Lake Analytics (ADLA)Hands-On with U-SQL and Azure Data Lake Analytics (ADLA)
Hands-On with U-SQL and Azure Data Lake Analytics (ADLA)
 
Getting to 1.5M Ads/sec: How DataXu manages Big Data
Getting to 1.5M Ads/sec: How DataXu manages Big DataGetting to 1.5M Ads/sec: How DataXu manages Big Data
Getting to 1.5M Ads/sec: How DataXu manages Big Data
 
Serverless Analytics with Amazon Redshift Spectrum, AWS Glue, and Amazon Quic...
Serverless Analytics with Amazon Redshift Spectrum, AWS Glue, and Amazon Quic...Serverless Analytics with Amazon Redshift Spectrum, AWS Glue, and Amazon Quic...
Serverless Analytics with Amazon Redshift Spectrum, AWS Glue, and Amazon Quic...
 
Qubole hadoop-summit-2013-europe
Qubole hadoop-summit-2013-europeQubole hadoop-summit-2013-europe
Qubole hadoop-summit-2013-europe
 
Xldb2011 tue 0940_facebook_realtimeanalytics
Xldb2011 tue 0940_facebook_realtimeanalyticsXldb2011 tue 0940_facebook_realtimeanalytics
Xldb2011 tue 0940_facebook_realtimeanalytics
 
Apache spark
Apache sparkApache spark
Apache spark
 
High-Performance Advanced Analytics with Spark-Alchemy
High-Performance Advanced Analytics with Spark-AlchemyHigh-Performance Advanced Analytics with Spark-Alchemy
High-Performance Advanced Analytics with Spark-Alchemy
 
Intro to Amazon Redshift Spectrum: Quickly Query Exabytes of Data in S3 - Jun...
Intro to Amazon Redshift Spectrum: Quickly Query Exabytes of Data in S3 - Jun...Intro to Amazon Redshift Spectrum: Quickly Query Exabytes of Data in S3 - Jun...
Intro to Amazon Redshift Spectrum: Quickly Query Exabytes of Data in S3 - Jun...
 

Viewers also liked

Concur Automated Travel and Expense Management
Concur Automated Travel and Expense ManagementConcur Automated Travel and Expense Management
Concur Automated Travel and Expense Management
Net at Work
 
Concur Discovers the True Value of Data
Concur Discovers the True Value of DataConcur Discovers the True Value of Data
Concur Discovers the True Value of Data
Cloudera, Inc.
 
Intro to Concur Presentation
Intro to Concur PresentationIntro to Concur Presentation
Intro to Concur PresentationKalen Flanagan
 
Concur integration with PI solution pack
Concur integration with PI solution pack Concur integration with PI solution pack
Concur integration with PI solution pack
SatyaSuman Lakkimsetty
 
Concur vs SAP on premise Travel Management
Concur vs SAP on premise Travel ManagementConcur vs SAP on premise Travel Management
Concur vs SAP on premise Travel Management
Sven Ringling
 
Concur Travel and Expense
Concur Travel and ExpenseConcur Travel and Expense
Concur Travel and Expensejamielynch8
 
Acquis Consulting - Harnessing The Power of Travel & Entertainment Data
Acquis Consulting - Harnessing The Power of Travel & Entertainment DataAcquis Consulting - Harnessing The Power of Travel & Entertainment Data
Acquis Consulting - Harnessing The Power of Travel & Entertainment Data
SAP Concur
 
Concur and adp a unified vision
Concur and adp   a unified visionConcur and adp   a unified vision
Concur and adp a unified visionElliot Lazarus
 
Concur Case Study_Eloqua_2011
Concur Case Study_Eloqua_2011Concur Case Study_Eloqua_2011
Concur Case Study_Eloqua_2011Greg Forrest
 
Flyer concurforce sales
Flyer concurforce salesFlyer concurforce sales
Flyer concurforce salesElliot Lazarus
 
DMAIC Analysis on Purdue Travel Procedure CONCUR User Integration Improvement...
DMAIC Analysis on Purdue Travel Procedure CONCUR User Integration Improvement...DMAIC Analysis on Purdue Travel Procedure CONCUR User Integration Improvement...
DMAIC Analysis on Purdue Travel Procedure CONCUR User Integration Improvement...
J D
 
Data first with Tableau [FutureStack16]
Data first with Tableau [FutureStack16]Data first with Tableau [FutureStack16]
Data first with Tableau [FutureStack16]
New Relic
 
Tableau Suite Analysis
Tableau Suite Analysis Tableau Suite Analysis
Tableau Suite Analysis
Kymberly Grayson-Perry
 
whitepaper_advanced_analytics_with_tableau_eng
whitepaper_advanced_analytics_with_tableau_engwhitepaper_advanced_analytics_with_tableau_eng
whitepaper_advanced_analytics_with_tableau_engS. Hanau
 
How Concur Technologies (a SAP company) Leverages Visual Testing for Localiza...
How Concur Technologies (a SAP company) Leverages Visual Testing for Localiza...How Concur Technologies (a SAP company) Leverages Visual Testing for Localiza...
How Concur Technologies (a SAP company) Leverages Visual Testing for Localiza...
Applitools
 
Steve Jarvis - Concur - Boosting Productivity For Travel & Expense Management
Steve Jarvis - Concur - Boosting Productivity For Travel & Expense ManagementSteve Jarvis - Concur - Boosting Productivity For Travel & Expense Management
Steve Jarvis - Concur - Boosting Productivity For Travel & Expense Management
Ramon Ray
 
Tableau @ Facebook - Summer 2014
Tableau @ Facebook - Summer 2014Tableau @ Facebook - Summer 2014
Tableau @ Facebook - Summer 2014
Andy Kriebel
 
Designing dashboards for performance shridhar wip 040613
Designing dashboards for performance shridhar wip 040613Designing dashboards for performance shridhar wip 040613
Designing dashboards for performance shridhar wip 040613
Mrunal Shridhar
 
My tableau
My tableauMy tableau
My tableau
Girish Srivastava
 

Viewers also liked (20)

Concur Automated Travel and Expense Management
Concur Automated Travel and Expense ManagementConcur Automated Travel and Expense Management
Concur Automated Travel and Expense Management
 
Concur Discovers the True Value of Data
Concur Discovers the True Value of DataConcur Discovers the True Value of Data
Concur Discovers the True Value of Data
 
Intro to Concur Presentation
Intro to Concur PresentationIntro to Concur Presentation
Intro to Concur Presentation
 
Concur integration with PI solution pack
Concur integration with PI solution pack Concur integration with PI solution pack
Concur integration with PI solution pack
 
Concur vs SAP on premise Travel Management
Concur vs SAP on premise Travel ManagementConcur vs SAP on premise Travel Management
Concur vs SAP on premise Travel Management
 
Concur Travel and Expense
Concur Travel and ExpenseConcur Travel and Expense
Concur Travel and Expense
 
Acquis Consulting - Harnessing The Power of Travel & Entertainment Data
Acquis Consulting - Harnessing The Power of Travel & Entertainment DataAcquis Consulting - Harnessing The Power of Travel & Entertainment Data
Acquis Consulting - Harnessing The Power of Travel & Entertainment Data
 
Concur and adp a unified vision
Concur and adp   a unified visionConcur and adp   a unified vision
Concur and adp a unified vision
 
Concur Case Study_Eloqua_2011
Concur Case Study_Eloqua_2011Concur Case Study_Eloqua_2011
Concur Case Study_Eloqua_2011
 
Flyer concurforce sales
Flyer concurforce salesFlyer concurforce sales
Flyer concurforce sales
 
Concur for mobile
Concur for mobileConcur for mobile
Concur for mobile
 
DMAIC Analysis on Purdue Travel Procedure CONCUR User Integration Improvement...
DMAIC Analysis on Purdue Travel Procedure CONCUR User Integration Improvement...DMAIC Analysis on Purdue Travel Procedure CONCUR User Integration Improvement...
DMAIC Analysis on Purdue Travel Procedure CONCUR User Integration Improvement...
 
Data first with Tableau [FutureStack16]
Data first with Tableau [FutureStack16]Data first with Tableau [FutureStack16]
Data first with Tableau [FutureStack16]
 
Tableau Suite Analysis
Tableau Suite Analysis Tableau Suite Analysis
Tableau Suite Analysis
 
whitepaper_advanced_analytics_with_tableau_eng
whitepaper_advanced_analytics_with_tableau_engwhitepaper_advanced_analytics_with_tableau_eng
whitepaper_advanced_analytics_with_tableau_eng
 
How Concur Technologies (a SAP company) Leverages Visual Testing for Localiza...
How Concur Technologies (a SAP company) Leverages Visual Testing for Localiza...How Concur Technologies (a SAP company) Leverages Visual Testing for Localiza...
How Concur Technologies (a SAP company) Leverages Visual Testing for Localiza...
 
Steve Jarvis - Concur - Boosting Productivity For Travel & Expense Management
Steve Jarvis - Concur - Boosting Productivity For Travel & Expense ManagementSteve Jarvis - Concur - Boosting Productivity For Travel & Expense Management
Steve Jarvis - Concur - Boosting Productivity For Travel & Expense Management
 
Tableau @ Facebook - Summer 2014
Tableau @ Facebook - Summer 2014Tableau @ Facebook - Summer 2014
Tableau @ Facebook - Summer 2014
 
Designing dashboards for performance shridhar wip 040613
Designing dashboards for performance shridhar wip 040613Designing dashboards for performance shridhar wip 040613
Designing dashboards for performance shridhar wip 040613
 
My tableau
My tableauMy tableau
My tableau
 

Similar to How Concur uses Big Data to get you to Tableau Conference On Time

The Future of Hadoop: A deeper look at Apache Spark
The Future of Hadoop: A deeper look at Apache SparkThe Future of Hadoop: A deeper look at Apache Spark
The Future of Hadoop: A deeper look at Apache Spark
Cloudera, Inc.
 
First Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNA
First Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNAFirst Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNA
First Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNATomas Cervenka
 
Ted Willke, Intel Labs MLconf 2013
Ted Willke, Intel Labs MLconf 2013Ted Willke, Intel Labs MLconf 2013
Ted Willke, Intel Labs MLconf 2013
MLconf
 
Big data clustering
Big data clusteringBig data clustering
Big data clustering
Jagadeesan A S
 
Cassandra Summit 2014: Apache Spark - The SDK for All Big Data Platforms
Cassandra Summit 2014: Apache Spark - The SDK for All Big Data PlatformsCassandra Summit 2014: Apache Spark - The SDK for All Big Data Platforms
Cassandra Summit 2014: Apache Spark - The SDK for All Big Data Platforms
DataStax Academy
 
SQL on Hadoop in Taiwan
SQL on Hadoop in TaiwanSQL on Hadoop in Taiwan
SQL on Hadoop in Taiwan
Treasure Data, Inc.
 
Devops kc meetup_5_20_2013
Devops kc meetup_5_20_2013Devops kc meetup_5_20_2013
Devops kc meetup_5_20_2013Aaron Blythe
 
Sparkflows - Build E2E Data Analytics Use Cases in less than 30 mins
Sparkflows - Build E2E Data Analytics Use Cases in less than 30 minsSparkflows - Build E2E Data Analytics Use Cases in less than 30 mins
Sparkflows - Build E2E Data Analytics Use Cases in less than 30 mins
sparkflows
 
Twitter with hadoop for oow
Twitter with hadoop for oowTwitter with hadoop for oow
Twitter with hadoop for oow
Gwen (Chen) Shapira
 
Hive_Pig.pptx
Hive_Pig.pptxHive_Pig.pptx
Hive_Pig.pptx
PAVANKUMARNOOKALA
 
Watching Pigs Fly with the Netflix Hadoop Toolkit (Hadoop Summit 2013)
Watching Pigs Fly with the Netflix Hadoop Toolkit (Hadoop Summit 2013)Watching Pigs Fly with the Netflix Hadoop Toolkit (Hadoop Summit 2013)
Watching Pigs Fly with the Netflix Hadoop Toolkit (Hadoop Summit 2013)
Jeff Magnusson
 
Hadoop for the Absolute Beginner
Hadoop for the Absolute BeginnerHadoop for the Absolute Beginner
Hadoop for the Absolute Beginner
Ike Ellis
 
PayPal merchant ecosystem using Apache Spark, Hive, Druid, and HBase
PayPal merchant ecosystem using Apache Spark, Hive, Druid, and HBase PayPal merchant ecosystem using Apache Spark, Hive, Druid, and HBase
PayPal merchant ecosystem using Apache Spark, Hive, Druid, and HBase
DataWorks Summit
 
Big Data and NoSQL for Database and BI Pros
Big Data and NoSQL for Database and BI ProsBig Data and NoSQL for Database and BI Pros
Big Data and NoSQL for Database and BI Pros
Andrew Brust
 
Emerging technologies /frameworks in Big Data
Emerging technologies /frameworks in Big DataEmerging technologies /frameworks in Big Data
Emerging technologies /frameworks in Big Data
Rahul Jain
 
Cost-based query optimization in Apache Hive 0.14
Cost-based query optimization in Apache Hive 0.14Cost-based query optimization in Apache Hive 0.14
Cost-based query optimization in Apache Hive 0.14
Julian Hyde
 
DevOps Columbus Meetup Kickoff - Infrastructure as Code
DevOps Columbus Meetup Kickoff - Infrastructure as CodeDevOps Columbus Meetup Kickoff - Infrastructure as Code
DevOps Columbus Meetup Kickoff - Infrastructure as Code
Michael Ducy
 
Big Data, Ingeniería de datos, y Data Lakes en AWS
Big Data, Ingeniería de datos, y Data Lakes en AWSBig Data, Ingeniería de datos, y Data Lakes en AWS
Big Data, Ingeniería de datos, y Data Lakes en AWS
javier ramirez
 
Big data week presentation
Big data week presentationBig data week presentation
Big data week presentation
Joseph Adler
 
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
Big Data Spain
 

Similar to How Concur uses Big Data to get you to Tableau Conference On Time (20)

How Concur uses Big Data to get you to Tableau Conference On Time

  • 1.
  • 2. How Concur Uses Big Data to Get You to TC On Time Denny Lee Senior Director, Data Sciences Engineering
  • 3. About Concur What do we do? • Leading provider of spend management solutions and services (Travel, Invoice, TripIt, etc.) in the world • Global customer base of 20,000 clients and 25 million users • Processing more than $50 billion in Travel & Expense (T&E) spend each year
  • 4. About the Speaker Who Am I? • Long-time SQL Server BI guy (24TB Yahoo! Cube) • Project Isotope (Hadoop on Windows and Azure) • At Concur, helping with Big Data and Data Sciences
  • 5. Is Big Data …. The most overused buzzword today? An actual useful framework? Yes!
  • 6. Consolidate Visualize Insight Recommend TechBar Themes
  • 8. BTS Invoice Web Analytics Expense Travel Weather
  • 9. A long time ago… • We started using Hadoop because • It was free • i.e. Didn’t want to pay for a big data warehouse • Could slowly extract from hundreds of relational data sources, consolidate it, and query it • We were not thinking about advanced analytics • We were thinking …. “cheaper reporting” • We have some hardware lying around … let’s cobble it together and now we have reports
  • 10. But why Hadoop? • Even with primarily relational systems, it involved hundreds of sources • Getting Tableau or any BI tool to connect to so many sources is … not fun • More often than not, we needed to understand a subset or aggregate of this data - not all of the data! • Can use Pig to process, extract, and filter the data • Can use Hive, with its SQL-like query language (HiveQL), to query my data
  • 13. demo Querying Hive via Hue and Tableau to understand Air Traffic patterns
  • 14. Connecting to Hive using Hue - can query using HiveQL, a SQL-like query language
  • 15.
  • 16.
  • 17.
  • 18. Install Cloudera Hive Driver, Connect to Cloudera Hadoop, fill in above and you’re connected to Hive
  • 19. Connecting Tableau to Hive may take a very long time in Live mode
  • 20. Instead, choose Extract, which brings the data across from Hive so that you run your queries within Tableau. Note that the extraction will take a long time too!
  • 21. Now that the data is in Tableau, I can pivot, slice, and filter at the speed of thought!
  • 22. Can quickly switch to map mode and determine where most itineraries are from in 2013
  • 23. If you’re expecting Hadoop or Hive to be fast…
  • 24. Evolution of Hive • Hive, originally built by Facebook, placed a SQL-like query language in front of Hadoop MapReduce • Has its flexibility but also its overhead and complexity • The Apache community is working on the Hive Stinger project to advance Hive, including a DAG scheduler, an optimized columnar format, and improved engine semantics
  • 26. demo Querying Impala via Hue and Tableau to understand Air Departure Delays
  • 27. Query airport information using Impala, sort of looks like Hive so far…
  • 28. But notice the query running in Impala significantly faster!
  • 29. Not just LIMIT 10 types of queries, but ones that involve more complicated WHERE clauses
  • 30. And quickly chart out the results - e.g. highest airport in Taiwan is Sun Moon Lake
  • 31. Or even quickly map out the airport locations on a map to see that Sun Moon Lake Airport is in the center of Taiwan
  • 32. And using Impala is not just for Hue - it's even better on Tableau
  • 33. Now I can connect to my data live and have fast queries returned to Tableau
  • 34. After quickly modifying the data within Tableau, can discover the number of flight delays to Seattle, and note that San Jose has the fewest delays
  • 35. Why Impala? • Focus is to speed up BI queries • Analogous to relational BI tools, except now I can do this against a distributed cluster • Like relational BI tools, it is special-purpose, so it can do a lot of optimizations to improve speed • But note this demo ran against the same Hive table, with the data stored in Hadoop
  • 36. demo Leveraging AtScale to build models on Impala and slicing them in Tableau
  • 37. Using AtScale to build up a dimensional model based on the data that is stored within Impala / Hive
  • 38. Slice and filter the Impala model using Tableau For more info, check out: http://atscale.com/
  • 39. Data Extraction How to query multiple endpoints or multiple data sources? Set up a whole bunch of VMs and have someone connect to each one and execute GET commands?
  • 40. Optimizing Data Extraction Use Hadoop Streaming to execute a Python script that performs the GET. Hadoop will generate a task for each API GET call and then execute the tasks across all the nodes in the cluster in parallel
  • 42. TechBar Quick Primer on Apache Spark
  • 43. What is Apache Spark? Fast and general cluster computing system interoperable with Hadoop • Improves efficiency through: » In-memory computing primitives » General computation graphs (up to 100× faster; 2-10× on disk) • Improves usability through: » Rich APIs in Scala, Java, Python » Interactive shell (2-5× less code)
  • 44. Project History • Started in 2009, open sourced in 2010 • 30+ companies now contributing code » Databricks, Yahoo!, Intel, Adobe, Cloudera, Bizo, … • One of the largest communities in big data
  • 45. A General Stack • Spark (core) with, on top of it: Spark Streaming (real-time), Shark (SQL), GraphX (graph), MLlib (machine learning), …
  • 46. demo Applying Spark for Recommendations
  • 47. Starbucks Store #3313 601 108th Ave NE Bellevue, WA (425) 646-9602 ------------------------------- Chk 713452 05/14/2014 11:04 AM 1961558 Drawer: 1 Reg: 1 ------------------------------- Bacon Art Brkfst 3.45 Warmed T1 Latte 2.70 Triple 1.50 Soy 0.60 Gr Vanilla Mac 4.15 Reload Card 50.00 AMEX $50.00 XXXXXXXXXXXXXXXXXX1004 SBUX Card $13.56 SUBTOTAL $62.40 New Caffe Espresso Frappuccino(R) Blended beverage Our Signature Frappuccino(R) roast coffee and fresh milk, blended with ice. Topped with our new espresso whipped cream and new Italian roast drizzle • Expense Categorization • One of my receipts that I had OCRed • One of the issues we’re trying to solve is to auto-categorize this, so how can we do this? • Below is a simplistic solution using WordCount • Note, a real solution should involve machine learning algorithms
  • 48. Spark assembly has been built with Hive, including Datanucleus jars on classpath
       Welcome to
             ____              __
            / __/__  ___ _____/ /__
           _\ \/ _ \/ _ `/ __/  '_/
          /___/ .__/\_,_/_/ /_/\_\   version 1.1.0
             /_/
       Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_45)
       Type in expressions to have them evaluated.
       Type :help for more information.
       2014-09-07 22:31:21.064 java[1871:15527] Unable to load realm info from SCDynamicStore
       14/09/07 22:31:21 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
       Spark context available as sc.
       scala> val receipt = sc.textFile("/usr/local/Cellar/workspace/data/receipt/receipt.txt")
       receipt: org.apache.spark.rdd.RDD[String] = /usr/local/Cellar/workspace/data/receipt/receipt.txt MappedRDD[1] at textFile at <console>:12
       scala> receipt.count
       res0: Long = 30
  • 49. scala> val words = receipt.flatMap(_.split(" "))
       words: org.apache.spark.rdd.RDD[String] = FlatMappedRDD[2] at flatMap at <console>:14
       scala> words.count
       res1: Long = 161
       scala> words.distinct.count
       res2: Long = 72
       scala> val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _).map{case(x,y) => (y,x)}.sortByKey(false).map{case(i,j) => (j, i)}
       wordCounts: org.apache.spark.rdd.RDD[(String, Int)] = MappedRDD[12] at map at <console>:16
       scala> wordCounts.take(12)
       res5: Array[(String, Int)] = Array(("",82), (with,2), (Card,2), (new,2), (------------------------------- ,2), (Frappuccino(R),2), (roast,2), (1,2), (and,2), (New,1), (Topped,1), (Starbucks,1))
  • 50. Still beta, but can connect from Tableau to SparkSQL using Shark driver
  • 51. Can / will be able to connect to this SparkSQL live
  • 52. Quick view of Android vs. iOS mobile sessions
  • 53. SparkSQL - What’s Next? • Currently makes use of Hive code-base • Major focus for 1.2 • Pluggable external datasources • Easier access through pure SQL interface • Access things like JSON tables through SQL?
  • 55. Invite • Pacific Northwest Cloudera User Group • http://bit.ly/1uFD6vJ • Doug Cutting, Hadoop Co-Creator, will be speaking at Disney on 9/24 • Seattle Spark Meetup • http://bit.ly/1q4Z0Ke • Next sessions: • Deep Dive into Spark and Mesos Internals • Unlocking your Hadoop data with Apache Spark and CDH5
  • 56. Q&A
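
The extraction approach on slide 40 can be sketched as a Hadoop Streaming mapper. Hadoop Streaming passes each mapper its share of the input on standard input, one record per line, and collects whatever the script writes to standard output. The sketch below assumes each input line is one URL to fetch. The names `fetch`, `run_mapper`, and `main`, the ten-second timeout, and the output layout (URL, status, body length) are illustrative assumptions and not taken from the talk.

```python
"""Hypothetical Hadoop Streaming mapper: one HTTP GET per input line."""
import sys
import urllib.request


def fetch(url, timeout=10):
    """Perform a GET and return (status, body_bytes). Assumed helper."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status, response.read()


def run_mapper(lines, fetch_fn=fetch):
    """Yield one tab-separated output record per non-empty input line."""
    for line in lines:
        url = line.strip()
        if not url:
            continue
        try:
            status, body = fetch_fn(url)
            yield f"{url}\t{status}\t{len(body)}"
        except (OSError, ValueError) as exc:
            # Network errors are OSError subclasses; malformed URLs raise
            # ValueError. Emit a marker instead of failing the whole task.
            yield f"{url}\tERROR\t{type(exc).__name__}"


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Entry point a streaming job would call: stdin in, stdout out."""
    for record in run_mapper(stdin):
        stdout.write(record + "\n")
```

To use this as a mapper, the script would call `main()` under the usual `if __name__ == "__main__":` guard and be passed to the streaming job as its mapper. How many lines each task receives depends on the job's input format settings, so getting one task per GET call takes additional configuration that this sketch does not cover.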
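
The word count on slides 48 and 49 can be tried without a cluster. The sketch below is a plain-Python analogue of the Scala session and not the speaker's code. The names `word_counts` and `top_words` and the sample lines are hypothetical. It splits on a single space, as the Scala code does, so runs of spaces produce empty tokens. That is presumably why the empty string tops the result on slide 49.

```python
from collections import Counter


def word_counts(lines):
    """Count space-separated tokens, most frequent first.

    Mirrors flatMap(_.split(" ")) -> map(x => (x, 1)) -> reduceByKey(_ + _)
    followed by a descending sort on the count.
    """
    counts = Counter()
    for line in lines:
        # split(" ") rather than split() keeps empty tokens from repeated spaces.
        counts.update(line.split(" "))
    return counts.most_common()


def top_words(lines, n=12, drop_empty=True):
    """Return the n most frequent tokens, optionally skipping empty ones."""
    ranked = word_counts(lines)
    if drop_empty:
        ranked = [(word, count) for word, count in ranked if word]
    return ranked[:n]


if __name__ == "__main__":
    # Made-up lines in the style of the receipt, for illustration only.
    sample = [
        "T1 Latte       2.70",
        "Reload Card   50.00",
        "SBUX Card    $13.56",
    ]
    for word, count in top_words(sample, n=3):
        print(word, count)
```

Dropping empty tokens, as `top_words` does by default, is the smallest step toward a usable signal. Python's `split(" ")` also keeps trailing empty tokens that Scala's `split` discards, so counts for the empty string can differ between the two. As the slide notes, real categorization would need a machine learning approach and not raw counts.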

Editor's Notes

  1. TODO: Apache incubator logo