Big Data redefines Enterprise Data Warehouse @Bangalore
1. Big Data redefines
Enterprise Data Warehouse
Big Data Innovation, Unicom - Bangalore
February 2013
Raghu Kashyap
2. About Raghu Kashyap
Personal
■ Director – Data Insights Group @ Orbitz Worldwide
■ eMail: raghu.kashyap@orbitz.com
■ Twitter: @ragskashyap
■ Blog: http://kashyaps.com
■ LinkedIn: http://www.linkedin.com/in/raghukashyap/
Areas of Responsibility
■ Orbitz Services Bangalore Center Head
■ Lead Big Data team that builds out Global Data Infrastructure for
Orbitz Worldwide and provides business insights.
■ US, Europe, Australia (APAC)
9. Redefine Enterprise Data warehouse
ETL only approach
2:12 seconds
Run map reduce job
1m 14.298s
Port flat file to Greenplum using GP connector
Time: 5.077 s
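The two Hadoop-side timings above can be added up to get the end-to-end time of that path. The sketch below is only illustrative arithmetic: the helper name `parse_seconds` is hypothetical, and the "2:12 seconds" figure for the ETL only approach is deliberately left out because its units are ambiguous on the slide.

```python
import re

def parse_seconds(text):
    """Parse a duration such as '1m 14.298s' or '5.077 s' into seconds."""
    match = re.fullmatch(r"\s*(?:(\d+)\s*m)?\s*(\d+(?:\.\d+)?)\s*s\s*", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))

# Timings quoted on the slide for the Hadoop path.
map_reduce_seconds = parse_seconds("1m 14.298s")   # run map reduce job
gp_load_seconds = parse_seconds("5.077 s")         # port flat file via GP connector

total_seconds = map_reduce_seconds + gp_load_seconds
print(f"Hadoop path total: {total_seconds:.3f} s")
```

Under these figures the map reduce job dominates the total, and the load through the GP connector is a small fraction of it.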
10. Approach with Hadoop and ETL
Raw logs
Event Model
Map Reduce
ETL
Flat files
GP Connector
External Tables
Greenplum
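To make the raw logs to flat files step concrete, here is a minimal in-process sketch of a map and reduce pass that counts tag/value pairs and writes delimited rows of the kind a flat file load could consume. It does not use the Hadoop API. The log format (`tag=value` fields joined by `&`), the `|` delimiter, and all function names are assumptions made for illustration, not details taken from the slides.

```python
from collections import defaultdict

def map_log_line(line):
    """Mapper: emit ((tag, value), 1) for every tag=value field in a raw log line."""
    for field in line.strip().split("&"):
        if "=" in field:
            tag, value = field.split("=", 1)
            yield (tag, value), 1

def reduce_counts(pairs):
    """Reducer: sum the counts emitted for each (tag, value) key."""
    counts = defaultdict(int)
    for key, n in pairs:
        counts[key] += n
    return dict(counts)

def to_flat_file_rows(counts, delimiter="|"):
    """Format the reduced counts as delimited rows, sorted by key."""
    return [
        delimiter.join([tag, value, str(n)])
        for (tag, value), n in sorted(counts.items())
    ]

# Hypothetical raw log lines.
raw_logs = ["pos=ORB&page=home", "pos=ORBC&page=search", "pos=ORB&page=search"]

pairs = [pair for line in raw_logs for pair in map_log_line(line)]
rows = to_flat_file_rows(reduce_counts(pairs))
print("\n".join(rows))
```

In the approach on the slide, rows like these would be written as flat files and exposed to Greenplum through the GP connector and external tables; that load step is not shown here.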
11. Resolving database keys
tag_value_dim
id    tag   value
1     pos   ORB
2     pos   ORBC
3     pos   ORB
Greenplum tag_value_dim
id    tag   value
200   pos   ORB
157   pos   ORBC
ETL
fact
id    value
1     $ 5600
3     $ 7500
Greenplum fact
id    value
200   $ 5600
200   $ 7500
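One way to read slide 11 is that the ETL step swaps each fact row's id for the Greenplum id that shares the same (tag, value) pair. The sketch below follows that reading with in-memory stand-ins for the tables. The function name `resolve_keys` and the tuple layout are hypothetical, and which fact rows sit on the Greenplum side is an interpretation of the slide rather than something it states.

```python
# Rows copied from the slide: (id, tag, value).
source_dim = [(1, "pos", "ORB"), (2, "pos", "ORBC"), (3, "pos", "ORB")]
greenplum_dim = [(200, "pos", "ORB"), (157, "pos", "ORBC")]

# Fact rows keyed by source-side ids: (id, amount in dollars).
source_fact = [(1, 5600), (3, 7500)]

def resolve_keys(source_dim, target_dim, facts):
    """Rewrite each fact row's id to the target id with the same (tag, value)."""
    natural_by_source_id = {sid: (tag, value) for sid, tag, value in source_dim}
    target_id_by_natural = {(tag, value): tid for tid, tag, value in target_dim}
    resolved = []
    for sid, amount in facts:
        natural_key = natural_by_source_id[sid]      # e.g. ("pos", "ORB")
        resolved.append((target_id_by_natural[natural_key], amount))
    return resolved

resolved_fact = resolve_keys(source_dim, greenplum_dim, source_fact)
print(resolved_fact)  # [(200, 5600), (200, 7500)]
```

Source ids 1 and 3 both carry pos/ORB, so both fact rows resolve to the single Greenplum id 200, matching the values on the slide.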
17. MVT
Analyze behavioral and test data from our multivariate testing (MVT)
18. Lessons Learnt
Analytics using Big Data comes with a price.
Data Governance
Senior Leadership buy in
"I can't tell you the key to success, but the key to failure is
trying to please everyone." - Ed Sheeran