Automic World 2015
Automating Big Data with the Hadoop Agent
Dave Kellermanns
Chief Automation Architect

Automating Big Data with the Automic Hadoop Agent


  1. Automic World 2015: Automating Big Data with the Hadoop Agent. Dave Kellermanns, Chief Automation Architect
  2. Property of Automic Software. All rights reserved.
  3. What is Big Data? Every day, we create 2.5 quintillion (18 zeroes!) bytes of data, so much that 90% of the data in the world today has been created in the last two years alone. This data comes from everywhere: sensors used to gather climate information, posts to social media sites, digital pictures and videos, purchase transaction records, and cell phone GPS signals, to name a few. Connecting all of these sources together is called the “Internet of Things”; the data itself is called Big Data. Source: Forbes.com
  4. Think you can avoid Big Data? “The Big Data technology and services market represents a fast-growing multibillion-dollar worldwide opportunity [...] that will grow at a 26.4% compound annual growth rate to $41.5 billion through 2018, or about six times the growth rate of the overall information technology market [...]” (IDC, 2015)
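
The growth figures quoted on slide 4 can be sanity-checked with a little arithmetic. The sketch below is illustrative only: the quote does not state a base year, so the five-year horizon (`YEARS = 5`) is an assumption, and the helper names are hypothetical.

```python
def implied_start_value(end_value: float, cagr: float, years: int) -> float:
    """Back out the starting value from an end value and a compound annual growth rate."""
    return end_value / (1.0 + cagr) ** years


def implied_overall_growth(cagr: float, multiple: float) -> float:
    """If one market grows `multiple` times faster than another, return the slower rate."""
    return cagr / multiple


END_VALUE_BN = 41.5  # from the slide: $41.5 billion through 2018
CAGR = 0.264         # from the slide: 26.4% compound annual growth rate
YEARS = 5            # ASSUMPTION: the quote does not give the base year

start_bn = implied_start_value(END_VALUE_BN, CAGR, YEARS)
it_growth = implied_overall_growth(CAGR, 6.0)  # "about six times" the overall IT market

print(f"Implied starting market size: ${start_bn:.1f}bn (assuming {YEARS} years)")
print(f"Implied overall IT market growth: {it_growth:.1%}")
```

Changing `YEARS` shows how sensitive the implied starting value is to the assumed horizon; the implied IT growth rate depends only on the two numbers stated on the slide.
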
  5. Why are companies focused on Big Data right now?
     • Make better, more quantitative decisions
     • Reach new levels of profit, efficiently
     • Predict with unprecedented accuracy to influence business outcomes
     • Deliver highly personalized customer experiences at massive scale
     • Make new discoveries using massive amounts of data
     • Recognize new revenue streams from digital exhaust
  6. Where does Big Data fit into the Enterprise?
  7. Simplifying user experience and procedures
     • Big data technologies must be integrated with more traditional data systems and sources
     • Efficient Dev-Test-Prod change control needs to be implemented end-to-end
     • Administration, development, operations, and analytics all need tools tailored to their roles
     • Automation is a core requirement for making these complex systems accessible. It has to be easy to use and customizable
  8. A conflict in the skillset of analysts vs. data engineers
     On one side are the people running the data platform, who work with Oozie definitions such as:

         <workflow-app xmlns="uri:oozie:workflow:0.4" name="hive-add-partition-searchevents-wf">
           <start to="hive-add-partition-searchevents" />
           <action name="hive-add-partition-searchevents" retry-max="1" retry-interval="1">
             <hive xmlns="uri:oozie:hive-action:0.4">
               <job-tracker>${jobTracker}</job-tracker>
               <name-node>${nameNode}</name-node>
               ...
               ...
               <script>add_partition_hive_searchevents_script.q</script>
               <param>YEAR=${YEAR}</param>
               <param>MONTH=${MONTH}</param>
               <param>DAY=${DAY}</param>
               <param>HOUR=${HOUR}</param>
             </hive>
             <ok to="end" />
             <error to="fail" />
           </action>

         <bundle-app name='BundleApp-LoadAndIndexTopCustomerQueries' xmlns='uri:oozie:bundle:0.2'>
           <controls>
             <kick-off-time>${jobStart}</kick-off-time>
           </controls>
           <coordinator name='CoordApp-LoadCustomerQueries' >
             <app-path>${coordAppPathLoadCustomerQueries}</app-path>
           </coordinator>
           <coordinator name='CoordApp-IndexTopQueriesES' >
             <app-path>${coordAppPathIndexTopQueriesES}</app-path>
           </coordinator>
         </bundle-app>
         ....
         <coordinator-app name="CoordApp-LoadCustomerQueries" frequency="${coord:days(1)}"
                          start="${jobStart}" end="${jobEnd}" timezone="UTC"
                          xmlns="uri:oozie:coordinator:0.2">
           ...
           <action>
             <workflow>
               <app-path>${workflowRoot}/hive-action-load-customerqueries.xml</app-path>
             </workflow>
           </action>
         </coordinator-app>
         ...
         <coordinator-app name="CoordApp-IndexTopQueriesES" frequency="${coord:days(1)}"
                          start="${jobStartIndex}" end="${jobEnd}" timezone="UTC"
                          xmlns="uri:oozie:coordinator:0.2">
           ...
           <action>
             <workflow>

     On the other side are the people wanting data.
     Automic helps to bridge the gap between the skillsets of the people who need the tool and the skillsets required to run the tool.
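
The Hive action on slide 8 receives its partition as four `<param>` values (YEAR, MONTH, DAY, HOUR). As a rough illustration of what has to be supplied for each run, here is a minimal sketch that derives those values from a timestamp. The function names are hypothetical, zero-padding of the values is an assumption (the slide does not show the expected format), and UTC is used only because the coordinators on the slide declare `timezone="UTC"`. This is not how Oozie or Automic computes them.

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET


def partition_params(ts: datetime) -> dict[str, str]:
    """Map a timezone-aware timestamp to the four partition parameters (in UTC)."""
    utc = ts.astimezone(timezone.utc)
    return {
        "YEAR": f"{utc.year:04d}",
        "MONTH": f"{utc.month:02d}",  # ASSUMPTION: zero-padded values
        "DAY": f"{utc.day:02d}",
        "HOUR": f"{utc.hour:02d}",
    }


def hive_param_elements(ts: datetime) -> list[str]:
    """Render one <param>NAME=value</param> string per partition parameter."""
    rendered = []
    for name, value in partition_params(ts).items():
        elem = ET.Element("param")
        elem.text = f"{name}={value}"
        rendered.append(ET.tostring(elem, encoding="unicode"))
    return rendered


if __name__ == "__main__":
    for line in hive_param_elements(datetime(2015, 3, 7, 5, tzinfo=timezone.utc)):
        print(line)
```

Even this small step involves decisions (time zone, padding, escaping) that an analyst who only wants the data should not need to make, which is the gap the slide describes.
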
  9. Hadoop Open Source
     “The Apache™ Hadoop® project develops open-source software for reliable, scalable, distributed computing.”
     “Open source as a development model promotes a universal access via a free license to a product's design or blueprint, and universal redistribution of that design or blueprint, including subsequent improvements to it by anyone”
  10. Many people work on Hadoop
  11. 3 Releases of the Hadoop Platform
  12. New capabilities keep on coming
  13. APIs do change constantly
  14. Configuration & Objects. © Automic. All rights reserved.
  15. Proven value for Data Automation: Improve Decisions
      • Business & Operational Intelligence; Data Warehousing; Big Data
      • Call centre performance
      • Hadoop Big Data automation
      • Data Ingestion across IaaS
      • Fast Cognos Analytics delivery
      • POS data mining, ETL & MFT
  16. Proven Value for Data Automation: Self-service platform for data scientists
      “We use Automic in our data center to define dependencies between various jobs between our data center and the cloud, and run them as ‘process flows’. Automic ensures that the right data is delivered on time to Data Scientists. This requires approximately 6,000 jobs per day.”
      Ashi Sheth, Manager of Enterprise Services, Netflix
  17. Business Benefit to Netflix: To “Give Viewers What They Want”
      • >50m subscribers
      • >40 countries
      • Petabyte-scale platform: collect hundreds of terabytes of data daily
      • Engineers build templates and workflows using ONE Automation, and enable data scientists to perform all kinds of ad hoc analysis without having to deal with the complexity of the underlying data infrastructure
      • Data Scientists perform data-driven experiments and tests on a daily basis, using Automic and many other tools
      • Recommendation Engine: the quality of recommendations improves, resulting in happy customers!
  18. eBay relies on Automic
      If Automic goes down, eBay loses 70% of its web traffic to Amazon
      – Automic automates Hadoop for eBay, which provides all of its business intelligence for optimized SEO
      – Automic moves data, schedules the MapReduce jobs, schedules the analytics, and then pushes the output to Google
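
The sequence described on slide 18 (move data, run MapReduce, run analytics, push the output) is a chain of dependent steps. A minimal sketch of how such dependencies determine execution order, using Python's standard `graphlib` module, is shown below. The step names are hypothetical labels based on the slide's wording, and this is a generic illustration, not Automic's scheduler or eBay's actual job definitions.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key depends on the steps in its set. Step names are hypothetical labels
# taken from the slide's wording, not real job names.
PIPELINE = {
    "move_data": set(),
    "run_mapreduce": {"move_data"},
    "run_analytics": {"run_mapreduce"},
    "push_output": {"run_analytics"},
}


def execution_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for position, step in enumerate(execution_order(PIPELINE), start=1):
        print(position, step)
```

Declaring dependencies rather than a fixed order means a new step can be added by stating what it waits for, and a cycle in the declarations raises `graphlib.CycleError` instead of silently running jobs out of order.
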
  19. Automating eBay Data Warehouse Platforms
      eBay DW environment
      Teradata:
      – Mozart: 2.6 PB used / 6.6 PB total
      – Martini: 1.4 PB used / 8.5 PB total
      – EDW concurrent queries: 500+
      Singularity (eBay-specific TD):
      – Vivaldi: 9.5 PB used / 16.9 PB total
      – Davinci: 2.5 PB used / 3.4 PB total
      – SG concurrent queries: 100+
      Hadoop:
      – Hadoop total: 71.5 PB used / 91.9 PB total
      – Hadoop Ares: 29.5 PB / 41.4 PB; Hadoop Apollo: 32.2 PB / 37.8 PB; Hadoop Artemis: 9.8 PB / 11.9 PB (used / total)
      – Hadoop concurrent jobs running: 1000+
      Source: http://www.slideshare.net/madananil/hadoop-at-ebay
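
To put the Hadoop storage figures on slide 19 in perspective, the short sketch below computes the utilization of each cluster and checks that the per-cluster used storage adds up to the stated total. The numbers are copied from the slide; the function and variable names are illustrative.

```python
# (used PB, total PB) per Hadoop cluster, copied from the slide.
HADOOP_CLUSTERS = {
    "Ares": (29.5, 41.4),
    "Apollo": (32.2, 37.8),
    "Artemis": (9.8, 11.9),
}
REPORTED_TOTAL = (71.5, 91.9)  # "Hadoop Total" as stated on the slide


def utilization(used_pb: float, total_pb: float) -> float:
    """Fraction of total storage that is in use."""
    return used_pb / total_pb


def summed_used(clusters: dict[str, tuple[float, float]]) -> float:
    """Sum of used storage across clusters, rounded to one decimal place."""
    return round(sum(used for used, _ in clusters.values()), 1)


if __name__ == "__main__":
    for name, (used, total) in HADOOP_CLUSTERS.items():
        print(f"{name}: {utilization(used, total):.1%} used")
    print(f"Total: {utilization(*REPORTED_TOTAL):.1%} used")
    print(f"Sum of per-cluster used storage: {summed_used(HADOOP_CLUSTERS)} PB")
```
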
  20. Automic’s Value to Big Data
      • We help our customers get out of the scripting business by using Hadoop templates to abstract the APIs away from the user
      • Current functionality can be extended by Automic and users alike, and in turn distributed via Automic’s Marketplace, so there is no need to wait for vendors to catch up and release a new Agent for new APIs (think Falcon, Ranger, Knox, Ambari, Cloudbreak, etc.)
      • Automic and its Objects are distribution-agnostic: templates work with Hortonworks, Cloudera, and MapR, and they can even help you transition between Hadoop distributions
  21. Contact: Dave Kellermanns, Chief Automation Architect, dave.kellermanns@automic.com, +1 (720) 440-2838
  22. Thank you!
