Hadoop Meets Exadata- Kerry Osborne
Published in: Technology
  • Many companies that use Hadoop in a big way still have Oracle databases sitting right next to them. Nokia: I had a meeting with a guy from Nokia a couple of weeks ago. We discussed how they were using Hadoop, and he described basically an ETL kind of setup. The HDFS cluster ingests data that is then processed by MR jobs. The aggregated data is then fed into a relational DB so the analysts can have their way with it. People have preferences for certain tools (BI tools, for example). Also, RDBMSs can be very fast for this type of access if the data is of reasonable size. Not using Flume, ??? Using it for many things, but positional data from phones was one of the main cases we discussed. Canadian NSA: they have Exadata and a Hadoop cluster, rows of racks of both.
  • Use Firefox http://192.168.9.98:7777/pls/apex/f?p=100:2:1849672391763932::NO#
  • With all the new options available, it will take some serious thought to decide which architecture makes the most sense for any given problem. I had a conversation 2 weeks ago with the Canadian NSA (CSE): completely static data set, never updated. Good for Hadoop or for HCC. HCC provides about 10x compression on their data set. So a single Exadata rack, which has a raw storage capacity of about half a petabyte, can store over 2 petabytes with normal redundancy. On the other hand, I had a conversation with Nokia about how they are using Hadoop. They have been investing heavily in the technology for a couple of years. A large part of what they do involves ingesting data produced by mobile phones. The data is typically mined by MR jobs, and aggregated data sets are then loaded into RDBMSs where analysts can use standard BI tools to do what they do. So they described it as an ETL type process.
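The capacity figure in the last note can be checked with rough arithmetic. The sketch below assumes that normal redundancy means two-way mirroring (so usable space is half of raw) and ignores free-space reserves and other overhead. The function and parameter names are illustrative and do not come from the talk.

```python
def effective_capacity_pb(raw_pb: float, mirror_copies: int, compression_ratio: float) -> float:
    """Estimate how much logical data fits on a rack.

    Assumptions (not from the talk): mirroring divides raw space by the
    number of copies, and the compression ratio applies uniformly.
    """
    usable_pb = raw_pb / mirror_copies      # space left after mirroring
    return usable_pb * compression_ratio    # logical data after compression


# Figures quoted in the note: ~0.5 PB raw, ~10x HCC compression.
capacity = effective_capacity_pb(raw_pb=0.5, mirror_copies=2, compression_ratio=10.0)
print(f"{capacity:.1f} PB")  # 2.5 PB
```

Under these assumptions the result is about 2.5 PB, which is consistent with the "over 2 petabytes" quoted in the note.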
  • Transcript

    • 1. Hadoop Meets Exadata. Presented by: Kerry Osborne. Oracle Open World – October, 2012
    • 2. whoami
      • Never Worked for Oracle
      • Worked with Oracle Since 1982 (V2)
      • Working with Exadata since early 2010
      • Work for Enkitec (www.enkitec.com)
      • (Enkitec owns a Half Rack – V2/X2)
      • (Enkitec owns a Big Data Appliance)
      • Many Exadata customers and POCs
      • Exadata Book (recently translated to Chinese)
      • Hadoop Aficionado
      • Blog: kerryosborne.oracle-guy.com
      • Twitter: @KerryOracleGuy
    • 3. Top Secret Feature of BDA
    • 4. What’s the Point?
      • Data Volumes are Increasing Rapidly
      • Cost of Processing / Storing is High
      • Something’s Gotta Give!
      • Besides – managing large quantities of data is what we do!
    • 5. Hadoop Is A Virus* (*Stolen from Orbitz)
    • 6. Google Trends
    • 7. Google Trends
    • 8. Google Trends
    • 9. Digression #1 - Big Data
      • Not My Favorite Term
      • 3 or 4 V’s
      • Value Density
      • Not the Right Tool for Every Job
    • 10. Disjointed Presentation
      • Architecture Comparison
      • Integration Discussion
      • Case Study ?
    • 11. Traditional RDBMS Architecture (diagram labels: RAC; Cache (SGA); workers; dbwr, lgwr, etc…; Block Mapper (ASM); Storage; side label "work")
    • 12. HDFS/Hadoop Architecture (diagram labels: HA ?; Job Tracker; Name Node; two nodes, each with datanode, tasktracker, workers, Storage; side label "work")
    • 13. HDFS/Hadoop Architecture (diagram labels: HA ?; Job Tracker; Block Mapper (namenode); two nodes, each with datanode, tasktracker, workers, Storage; side label "work")
    • 14. Exadata Architecture (diagram labels: RAC; workers; Cache; Block Mapper (ASM); two Storage Nodes, each with workers and Storage; side label "work")
    • 15. HDFS/Hadoop Architecture (diagram labels: HA ?; Job Tracker; Block Mapper (namenode); two nodes, each with datanode, tasktracker, workers, Storage; side label "work")
    • 16. Oracle + Hadoop Integration
    • 17. Obligatory Marketing Slide
    • 18. Integration Options: Many Ways to Skin the Cat
      • Fuse
      • Sqoop
      • Oracle Big Data Connectors
    • 19. Fuse – External Tables
    • 20. Sqoop (SQL-to-Hadoop)
      • Graduated from Incubator Status in March 2012
      • Slower (no direct path?)
      • Quest has a plug-in (oraoop)
      • Bi-Directional
    • 21. Oracle Big Data Connectors
      • Oracle Loader for Hadoop - OLH
      • Oracle Direct Connector for HDFS - ODCH
      • Oracle R Connector for Hadoop – ORHC
      • Oracle Data Integrator Application Adapter for Hadoop
      • Note: All Connectors are One Way
      • Note: All sold together for $2K per core list
    • 22. Oracle Data Integrator Application Adapter for Hadoop: ODIAAH ?
    • 23. Oracle R Connector for Hadoop (ORHC)
      • Provides ability to pull data from Oracle RDBMS
      • Provides ability to pull data from HDFS
      • Provides access to local file system
      • Not really a loader tool
      • Most useful for analysts
    • 24. Oracle Loader for Hadoop (OLH)
      • Implemented as a MapReduce job (oraloader.jar)
      • Saves CPU on DB Server
      • Can convert to Oracle datatypes
      • Can partition data and optionally sort it
      • Online – direct into Oracle tables
      • Can load into Oracle via JDBC or OCI Direct Path
      • Offline – generate preprocessed files in HDFS (DP format)
    • 25. Oracle Direct Connector for HDFS (ODCH)
      • My Favorite
      • Uses External Tables
      • Fastest
      • 12T per hour
      • Can load DP files preprocessed by OLH
      • Allows Oracle SQL to query HDFS data
      • Doesn’t require loading into Oracle
      • Pretty Cool!
      • Downside – uses DB CPU’s
    • 26. Exadoop* (*Mad Scientist Project)
    • 27. Exadoop: Unusual Situation!
      • Half Rack with 4 Spare Storage Servers
      • Exadata Cells Very Similar to BDA Servers
      • slower CPU’s
      • less memory
      • but same drives (12X3T) and IB and Flash
      • 4 Cells ≈ Mini BDA! (happy face)
    • 28. Digression #2 - BDA Stuff
    • 29. Digression #2 - BDA Stuff
    • 30. Digression #2 - BDA Stuff
    • 31. Exadoop Situation
      • Pilot Underway – but wanted more power
      • 4 Exadata Storage Servers were sitting idle
      • Suggestion was to Install Hadoop Cluster on them
      • 1st Concern was being able to Reclaim for Exadata
      • Removing Data Node from HDFS Not a Problem
      • Adding Storage to ASM Not a Problem
      • So the Decision Was Made to Move Forward
    • 32. Exadoop Set Up
      • Removed the Internal USB’s
      • Installed OEL 6.2
      • Installed CDH3
      • Loaded Some Data
      • Set Up ODCH with External Tables
    • 33. Exadoop Testing
      • Selecting Data Using External Tables was Not Very Fast
      • Quickly Determined we had Used Default 1G Network
      • Reconfigured with IB
      • Helped But Not as Much as Expected
      • Using Little CPU on Data Nodes
      • But a Single Process was Pegging a CPU on the DB
      • Added Parallelism
      • No Good, Only One Slave Active
      • Added Multiple Files to External Table Def. – Bingo!
    • 34. Exadoop Testing - Continued
      • Added Fuse Client
      • Created External Tables with Fuse
      • PX seems to work even on single files
      • Puts additional CPU load on DB server (2T/hr)
    • 35. Wrap Up: Right Tool For The Job? Maybe All the Cool Kids Are Doing It!
    • 36. Questions? Contact Information: Kerry Osborne, kerry.osborne@enkitec.com, kerryosborne.oracle-guy.com, www.enkitec.com
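
Slide 33 reports that raising the degree of parallelism did not help until more files were added to the external table definition. One way to picture that result is a toy model in which each file is handed whole to a single parallel slave, so the number of busy slaves can never exceed the number of files. This is a sketch under that assumption, not a description of Oracle's actual scheduler. The round-robin assignment, function names, and file names are all hypothetical.

```python
from collections import defaultdict


def assign_files(files, dop):
    """Hand out whole files to slaves round-robin (assumed: a file is never split)."""
    work = defaultdict(list)
    for i, name in enumerate(files):
        work[i % dop].append(name)   # slave index -> files it will read
    return dict(work)


def active_slaves(files, dop):
    """Number of slaves that received at least one file."""
    return len(assign_files(files, dop))


single_file = ["part-00000"]                        # hypothetical file names
many_files = [f"part-{i:05d}" for i in range(8)]

print(active_slaves(single_file, dop=4))  # 1
print(active_slaves(many_files, dop=4))   # 4
```

In this model the single-file case keeps one slave busy regardless of the requested parallelism, which matches the "Only One Slave Active" observation. Slide 34 notes that the Fuse-based external tables behaved differently, with PX appearing to work even on single files.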