
Talend Hadoop data integration


  1. 14/10/2011
     Using Hadoop with Talend
       Mark Chapman
       Imad Rahman
     Agenda
       - Talend Introduction
       - MapReduce and Hadoop
       - Talend Integration Suite MPx
       - Hadoop Features and TIS Components
       - How to use Talend to simplify Hadoop
       - Demo!
       - Questions & Answers
     Talend across the world…
       - Global leader in open source integration
       - Venture-backed
       - Global operations
       - Corporate Headquarters: San Francisco (Los Altos), Paris (Suresnes)
       - Operations: Orange County (Irvine), Boston (Burlington), New York (Tarrytown), London (Maidenhead), Utrecht, Nuremberg, Bonn, Munich, Milan (Bergame), Tokyo, Beijing
     © Talend 2011
  2. Customers By Industry
       - Public Sector & Education
       - Systems Integrators
       - Services & Others
       - Media & Telco
       - Software
       - Retail and Manufacturing
       - Finance & Insurance
     Market Positioning
       - Application Integration: connect applications & services
       - Master Data Management: model and master any data or domain
       - Data Quality: data profiling, data cleansing
       - Data Integration: analytics (ETL), operational data integration
     Talend Unified Platform
       - Complete unified environment supports all integration approaches: data & application
       - Uses consistent technology & leverages open standards
       - Studio: comprehensive Eclipse-based user interface
       - Repository: consolidated metadata & project information
       - Deployment: web-based deployment & scheduling
       - Execution: same containers for batch processing, message routing & services
       - Monitoring: single web-based monitoring console
  3. Background: MapReduce and Hadoop
       - MapReduce: Parallel Programming Model
           - “Divide and Conquer”
           - Many possible implementations
       - Hadoop: Open Source Java MapReduce
           - Simplified framework
       - Cloud: flexible infrastructure
           - e.g. Amazon Elastic MapReduce
     Talend Integration Suite MPx for Big Data
       - One platform
       - All sources
       - All modes
       - All scales
       - Diagram labels: Big Data (Hadoop, Filescale), High Volume (ELT), Batch ETL, Right-Time
     Talend’s Big Data Partnerships
       Partnering with Enterprise Big Data Leaders
       - Cloudera: Enterprise Hadoop
           - Talend: Open Source Cloudera Connect Partner for Data Integration
       - Greenplum: Hadoop-Powered Analytics
           - Big Data-scale Relational DB
           - Talend supports Greenplum for Hadoop and ELT
  4. Talend Integration Suite MPx
       Hadoop Features
       - Hadoop components for easy job design
       - HDFS: store, retrieve data
       - Cloudera Sqoop: Bulk ETL
       - Hive: Relational DB layer
       - Pig: In-Hadoop transformations
       Filescale Features
       - Use case: process structured flat files (e.g. logs)
       - Uses MapReduce techniques
       - Performance optimized for this use case
       - Native code, no Java
     Talend Components for Hadoop Features
       - HDFS (Hadoop Distributed File System) utilities: for loading/unloading files
       - Sqoop: utility for RDBMS extract to HDFS (Cloudera only)
       - Data Warehousing on Hadoop using Hive: a SQL-like language to query and transform data
       - Transforming Data in Hadoop using Pig: transform, normalize, clean HDFS data; very flexible
       - Talend Integration Suite MPx Hadoop Support
           - Components for HDFS and Sqoop loading/unloading
           - Components for defining Pig and Hive jobs
           - Integrate with any of Talend’s supported sources!
     Applying Talend Big Data in Enterprise
       - Landing data from operational systems
       - Transforming it before loading DW
       - Performing additional analytics directly in Hadoop
       - Keeping historical data online for queries
       - Diagram labels: DW, BI, Hadoop, Hive, Pig, Sqoop, HDFS
  5. Today’s Demo Scenario
       - View sample log data from an online game source
       - Load log data into Hive
       - Aggregate the data into 2 aggregate tables
       - Load aggregated data into RDBMS
       - Additional processing using Pig
     Show Time!
     Wrap-up
       Talend Integration Suite MPx…
       - delivers MapReduce technologies as part of a comprehensive data management solution
       - makes using Hadoop like other data integration activities
       - …is available for you to try
           - Free 2 month license to Talend Integration Suite MPx
           - Visit http://info.talend.com/hugoffer.html
  6. Questions and Answers
       Mark Chapman
       - Technical Manager
       - mchapman@talend.com
       - Skype: mchapman68
       Imad Rahman
       - Technical Presales Consultant
       - irahman@talend.com
       - Skype: imadrahman.talend
     Thank You!
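
The "Divide and Conquer" idea behind MapReduce, and the demo step that aggregates game log data into 2 aggregate tables, can be illustrated with a minimal in-memory sketch. This is not Talend-generated code and does not use the Hadoop API. The log format (day, player, game, points) and all function names are assumptions made up for illustration; the slides do not show the real log layout or the actual aggregations.

```python
from collections import defaultdict

# Hypothetical game log lines: day,player,game,points (format assumed, not from the slides).
SAMPLE_LOG = [
    "2011-10-14,alice,chess,10",
    "2011-10-14,bob,chess,5",
    "2011-10-15,alice,poker,7",
    "2011-10-15,bob,chess,3",
]


def parse(line):
    """Turn one CSV log line into a record dictionary."""
    day, player, game, points = line.split(",")
    return {"day": day, "player": player, "game": game, "points": int(points)}


def map_phase(records, key_field):
    """Map: emit one (key, value) pair per record."""
    for record in records:
        yield record[key_field], record["points"]


def shuffle(pairs):
    """Shuffle: group all values that share a key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each group of values to a single total."""
    return {key: sum(values) for key, values in groups.items()}


def aggregate(lines, key_field):
    """Run map, shuffle and reduce over the log lines for one grouping key."""
    records = [parse(line) for line in lines]
    return reduce_phase(shuffle(map_phase(records, key_field)))


# Two aggregate tables, mirroring the "2 aggregate tables" step of the demo.
points_by_player = aggregate(SAMPLE_LOG, "player")
points_by_game = aggregate(SAMPLE_LOG, "game")

if __name__ == "__main__":
    print(points_by_player)
    print(points_by_game)
```

In a real cluster the map and reduce phases run in parallel on many nodes and the shuffle moves data between them; here everything runs in one process so that only the data flow is visible. In Hive, each aggregate table would more typically be expressed as a GROUP BY query that the engine compiles to MapReduce jobs.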
