Big Data beyond Apache Hadoop - How to integrate ALL your Data
by Kai Waehner, IT Consultant at Talend on Apr 26, 2013
- 2,195 views
Big data represents a significant paradigm shift in enterprise technology. Big data radically changes the nature of the data management profession as it introduces new concerns about the volume, ...
Big data represents a significant paradigm shift in enterprise technology. Big data radically changes the nature of the data management profession as it introduces new concerns about the volume, velocity and variety of corporate data.
Apache Hadoop is the open source defacto standard for implementing big data solutions on the Java platform. Hadoop consists of its kernel, MapReduce, and the Hadoop Distributed Filesystem (HDFS). A challenging task is to send all data to Hadoop for processing and storage (and then get it back to your application later), because in practice data comes from many different applications (SAP, Salesforce, Siebel, etc.) and databases (File, SQL, NoSQL), uses different technologies and concepts for communication (e.g. HTTP, FTP, RMI, JMS), and consists of different data formats using CSV, XML, binary data, or other alternatives.
This session shows the powerful combination of Apache Hadoop and Apache Camel to solve this challenging task. Learn how to use every thinkable data with Hadoop – without plenty of complex or redundant boilerplate code. Besides supporting the integration of all different technologies and data formats, Apache Camel also offers an easy, standardized DSL to transform, split or filter incoming data using the Enterprise Integration Patterns (EIP). Therefore, Apache Hadoop and Apache Camel are a perfect match for processing big data on the Java platform.
- Total Views
- Views on SlideShare
- Embed Views