Data Ingestion Platform (DiP)
Co-Dev opportunity to ingest any data in near
real time
www.xavient.com
Introduction
When numerous big data sources exist in diverse
formats (the sources may often number in the
hundreds and the formats in the dozens), it can
be challenging for businesses to ingest data at a
reasonable speed and process it efficiently in
order to maintain a competitive advantage. To
that end, vendors offer software programs that
are tailored to specific computing environments
or software applications.
When data ingestion is automated, the software
used to carry out the process may also include
data preparation features to structure and
organize data so it can be analyzed on the fly or
at a later time by business intelligence (BI) and
business analytics (BA) programs.
Data Ingestion Platform (DiP) is a system to
ingest data into Big Data systems. Data can be
streamed in real time or ingested in batches.
When data is ingested in real time, each data item
is imported as it is emitted by the source. When
data is ingested in batches, data items are
imported in discrete chunks at periodic intervals
of time. An effective data ingestion process
begins by prioritizing data sources, validating
individual files and routing data items to the
correct destination.
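The difference between the two ingestion modes described above can be sketched in a few lines of Java. This is a minimal illustration, not DiP code: the class and method names are hypothetical, and batches are cut by item count rather than by time interval to keep the example deterministic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch; none of these names come from DiP itself.
class IngestModes {

    /** Real-time mode: forward each item to the sink as soon as the source emits it. */
    static void ingestStreaming(Iterable<String> source, Consumer<List<String>> sink) {
        for (String item : source) {
            sink.accept(Collections.singletonList(item));
        }
    }

    /** Batch mode: buffer items and forward them in discrete chunks of batchSize. */
    static void ingestBatched(Iterable<String> source, int batchSize,
                              Consumer<List<String>> sink) {
        List<String> buffer = new ArrayList<>();
        for (String item : source) {
            buffer.add(item);
            if (buffer.size() == batchSize) {
                sink.accept(new ArrayList<>(buffer)); // hand over a copy of the full chunk
                buffer.clear();
            }
        }
        if (!buffer.isEmpty()) {
            sink.accept(new ArrayList<>(buffer)); // flush the final partial chunk
        }
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c", "d", "e");
        ingestStreaming(items, chunk -> System.out.println("stream " + chunk));
        ingestBatched(items, 2, chunk -> System.out.println("batch  " + chunk));
    }
}
```

A production pipeline would more likely flush batches on a timer as well as on size, which matches the "periodic intervals" wording above.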
* DiP is offered as a co-development opportunity: it provides an initial
baseline and access to Big Data experts who can extend it to meet business
requirements.
“Every business is an
analytics business, every
business process is an
analytics process, and every
business user is an analytics
user”
- Gartner
Challenges Faced
Businesses want to bring data from various sources into
Hadoop or NoSQL databases for faster access in near real
time. This calls for a platform that helps build
a scalable, fault-tolerant data pipeline.
The platform should support the following:
• High-speed filtering and pattern matching
• Contextual enrichment on the fly
• Real-time KPIs, analytics, baselining and notification
• Predictive analytics
• Actions and decisions
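As a rough illustration of the first two capabilities, the sketch below filters log lines with a regular expression and enriches the matches with the name of their source. The pattern, the `source=` prefix and all names are assumptions made for this example; the deck does not describe how DiP implements filtering or enrichment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch of filter-then-enrich; not taken from DiP.
class FilterEnrichSketch {

    // Assumed rule: keep only lines that contain ERROR or WARN as a whole word.
    private static final Pattern LEVEL = Pattern.compile("\\b(ERROR|WARN)\\b");

    /** Returns the matching lines, each prefixed with the name of its source. */
    static List<String> filterAndEnrich(List<String> lines, String sourceName) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            if (LEVEL.matcher(line).find()) {              // pattern matching
                out.add("source=" + sourceName + " " + line); // contextual enrichment
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("INFO started", "ERROR disk full", "WARN slow response");
        filterAndEnrich(lines, "app-server").forEach(System.out::println);
    }
}
```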



Data Ingestion Platform (DiP)
Real-time data ingestion with the Data Ingestion Platform
(DiP) harnesses Apache Apex, Apache Flink,
Apache Spark and Apache Storm to stream data into a
lambda architecture. Apache Kafka plays a key role as the
messaging bus between the source and the streaming component.
DiP includes a UI for users who want to upload
data from their desktops; data can also be
ingested from any other source, such as cloud storage or the local file
system. The UI also helps users learn about and choose among the
streaming components while they are first
getting to know the system.
DiP Technology Stack
• Source System – Web Client
• Messaging System – Apache Kafka
• Target System – HDFS, Apache HBase, Apache Hive
• Reporting System – Apache Phoenix (CLI), Apache
Zeppelin
• Streaming APIs – Apache Apex, Apache Flink,
Apache Spark and Apache Storm
• Programming Language – Java
• IDE – Eclipse
• Build tool – Apache Maven
• Operating System – CentOS 7
DiP Features
• Any data source
• Any data type
• Easy-to-use UI
• Data visualization
• High-level APIs
• Java, Scala, client bindings
Architecture
• Flume or the client UI ingests data into Kafka queues
• The platform picks up data from the Kafka topics it subscribes to
• Four streaming APIs: Apex Streaming, Flink Streaming, Spark Streaming, Storm Streaming
(windowed aggregations to MySQL)
• Data is processed in real time or in micro-batches and lands in HBase and HDFS, with Hive external
tables over HDFS and Phoenix views reported in Zeppelin
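The deck mentions windowed aggregations without saying which window type is used. As one possible reading, the sketch below counts events per fixed-size (tumbling) window keyed by window start time. The class name, method name and the choice of a tumbling window are assumptions, and the MySQL write is left out.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of a tumbling-window count; not DiP's implementation.
class TumblingWindowSketch {

    /**
     * Groups non-negative event timestamps into consecutive windows of
     * windowMillis and returns the event count per window start time.
     */
    static Map<Long, Integer> countPerWindow(long[] timestampsMillis, long windowMillis) {
        Map<Long, Integer> counts = new TreeMap<>();
        for (long ts : timestampsMillis) {
            long windowStart = (ts / windowMillis) * windowMillis; // round down to window start
            counts.merge(windowStart, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        long[] events = {0L, 500L, 999L, 1000L, 2500L};
        System.out.println(countPerWindow(events, 1000L));
    }
}
```

In a real deployment the streaming engine's own windowing operators would normally do this work; the sketch only shows the arithmetic behind assigning an event to a window.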
[Architecture diagram: Data Ingestion Platform]
Input: GUI (XML, JSON, CSV, TXT) to Kafka broker
Apex Streaming: Kafka operator, classifier operator, file operator, HBase operator
Flink Streaming: Kafka source, map data, HDFS sink, HBase sink
Spark Streaming: Kafka stream, Spark executors
Storm topology: Kafka spout, filter bolt, HDFS bolt, HBase bolt
Output: HBase, HDFS, Hive external tables, Phoenix, Zeppelin reporting
DiP comes with an easy-to-use UI that offers the following features:
• Switch between the supported streaming engines by selecting a radio button
• Supports XML, JSON and TSV data formats
• Enter data manually in a text area for processing
• Upload files for batch processing
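Because the UI accepts several formats, an ingestion front end typically needs some way to tell them apart. The heuristic below guesses the format from the first non-whitespace character and the presence of tabs. It is an illustrative assumption rather than DiP's actual logic, and it does not validate the payload.

```java
// Hypothetical format sniffer; the names and the heuristic are not from DiP.
class FormatSniffer {

    enum Format { XML, JSON, TSV, UNKNOWN }

    /** Guesses the payload format from its first character or from embedded tabs. */
    static Format detect(String payload) {
        String s = payload.trim();
        if (s.isEmpty()) {
            return Format.UNKNOWN;
        }
        char first = s.charAt(0);
        if (first == '<') {
            return Format.XML;              // XML documents start with a tag or prolog
        }
        if (first == '{' || first == '[') {
            return Format.JSON;             // JSON object or array
        }
        if (s.indexOf('\t') >= 0) {
            return Format.TSV;              // tab-separated fields
        }
        return Format.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(detect("<event id=\"1\"/>"));
        System.out.println(detect("{\"id\": 1}"));
        System.out.println(detect("id\tname"));
    }
}
```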
DiP User Interface (Co-Dev)
Use Cases
• Sentiment analysis
• Clickstream analysis
• Log analysis
• Social media and customer sentiment
• Analyze machine and sensor data
About Xavient
"Great Ideas… Simple Solutions" is what Xavient thrives on. As a global IT consulting
and software services company, we focus on transforming business ideas into
effective solutions.
Founded in 2002, the company is led by a passionate team of experts with
a history of entrepreneurial and management success. Xavient is headquartered in
the U.S., with an international network of delivery centers located primarily in
India.
• Enabled one of the largest billing transformation initiatives in North America
• Powered one of the largest OTT platforms for video-on-demand services
• Designed one of the most engaging high-touch, high-performance retail UI/UX
• Proven expertise and unflinching focus on the digital media and communication space for over 14
years
• Partner of choice for 4 of the top 5 CSPs in the US
• Developed the live streaming solution for a weather channel, supporting next-generation
internet-connected devices
