Kamal A
Certified Big Data, SOA and Java Architect
Email: onlinetraining2011@gmail.com
Agile project scope details: user stories, Scrum cycles.
Technology stack details.
Set up a virtual-machine-based Hadoop cluster.
Install Hadoop, Hive and Sqoop.
ONLINETRAINING2011@GMAIL.COM
Analyze the payment data XML files.
Parse the XML data using a technology of choice (DOM, JAXB, etc.).
Load the data into RDBMS tables in incremental mode.
Schedule the preprocessing job to run every 30 minutes (Java scheduler Quartz for source 1: every 15 minutes; crontab for source 2: every 1 hour).
Add a multithreading / parallel processing model to handle large volumes.
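
The parsing and parallel-processing steps above could look roughly like the following. This is a minimal sketch using the JDK's built-in DOM parser and a fixed thread pool. The slides do not show the payment schema, so the `<payment>`, `<status>` and `<amount>` element names, the `id` attribute, and the `Payment` and `PaymentXmlParser` classes are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class PaymentXmlParser {

    // Hypothetical record shape; the real payment schema is not given in the slides.
    static final class Payment {
        final String id;
        final String status;
        final double amount;

        Payment(String id, String status, double amount) {
            this.id = id;
            this.status = status;
            this.amount = amount;
        }
    }

    // Parse one XML document and return every <payment> element it contains.
    static List<Payment> parse(String xml) throws Exception {
        // DocumentBuilder is not thread-safe, so each call creates its own instance.
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList nodes = doc.getElementsByTagName("payment");
        List<Payment> out = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element e = (Element) nodes.item(i);
            String status = e.getElementsByTagName("status").item(0).getTextContent().trim();
            String amount = e.getElementsByTagName("amount").item(0).getTextContent().trim();
            out.add(new Payment(e.getAttribute("id"), status, Double.parseDouble(amount)));
        }
        return out;
    }

    // Parse many documents in parallel on a fixed-size thread pool.
    static List<Payment> parseAll(List<String> xmlDocs, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<Payment>>> futures = new ArrayList<>();
            for (String xml : xmlDocs) {
                futures.add(pool.submit(() -> parse(xml)));
            }
            List<Payment> all = new ArrayList<>();
            for (Future<List<Payment>> f : futures) {
                all.addAll(f.get()); // results are collected in submission order
            }
            return all;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> docs = Arrays.asList(
            "<payments><payment id=\"p1\"><status>FAILED</status><amount>10.50</amount></payment></payments>",
            "<payments><payment id=\"p2\"><status>OK</status><amount>3.00</amount></payment></payments>");
        for (Payment p : parseAll(docs, 2)) {
            System.out.println(p.id + " " + p.status + " " + p.amount);
        }
    }
}
```

The parsed records would then be written to the RDBMS tables. A scheduler such as Quartz or crontab would trigger a run like `main` at the chosen interval. Neither the database load nor the scheduling is shown here.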
Build a data migration flow from the RDBMS into Hadoop/Hive using Sqoop.
Create import tables in Hive.
Create a Sqoop-to-Hive data import script.
Verify the imported records and write an error for any mismatched records.
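
One simple way to approach the verification step is to compare per-table row counts from the RDBMS with the counts seen in Hive after the import. The sketch below assumes the counts have already been collected (for example with `SELECT COUNT(*)` on each side). The `ImportVerifier` class, the table names and the error message format are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ImportVerifier {

    // Compare row counts per table and return one error line for each mismatch.
    static List<String> findMismatches(Map<String, Long> rdbmsCounts, Map<String, Long> hiveCounts) {
        List<String> errors = new ArrayList<>();
        // Iterate over a sorted copy so the error lines come out in a stable order.
        for (Map.Entry<String, Long> e : new TreeMap<>(rdbmsCounts).entrySet()) {
            String table = e.getKey();
            long expected = e.getValue();
            Long actual = hiveCounts.get(table);
            if (actual == null) {
                errors.add("ERROR " + table + ": not found in Hive, expected " + expected + " rows");
            } else if (actual.longValue() != expected) {
                errors.add("ERROR " + table + ": expected " + expected + " rows, found " + actual);
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        Map<String, Long> rdbms = new HashMap<>();
        rdbms.put("payments", 100L);
        rdbms.put("issues", 40L);
        Map<String, Long> hive = new HashMap<>();
        hive.put("payments", 100L);
        hive.put("issues", 38L);
        for (String line : findMismatches(rdbms, hive)) {
            System.err.println(line);
        }
    }
}
```

A count comparison only catches missing or extra rows. Checking field-level differences would need a checksum or a row-by-row comparison.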
Run Hive analytic queries and store the output data in a result table (schedule the job to run).
Execute Hive joins for complex queries.
Write a UDF for data normalization.
Use Sqoop, driven by a shell script, to export data from Hive back to the RDBMS.
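
To illustrate the normalization UDF, here is a minimal sketch of the core logic in plain Java so that it runs without Hive on the classpath. The slides do not say which fields are normalized, so the status values and their mapping are assumptions. In a Hive deployment, a class like this would typically extend Hive's `org.apache.hadoop.hive.ql.exec.UDF` base class and expose the same `evaluate` method.

```java
import java.util.Locale;

// Plain-Java stand-in for a Hive UDF. With Hive on the classpath, this class
// would extend org.apache.hadoop.hive.ql.exec.UDF and keep the evaluate method.
class NormalizeStatus {

    // Map free-form status strings onto a small canonical vocabulary.
    // The vocabulary here is hypothetical; adapt it to the real payment data.
    public String evaluate(String raw) {
        if (raw == null) {
            return null; // keep SQL NULL as NULL
        }
        String s = raw.trim().toUpperCase(Locale.ROOT);
        switch (s) {
            case "":
                return null; // treat blank values as missing
            case "OK":
            case "SUCCESS":
            case "SUCCEEDED":
                return "SUCCESS";
            case "FAIL":
            case "FAILED":
            case "ERROR":
                return "FAILED";
            default:
                return s; // unknown values pass through, trimmed and upper-cased
        }
    }

    public static void main(String[] args) {
        NormalizeStatus udf = new NormalizeStatus();
        for (String raw : new String[] {" ok ", "Failed", "pending"}) {
            System.out.println("[" + raw + "] -> " + udf.evaluate(raw));
        }
    }
}
```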
Visualize the output data in the RDBMS tables using open source or commercial tools such as Tableau.
Create a bar graph report to show trends in the issue rate.
Create a pie chart report for the distribution of payment data across issues.
Use HiveServer2 to connect and generate live analytic results.
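
Connecting to HiveServer2 from Java is usually done over JDBC. The sketch below assumes the Hive JDBC driver (`org.apache.hive.jdbc.HiveDriver`) is on the classpath and that a HiveServer2 instance is reachable. The host, port, database, user and table name are placeholders, not values from the slides.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class HiveServer2Client {

    // Build a HiveServer2 JDBC URL of the form jdbc:hive2://host:port/database.
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    // Count the rows of a table. Requires the Hive JDBC driver and a running server.
    // The table name is concatenated into the query, so it must come from trusted input.
    static long countRows(String url, String user, String table) throws SQLException {
        try (Connection conn = DriverManager.getConnection(url, user, "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM " + table)) {
            rs.next();
            return rs.getLong(1);
        }
    }

    public static void main(String[] args) throws SQLException {
        // Placeholder connection details; 10000 is the usual HiveServer2 port.
        String url = jdbcUrl("localhost", 10000, "default");
        System.out.println("Connecting to " + url);
        if (args.length == 2) {
            // Usage: java HiveServer2Client <user> <table>
            System.out.println(countRows(url, args[0], args[1]));
        }
    }
}
```

Reporting tools that support Hive generally connect through the same HiveServer2 endpoint via a JDBC or ODBC driver.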
Email: onlinetraining2011@gmail.com
Skype: onlinetraining2011
Some live session videos:
Hadoop installation: http://www.youtube.com/watch?v=i9yckEduQBE
HDFS file system lab: http://www.youtube.com/watch?v=ZIpJ5LUWNUw
LinkedIn group: http://www.linkedin.com/groups/Online-Hadoop-Training-4838165
Blog: http://onlinetraining2011.blogspot.com
Project duration: 20 hours
Training medium: online via GoToMeeting

Hadoop live project Payment Gateway Data Analytics
