Prasad Gaikwad
8087002445
prasadgaikwad09@gmail.com
Big Data Hadoop developer with over three years of experience in using cutting-edge
technologies such as Big Data on Cloud, along with machine learning and data visualization
& discovery, to help businesses identify new opportunities and create disruptive business
models. Leading and mentoring a team of solution developers in integrating new-age
technologies with enterprise ETL and DW appliances. Won the Tata Technologies Spot Award
for proactive involvement in performance tuning and automating manual deliverables.
Professional Experience
Big Data Hadoop Developer | Lead
Aug '15 to present
Tata Technologies
Currently working as Big Data lead in the digital team for Tata Motors Ltd, a major Indian
automotive manufacturer. Helping the business identify value-added insights by designing and
implementing cost-effective cloud-based solutions, integrating open source technologies
viz., Spark, Hive, Sqoop, Kafka and Oozie with enterprise applications such as SAP, CRM and
Cordys. Implementing multiple fast-paced PoCs to validate concepts and mature them into
projects if successful.
• Hands-on experience in using Spark, Hive, Sqoop, Kafka, Oozie, Hue, Ambari and Zeppelin
for ingesting, cleaning and integrating enterprise-wide application data.
• Excellent understanding of the workings of Hadoop and Spark internals such as HDFS,
MapReduce, YARN, RDDs, DataFrames and Datasets.
• Designing solution architectures based on an excellent understanding of various cloud offerings
and hands-on experience in provisioning and managing resources such as Amazon EMR,
EC2, S3, Redshift, RDS, Kinesis, Lambda, Google Compute Engine, GCS, BigQuery and
Azure HDInsight.
• Setting up and using a multi-node EMR cluster, a Cloudera CDH cluster on AWS using Cloudera
Director, and an on-premise HDP Hadoop cluster using Ambari.
• Active participation in summits, sessions and hands-on workshops on large-scale data
processing led by solution architects and SMEs from industry leaders such as AWS, Cloudera,
Teradata, Microsoft and Google, to evaluate and understand the Big Data appliances and data
lake solutions offered, on-premise vs. on-cloud architecture, and how they fit into the current
enterprise IT landscape.
Projects
Profix DataMart – Tata Motors (AWS) - AWS EMR, Spark, Hive, Sqoop, Oozie, SAS
Designed and deployed a data mart on top of Amazon S3, establishing ODBC connectivity for
the SAS modelling team via Hive on Amazon EMR. Automated the daily refresh of data from
the Teradata EDW box using a combination of Sqoop, Spark and Oozie to create daily ETL jobs
running on an on-demand EMR cluster.
Project Wave – Tata Motors (AWS) - AWS EMR, Spark, Hive, S3, Tableau
Designed and deployed trend analysis dashboards in Tableau to predict the impact of
fluctuations in the commodity market on VC cost, using Hive, Spark, S3, Hue and Zeppelin.
Automated provisioning of EMR clusters with spot instances for executing batch ETL
workloads.
Vehicle stoppage analysis – Tata Motors (GCP) - Google BigQuery, Python, MS-SQL, Tableau
Designed and deployed a pure-cloud Big Data solution for telemetry data to convert live
tracking of vehicles into stoppage heat maps. Used clustering algorithms to identify points of
interest to the business based on the most frequent stoppages across India.
PoCs
SAP BOM data explosion using Hive, Pig, Spark and Tableau (on-premise)
Deployed an on-premise HDP cluster to develop interactive Tableau dashboards displaying
component-, vehicle- and plant-wise cost variations. Used Pig, Hive, Spark and Ambari for
ingesting and integrating BOM data from SAP BW with CRM.
Designing and developing Data Lake strategy
Developing a data lake strategy for ingesting structured, semi-structured and unstructured
data generated by various applications in the current enterprise landscape and exposing only
relevant data to end users.
Solution Developer | ETL Lead
January 2014 to Jul '15
Tata Technologies
Worked as ETL lead in the Business Intelligence team for Tata Motors Ltd, a major Indian
automotive manufacturer. Integrated Siebel CRM, SAP and Cordys Portal data using
Informatica in a Teradata EDW (10 TB+). Delivered end-to-end business solutions,
implemented PoCs, tuned performance and provided production support for one of the
largest CRM deployments. Technical lead for ETL developers, responsible for deploying CRs
in production and for monitoring and troubleshooting execution of the nightly ETL.
Hands-on work experience in
• Providing L2 and L3 support on a production environment serving 5000+ application end
users on the reporting engine (OBIEE).
• Performance tuning – nearly 40% improvement in ETL execution time, from 4 hours down to 2.5
hours, by implementing Informatica, DAC and Teradata best practices.
• Understanding business requirements for designing end-to-end module deployment.
• Building and maintaining complex ETLs for data integration across business applications,
with about 1000+ mappings, 600+ tables and a 10+ TB relational database on Teradata, using
Informatica PowerCenter and DAC.
• Building custom data models to integrate enterprise data with CSV uploads to create a
complete picture of the business landscape.
• Debugging and resolving ETL failures and data discrepancies in existing models.
• Shell scripting and writing cron jobs for executing Teradata scripts and ftp/sftp transfers
between application servers.
Technical Proficiency
Big Data Technologies: Hadoop (Cloudera CDH, Hortonworks HDP, AWS EMR), HDFS, Hive, Kafka, MapReduce, Oozie, Pig, Spark, Sqoop, ZooKeeper, Zeppelin
Cloud Vendors: Amazon AWS, Google Cloud, Microsoft Azure
Databases: Teradata, MongoDB, Google BigQuery, Oracle, MySQL
Tools: Informatica, DAC, Teradata Utilities
Monitoring and Reporting: Apache Ambari, Hue, Cloudera Manager, TD Viewpoint, OBIEE, Tableau
Programming/Scripting Languages: SQL, Python, Java, Scala
Operating Systems: RHEL, CentOS, Fedora, Windows
Academic Qualifications
• B.E. in Information Technology from Walchand Institute of Technology, Solapur, with 72%
• Diploma in Computer Engineering from Govt. Polytechnic Mumbai with 81.3%
• S.S.C. with 88.3%
