Sanath Pabba
Mobile: +1-416-893-9098 Email: sanath.sanny@gmail.com Skype ID: live:sanath.sanny_1
Synopsis
 Engineering professional with 5+ years of experience in development, maintenance, and production
support of Big Data technologies.
 Worked directly with clients on critical requirements, independently delivered many applications,
and contributed many ideas that improved system efficiency.
 Wrote 12 automated scripts that streamlined routine duties and saved 70-80 hours of effort per month.
 Out of the 12 automations, 4 delivered an 8-10% improvement in data quality, reducing the data
preparation teams' effort.
 Expertise with tools in the Hadoop ecosystem, including Pig, Hive, HDFS, MapReduce, Sqoop, Storm,
Spark, Kafka, YARN, Oozie, and ZooKeeper.
 Excellent knowledge of Hadoop ecosystem components such as HDFS, JobTracker, TaskTracker, NameNode,
DataNode, and the MapReduce programming paradigm.
 Experience in designing and developing applications in Spark using Scala to compare the performance
of Spark with Hive and SQL/Oracle.
 Good exposure to the Agile software development process.
 Experience in manipulating/analysing large datasets and finding patterns and insights within structured
and unstructured data.
 Strong experience with Hadoop distributions such as Cloudera, MapR, and Hortonworks.
 Good understanding of NoSQL databases and hands-on work experience writing applications on NoSQL
databases such as HBase and MongoDB.
 Experienced in writing complex MapReduce programs that work with different file formats such as Text,
Sequence, XML, Parquet, and Avro.
 Experience with the Oozie workflow scheduler to manage Hadoop jobs as Directed Acyclic Graphs (DAGs)
of actions with control flows.
 Experience in migrating data using Sqoop from HDFS to relational DBMSs and vice versa.
 Extensive experience in importing and exporting data using stream-processing platforms such as Flume and Kafka.
 Very good experience across the complete project life cycle (design, development, testing, and
implementation) of client-server and web applications.
 Strong knowledge of data warehousing and ETL concepts using Informatica PowerCenter, OLAP, and OLTP.
 Expert knowledge of the banking, insurance, and manufacturing domains.
 Excellent problem solver with a strong technical background; result-oriented team player with
excellent communication and interpersonal skills.
 Worked on creating RDDs and DataFrames for the required input data and performed data transformations
using Spark Core.
 Hands-on experience in Scala programming and Spark components such as Spark Core and Spark SQL.
Technical Skill Set
 Big Data Technologies: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Spark, Kafka, ZooKeeper, and Oozie
 Languages: C, Java, Scala, Python, SQL, PL/SQL, Pig Latin, HiveQL, JavaScript, Shell Scripting
 Development Methodologies: Agile/Scrum, Waterfall
 Version Control Tools: Git, SVN, Bitbucket
 RDBMS: Oracle, SQL Server
 Build Tools: Jenkins, Maven, ANT
 Business Intelligence Tools: Tableau, Splunk, QlikView, Alteryx
 IDE: IntelliJ IDEA
 Cloud Environment: AWS
 Scripting: Unix shell scripting, Python scripting
 Scheduling: Maestro
Career Highlights
Proficiency Forte
 Extensively worked on extraction, transformation, and loading of data from various sources such as
DB2, Oracle, and flat files.
 Strong skills in data requirement analysis and data mapping for ETL processes.
 Well versed in developing SQL queries, unions, and multiple-table joins.
 Well versed with UNIX commands; able to write shell scripts, and developed several scripts to reduce
manual intervention as part of the Job Monitoring Automation process.
Work experience
 Sep 2019 to date: Walmart Canada, Customer Experience Specialist (part-time).
 Jan 2018 to Mar 2019: Infosys Limited, Senior System Engineer.
 Apr 2015 to Jan 2018: NTT Data Global Delivery Services, Application Software Development Consultant.
Project Details
Company : Infosys Limited.
Project : Enterprise Business Solution (MetLife Inc.)
Environment : Hadoop, Spark, Spark SQL, Scala, SQL Server, shell scripting.
Scope: EBS is a project in which Informatica pulls data from SFDC and sends it to Big Data at RDZ. The
Big Data process kicks off when the trigger file, control file, and data files are received, and all
files go through validation checks. After all transformations are done, the data is stored in Hive
tables pointing to HDFS locations. The data is then synced to Big SQL, and downstream processing is
handled by the QlikView team.
Roles & Responsibilities:
 Wrote Sqoop jobs that load data from DBMSs into Hadoop environments.
 Prepared Scala code that invokes Spark scripts for data loads, pre-validations, data preparation,
and post-validations.
 Prepared shell automation scripts that fetch data utilization across the cluster and notify the
admins every hour, sparing the admin team regular manual monitoring checks.
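The pre-load validation step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the file names and the convention that the control file's first line holds the expected record count are assumptions for the example.

```python
import os

def validate_feed(trigger_path, control_path, data_path):
    """Illustrative pre-load checks: all three files must exist, and the
    record count declared in the control file must match the data file."""
    for path in (trigger_path, control_path, data_path):
        if not os.path.exists(path):
            return False, f"missing file: {path}"
    # Assumption: the control file's first line is the expected record count.
    with open(control_path) as f:
        expected = int(f.readline().strip())
    with open(data_path) as f:
        actual = sum(1 for _ in f)
    if expected != actual:
        return False, f"count mismatch: expected {expected}, got {actual}"
    return True, "ok"
```

In practice a check like this would run before the Spark load is triggered, so a bad feed fails fast instead of partway through the transformations.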
Company : Infosys Limited.
Project : BluePrism (MetLife Inc.)
Environment : Spark, Spark SQL, Scala, Sqoop, SQL Server, shell scripting.
Scope: BluePrism is a source application with SQL Server as its database. Big Data extracts the data
from the BluePrism environments, merges the two sources into one, and loads the data into the Hive
database. Big Data also archives the data into corresponding history tables, either monthly or on an
ad-hoc basis, based on a trigger file received from BluePrism. This is a weekly extract from SQL
Server using Sqoop, after which the data is loaded through Scala. Jobs are scheduled in Maestro.
Roles & Responsibilities:
 Prepared data-loading shell scripts that invoke Sqoop jobs.
 Implemented data-merging functionality that pulls data from various environments.
 Developed scripts that back up data in Avro format.
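The merge of the two BluePrism sources can be illustrated with a small sketch. The key field name and the rule that the primary environment wins on conflicts are assumptions for the example; the project's real key columns and precedence rules are not stated above.

```python
def merge_sources(primary, secondary, key="id"):
    """Merge two source extracts into one dataset.
    Records from `primary` win when both environments share a key."""
    merged = {rec[key]: rec for rec in secondary}
    # Overwrite duplicates with the primary environment's record.
    merged.update({rec[key]: rec for rec in primary})
    return sorted(merged.values(), key=lambda r: r[key])
```

The same keyed-overwrite idea carries over directly to a Spark join or a Hive `INSERT OVERWRITE` at scale.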
Company : Infosys Limited.
Project : Gross Processing Margin Reports (MassMutual)
Environment : Spark, Spark SQL, Scala, Sqoop, SQL Server, shell scripting.
Scope: In GPM Reports, we receive the input .csv from the business. Based on the client request, we
generate 6 reports, receiving a trigger file and a data file for each report. Using a shell script, we
validate the trigger files, the input file, and the corresponding paths in Linux and HDFS; if every
validation succeeds, we invoke the Hive script to generate the output file and place it in Linux. We
then append the data to Hive tables based on the output file. The Pig scripts are being migrated to
Spark scripts written in Scala, with report generation handled by Spark and results stored in a Linux
directory.
Roles & Responsibilities:
 Prepared data according to business needs using HiveQL and Spark RDDs.
 Using Hadoop, loaded and prepared the data, implemented filters to remove unwanted and uncertain
fields, and merged all 6 reports from various teams.
 Implemented 8 pre-validation rules and 7 post-validation rules covering data counts, required
fields, and needed changes; as part of post-validation, the data is moved to the HDFS archive path.
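A rule-driven validation pass like the 8 pre-validation / 7 post-validation rules above can be sketched as a list of named predicates applied to each record. The two sample rules shown here (a required `report_id` field, a non-negative `count`) are hypothetical stand-ins; the project's actual rules are not listed above.

```python
def run_rules(record, rules):
    """Apply (name, predicate) validation rules to a record and
    return the names of the rules that failed."""
    return [name for name, check in rules if not check(record)]

# Hypothetical examples of the kind of rules used.
PRE_RULES = [
    ("has_report_id", lambda r: bool(r.get("report_id"))),
    ("count_non_negative", lambda r: r.get("count", -1) >= 0),
]
```

Keeping the rules as data makes it cheap to go from 8 rules to 9: append a tuple instead of editing control flow.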
Company : NTT Data Global Delivery Services.
Project : Compliance Apps (National Life Group)
Environment : Spark, Spark SQL, Scala, SQL Server, shell scripting, Pig.
Scope: Compliance Apps is a group of nine admin systems (Annuity Host, ERL, MRPS, CDI, SmartApp, WMA,
VRPS, PMACS, SBR). The process loads the data files for the nine admin systems into HDFS based on the
trigger files received. Three types of load take place:
1. Base load
2. Full load
3. Delta load
The workflow loads the data files into HDFS locations and into Hive tables. We create Hive tables with
an optimized compressed format and load the data into them, write the Hive script for the full load,
and write the shell script that builds the workflow. We use Pig/Spark for the delta loads and a shell
script that invokes Hive for the full load and history processing; the jobs are then scheduled in
Maestro for the daily run. Initially, the delta load used Pig scripts.
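The distinction between the full and delta loads can be illustrated with a small sketch: a delta load compares the latest extract against the base snapshot by key and emits only the inserts and changed records. The key column name and record shape here are assumptions for the example, not the admin systems' actual schemas.

```python
def delta_load(base, extract, key="id"):
    """Split a new extract into inserts (keys absent from base) and
    updates (keys present in base whose payload changed)."""
    base_by_key = {rec[key]: rec for rec in base}
    inserts, updates = [], []
    for rec in extract:
        old = base_by_key.get(rec[key])
        if old is None:
            inserts.append(rec)
        elif old != rec:
            updates.append(rec)
    return inserts, updates
```

A full load, by contrast, simply replaces the whole table; the delta path exists because reprocessing only the changed records is far cheaper for a daily run.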
Company : NTT Data Global Delivery Services.
Project : Manufacturing Cost Walk Analysis (Honeywell)
Environment : Sqoop, shell scripting, Hive.
Scope: The Manufacturing Cost Walk application stores information about the products manufactured by
Honeywell. The data had been kept in SharePoint lists on a weekly basis, but handling it there was
difficult because of long processing times, so we proposed a solution based on Hive and Sqoop. Since
the source files are generated as .csv and .xls, we began importing the data into Hive and processing
it according to their requirements.
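The .csv ingestion step can be sketched as a small normalization pass before the Hive load, using Python's standard `csv` module. The column names (`part_no`, `cost`) are hypothetical; the real extract layout is not described above.

```python
import csv
import io

def normalize_csv(raw_text, required=("part_no", "cost")):
    """Read a weekly .csv extract, strip stray whitespace, and drop
    rows missing required fields, so rows are ready for a Hive load."""
    rows = []
    for row in csv.DictReader(io.StringIO(raw_text)):
        cleaned = {k.strip(): (v or "").strip() for k, v in row.items()}
        if all(cleaned.get(col) for col in required):
            rows.append(cleaned)
    return rows
```

Cleaning at ingestion keeps malformed weekly rows out of the Hive tables instead of surfacing later as bad report figures.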
Academia
 Completed a postgraduate program in Project Management at Loyalist College (Sep 2019 – May 2020)
 Completed a diploma in Artificial Intelligence and Data Visualization at the Indian Institute of
Information Technology (Hyderabad) (April 2019 – Aug 2019)
 Completed graduation under JNTU-HYD in Electronics and Communications (Aug 2011 – May 2014)