VIJAY MURALIDHARAN
BIG DATA ENGINEER
38, Batson Street, Glasgow, G427HD UK C: +44 (0)7459030502 | vijay27101990@outlook.com
Summary
Experienced Hadoop administrator and developer with a strong background in distributed file systems in the Big Data arena. Understands the complex processing needs of big data and has experience developing code and modules to address those needs. Brings a Master's degree in Cloud Computing along with certifications as an administrator and developer using Apache Hadoop.
Core Qualifications
• Programming Languages – Java, Scala, Python, C++
• Tools – IntelliJ, GitHub, Eclipse, Notebook
• MapReduce – Hadoop/HDFS (Hortonworks, Cloudera), Hive, Pig, Spark, Sqoop, Spark Streaming, Kafka, Flume, Oozie, EMR
• Cloud – AWS/EC2/EMR/S3
• SQL/NoSQL – Hive, Spark SQL, Cassandra
• APIs – LinkedIn, Twitter, general RESTful concepts
• Web – HTML, CSS, MySQL
• OS – Linux/Unix, Windows
• Testing – Manual, black-box testing, MR unit testing
• Scripting – Bash/Shell
• Security Tools (Hadoop) – Kerberos, Knox, Ranger
Professional Profile
I have experience as both a Hadoop administrator and a developer. A professional synopsis follows.
Hadoop Administrator:
• Experience with Apache Hadoop distributions from Hortonworks (HDP) and Cloudera.
• Configured a multi-node cluster on the Hortonworks Data Platform, built a POC (proof of concept) pre-production cluster on virtual machines, and wrote shell scripts for deploying multi-node clusters.
• Extensive experience installing, configuring, and using ecosystem components such as Hadoop, MapReduce, HDFS, Hive, Pig, Oozie, Sqoop, Flume, and Kafka.
• Configured and tuned the Capacity Scheduler to optimize the development environment.
• Implemented High Availability for the NameNode, ResourceManager, and MySQL, covering both automatic and manual failover.
• Strong knowledge and understanding of Hadoop security tools: MIT Kerberos, Ranger, and Knox.
• Worked with peers in development to tune infrastructure and plan for resource management, including adding and removing cluster nodes for maintenance or capacity needs.
• Translated business requirements into system requirements.
Hadoop Developer:
• Strong knowledge and understanding of Hadoop HDFS and MapReduce concepts and the Hadoop ecosystem.
• Created use cases using massive public datasets. Ran performance tests to verify the efficiency of MapReduce, Hive, and Pig.
• Explored Spark and Kafka, along with other open-source projects, to create a real-time analytics framework. Designed and worked on the complete data pipeline for ETL, analysis, and visualization.
• Loaded data into the Hadoop cluster from multiple existing data sources.
• Collaborated with peers writing automation scripts in Oozie.
• Developed MapReduce programs in Java.
• Worked on AWS, including S3, EC2, and EMR.
• Designed applications using UML (sequence diagrams, use case diagrams, entity-relationship diagrams).
Experience
Big Data Engineer February 2016 to Current
Cloudwick Technologies UK – Glasgow, Scotland
UKDA: Big Data Developer
Responsibilities:
• Worked on a live 60-node cluster running HDP 2.4.
• Worked with highly unstructured and semi-structured data of 90 TB in size (270 TB with a replication factor of 3).
• Extensive experience writing Pig scripts to transform raw data from several data sources into baseline data.
• Developed Hive scripts for end-user/analyst requirements for ad hoc analysis.
• Developed Oozie workflows for scheduling and orchestrating the ETL process.
• Worked with the admin team on designing and carrying out the upgrade from HDP 2.4 to HDP 2.5.
• Very good experience managing and monitoring the Hadoop cluster using Ambari.
• Good working knowledge of Hortonworks and Cloudera.
• Good working knowledge of Tableau.
• Involved in Hadoop cluster environment administration, including adding and removing cluster nodes, cluster capacity planning, performance tuning, cluster monitoring, and troubleshooting.
• Implemented authentication using Kerberos and authorisation using Ranger.
• Involved in the design and development of the complete pipeline.
Cloudwick: Big Data Engineer
Responsibilities:
• Responsible for assessing cluster performance and status before an upgrade.
• Troubleshot issues during the upgrade and helped the team upgrade the cluster smoothly.
• Part of a support team involved in implementation and performance tuning using capacity schedulers in a multi-tenant cluster.
• Responsible for managing security for the Hadoop cluster using Knox, Kerberos, and Ranger.
• Assisted the team that implemented and managed security for the Hadoop cluster using Kerberos integration with Active Directory and OpenLDAP.
• Integrated Spark with big data visualisation tools such as Neo4j and Tableau.
• Responsible for assisting the team in building, operating, and monitoring QA and development clusters on physical hardware and cloud.
• Also responsible for documenting the entire project, training business users, and writing product user guides.
• Developed Sqoop scripts to enable interaction between Pig and the MySQL database.
• Worked with the solutions architecture team to verify Hadoop ecosystem tools for the different applications.
• Used a graph database (Neo4j) along with Python and R tools to cluster (K-means) the dataset, implemented regression techniques, and classified the data using machine learning algorithms (decision tree, random forest).
• Provided in-house support to onsite consultants and was involved in end-to-end application development.
Performance Evaluation of Distributed File Systems on Near Real-Time Applications (University Project)
• Captured the outcome of different workloads (structured, semi-structured, and unstructured) analysed using Hadoop services, based on measurements of CPU time, mappers and reducers launched, and storage.
• Also captured the outcome of different workloads by systematically tweaking JVM parameters (adjusting heap size, rapidly increasing the garbage collectors, specifying the mappers and reducers) on different services.
• Performed cost-based optimization on different tools (Hive, Pig).
• Also performed join and sort operations on skewed and messy workloads to measure how effective the tools are.
• Summarised the quantitative and qualitative strengths and weaknesses of the tools.
• Captured the outcome of all the operations performed based on different factors and concluded which services are better suited to which kind of data workload.
E-Sign Technologies July 2012 to August 2014
Software Test Engineer
Responsibilities:
• Testing software to identify and resolve problems from an end-user perspective.
• In charge of testing developed software against specific conditions.
• Accurately monitoring and recording results in test documentation.
• Monitoring the testing process and identifying and logging test failures.
• Performing peer reviews and estimates.
• Involved in performance testing, stress and load testing, UAT testing, smoke testing, and unit testing.
• Testing full product suites, identifying problems, and resolving them with the development team.
Education
Masters in Cloud Computing 2016
University of Leicester – Leicester, UK
Bachelors in Computer Science Engineering 2012
Hindustan University
Certifications
Hortonworks Certified Administrator for Apache Hadoop (HDPCA) June 2016
