Anil_BigData Resume
Anil Kumar
Phone: +91 8588899595
Email: anil.mmec@gmail.com
Proficient in Hadoop and its ecosystem (MapReduce, Pig, Hive, Sqoop, etc.) as a developer.
Exposure to MapR clusters and MCS (MapR Control System).
MapR certified Hadoop Professional: Developer (MCHP: D).
About 3 years of experience in the IT industry, focused on delivering assigned tasks, adding value, and
satisfying the customer.
Expertise in Core Java.
Strong communication skills and a professional approach to knowledge sharing and teamwork; have mostly
worked in customer-facing roles.
Role Responsibilities
Interacted with the client to gather requirements.
Involved in identifying the use cases and design.
Installed and configured the MapR Hadoop cluster multiple times.
Involved in MapR cluster maintenance and configuration tasks.
Developed generic MapReduce algorithms that can be reused across a variety of problems.
Implemented various concepts (e.g., joins, secondary sort) in MapReduce applications using Java.
Wrote Hive queries for analytics.
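The secondary sort mentioned above is normally implemented in Hadoop with a composite key, a custom partitioner, and a grouping comparator. As a rough illustration only (not the original project code, and with made-up account data), the core idea can be sketched in plain Java:

```java
import java.util.*;

public class SecondarySortSketch {
    // A record with a natural key (e.g., an account id) and a sortable value.
    record Entry(String key, int value) {}

    // Sort by (key, value) and then group: the plain-Java analogue of
    // Hadoop's composite key plus grouping comparator, so each key's
    // values reach the "reducer" already in order.
    static Map<String, List<Integer>> sortAndGroup(List<Entry> entries) {
        List<Entry> sorted = new ArrayList<>(entries);
        sorted.sort(Comparator.comparing(Entry::key).thenComparingInt(Entry::value));
        Map<String, List<Integer>> grouped = new LinkedHashMap<>();
        for (Entry e : sorted) {
            grouped.computeIfAbsent(e.key(), k -> new ArrayList<>()).add(e.value());
        }
        return grouped;
    }

    public static void main(String[] args) {
        var grouped = sortAndGroup(List.of(
            new Entry("acct-2", 30), new Entry("acct-1", 50),
            new Entry("acct-1", 10), new Entry("acct-2", 20)));
        System.out.println(grouped); // {acct-1=[10, 50], acct-2=[20, 30]}
    }
}
```

In a real Hadoop job the sorting happens in the shuffle, not in memory; this sketch only shows what ordering the composite key buys.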
Environment
Hadoop, HDFS, MapR, MapReduce, R, SparkSQL, Pig, Hive, Ansible, Sqoop, Core Java, Linux, Windows 7 and AWS
Technical Skills
Software MapR, MCS, MapReduce, HDFS, Hadoop ecosystem (Hive, Pig, Sqoop), R, AWS,
Linux, Windows XP, Windows 7, MySQL
Hardware Intel x86 architecture machines
Tools Eclipse Galileo, Eclipse Helios, Apache Tomcat 6.0.33
SDLC Methodology Waterfall, Agile
Domain Experience Java/J2EE, Big Data technologies
Business
Development
A practical, quality-focused approach to business development.
Achievements
Received the “Star Performer Award” from the DESS India Head for a continuous-learning attitude and dedication
shown in deliverables.
Received appreciation mails from American Express for hard work and dedication shown in deliverables.
Received Kudos from TCS Leadership during ILP.
Consistently enhancing competencies in Big Data: Hadoop MapReduce, Pig, and Hive.
Completed IBM Big Data University certifications in Hadoop, MapReduce, Hive, Pig, and data transfer tools.
Teaching experience during M.Tech at NIT Kurukshetra.
Secured 98 percentile in GATE 2010.
TCS Experience Summary
Project BigData Charge Back, Pfizer: Pfizer Inc. is an American multinational pharmaceutical
corporation headquartered in New York City, New York, with its research headquarters
in Groton, Connecticut. It is one of the world's largest pharmaceutical companies
by revenue.
Role Hadoop Developer and Cluster Maintenance
Period Mar’15 to present
Tools & Technologies MapR cluster installation, MapR upgrade, MCS, MapR cluster maintenance and
configuration tasks, basics of R programming, RStudio, shell scripts, SparkSQL,
Ansible, IPython, Zeppelin, Hive and Hue.
Description Architected and evaluated various technical components for the Data Lake
implementation at Pfizer.
Pfizer’s vision is to provide Business Analytics & Insights (BAI) with a collaborative,
global data-discovery and analytics platform and tools.
Installed and configured a 5-node MapR Hadoop cluster multiple times.
Wrote shell scripts to automate MapR cluster maintenance and configuration tasks
(e.g., service restarts, volume management).
Upgraded MapR from 4.0.1 to 4.0.2.
Ran TeraGen, TeraSort and DFSIO benchmarks on the MapR and Spark clusters.
Deployed MapR clusters on AWS using customized Ansible playbooks.
Configured ecosystem components in the cluster.
Configured the web UI for Spark.
Configured Spark on YARN.
Configured data-wrangling tools on the cluster.
Monitored the cluster using Ganglia.
Project Out-of-Pattern Analysis for the Amex client
Role Hadoop Developer
Period Apr'14 to Feb'15
Technology Stack Operating System: Linux
Programming Language: Java, JDBC
Hadoop Vendor: MapR 3.0.2
Data Storage: HDFS, SQL Server 2005, Hive
Data Processing: MapReduce
Data Access: Hive, Sqoop
Description Contributed as a Hadoop developer in a Big Data project for a leading banking client.
Worked on a cluster of 600 nodes.
Proposed different use cases and was closely involved in requirement gathering with
the BA.
The project scope covered handling huge structured (CSV, RDBMS) and semi-
structured data (XML files) stored in a mainframe system.
An FTP client was designed in Java to transfer data from the mainframe to the
Hadoop platform.
Generic MapReduce programs were designed for data cleansing and data
filtering.
Statistics were calculated on the raw data using a control file through MapReduce.
Alerts were generated via the Hadoop system whenever the statistics showed
something out of pattern.
Data was loaded into Hive tables for further analysis.
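The out-of-pattern check above was driven by control files in MapReduce; as a hypothetical illustration only (made-up spend figures and threshold, not the project's actual logic), the gist of such a statistics-based alert can be shown with a simple z-score test in plain Java:

```java
import java.util.List;

public class OutOfPatternSketch {
    // Flag a value more than `k` standard deviations from the mean of its
    // history -- a simple stand-in for control-file-driven stats checks.
    static boolean isOutOfPattern(List<Double> history, double current, double k) {
        double mean = history.stream().mapToDouble(Double::doubleValue).average().orElse(0);
        double var = history.stream().mapToDouble(v -> (v - mean) * (v - mean)).average().orElse(0);
        double std = Math.sqrt(var);
        return std > 0 && Math.abs(current - mean) > k * std;
    }

    public static void main(String[] args) {
        // Hypothetical per-transaction spend history for one card.
        List<Double> spend = List.of(100.0, 110.0, 95.0, 105.0, 90.0);
        System.out.println(isOutOfPattern(spend, 104.0, 3)); // false: within pattern
        System.out.println(isOutOfPattern(spend, 500.0, 3)); // true: raise an alert
    }
}
```

In the actual project the equivalent aggregation ran as MapReduce over HDFS data; this sketch only captures the decision rule.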
Project Analytics, Big Data and Information Management for TCS
Role Hadoop Developer
Period Jan’2014 – Mar’2014
Technology Stack Operating System: CentOS 6.4 (64-bit)
Programming Language: Java
Hadoop Vendor: Cloudera CDH4u0 with Cloudera Manager 4.0
Data Storage: HDFS, MySQL
Data Processing: MapReduce 1.0
Data Access: Pig, Hive, Sqoop
Visualization: Tableau
Description of Project
Activities
Worked on internal case studies such as credit card fraud analysis, where we
needed to find the fraudulent transactions made with a credit card.
A cluster of 7 nodes was set up using Cloudera Manager 4.0.
Various data sources and formats such as CSV, XML, flat files and RDBMS were used to
provide different parts of the data.
The data from the different sources was transferred to HDFS through Sqoop and
the Java Hadoop API.
Filtration of junk and bad records was performed through Pig scripts.
Hive queries were used to perform the analysis.
Visual analysis was implemented using Tableau.
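The filtration step above ran as Pig scripts on the cluster; as a rough plain-Java analogue (hypothetical field layout and validation rules, not the project's Pig code), dropping junk and bad records looks like this:

```java
import java.util.*;
import java.util.stream.*;

public class FilterSketch {
    // Keep only rows with exactly 3 comma-separated fields and a numeric
    // amount in the last field -- an assumed schema for illustration.
    static List<String[]> cleanRows(List<String> csvLines) {
        return csvLines.stream()
            .map(line -> line.split(","))
            .filter(fields -> fields.length == 3)                        // drop junk/short rows
            .filter(fields -> fields[2].matches("-?\\d+(\\.\\d+)?"))     // drop non-numeric amounts
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> rows = List.of("tx1,cardA,120.50", "junk", "tx2,cardB,abc", "tx3,cardA,75");
        System.out.println(cleanRows(rows).size()); // 2 valid rows survive
    }
}
```

The same predicate-style filtering maps directly onto a Pig `FILTER` statement or a MapReduce map phase.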
Project Analytics, Big Data and Information Management for TCS
Role Hadoop Developer
Period Nov’2013 – Dec’2013
Technology Stack Operating System: CentOS 6.4 (64-bit)
Programming Language: Java
Hadoop Vendor: Cloudera CDH3u4 with Cloudera Manager 4.0
Data Storage: HDFS, MySQL
Data Processing: MapReduce 1.0
Data Access: Pig, Hive, Sqoop
Visualization: Tableau
Description of Project
Activities
Worked on internal case studies such as benchmarking the MapReduce framework
with respect to data processing, and then analyzing the data to find hidden trends.
A cluster of 7 nodes was set up using Cloudera Manager 4.0.
Various data sources such as CSV, XML and RDBMS were used.
The data from the different sources was transferred to HDFS through Sqoop.
Implemented MapReduce on different sizes of data.
Increased and decreased the number of mappers and reducers at run time.
Checked the performance by introducing the combiner and partitioner concepts.
Filtration of junk and bad records was done using Pig scripts.
Hive queries were used to perform the analysis.
Visual analysis was done using Tableau.
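The combiner experiment above rests on a simple idea: pre-aggregating map output locally shrinks what is shuffled to the reducers. A minimal plain-Java sketch of that effect on a word-count-style workload (illustrative data, not the benchmark itself):

```java
import java.util.*;

public class CombinerSketch {
    // Map phase without a combiner: one (word, 1) pair per token,
    // all of which would cross the network in the shuffle.
    static List<Map.Entry<String, Integer>> mapOnly(List<String> lines) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String line : lines)
            for (String w : line.split("\\s+"))
                out.add(Map.entry(w, 1));
        return out;
    }

    // With a combiner: counts are pre-aggregated on the mapper side,
    // so far fewer records are shuffled to the reducers.
    static Map<String, Integer> withCombiner(List<String> lines) {
        Map<String, Integer> out = new HashMap<>();
        for (String line : lines)
            for (String w : line.split("\\s+"))
                out.merge(w, 1, Integer::sum);
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("big data big cluster", "big data");
        System.out.println(mapOnly(lines).size());      // 6 pairs shuffled without a combiner
        System.out.println(withCombiner(lines).size()); // only 3 after local aggregation
    }
}
```

In Hadoop the same pre-aggregation is enabled with `job.setCombinerClass(...)`, typically reusing the reducer when the operation is associative and commutative, as counting is.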
Project Initial Learning Program
Role Team Leader
Technology Stack Operating System: Windows 7
Programming Language: Java, J2EE, HTML, and Oracle.
Achievement Got ILP Kudos Award at ILP Hyderabad.
Description An e-recruitment website enabling the establishment of a streamlined, consistent
recruitment process that reduces manual effort and improves the experience for applicants
and for staff in faculties and divisions seeking to fill vacant positions.
Experience Details
TCS Experience 2 years, 10 months
Prev. Experience 0 years, 0 months
Total Experience 2 years, 10 months
Education Summary
Qualification College Subject
Bachelor of Technology MMEC Mullana Information Technology
Master of Technology NIT Kurukshetra Computer Engineering