VIPIN K P
Email: vipinkprc@gmail.com Phone: +91 773 6531 979
+91 990 2321 979
Professional Summary
5+ years of Big Data experience in leading IT organizations, with extensive experience across a
range of technology implementations. Experienced Hadoop developer with a strong background in
distributed file systems in the big-data arena. Understands the complex processing needs of big
data and has experience developing code and modules to address those needs. Brings a Master's
degree in Computer Science along with certification as a Developer using Apache Hadoop.
Core Qualifications
Cloudera Certified Hadoop developer.
In-depth knowledge of and experience with distributed computing platforms such as Hadoop.
Expertise in Object Oriented Analysis and Design and core Java development.
Strong knowledge of SQL-like Big Data tooling, including the Hive data warehouse.
Able to assess business rules, collaborate with stakeholders and perform source-to-target
data mapping, design and review.
Hands on experience on Hortonworks and Cloudera Hadoop environments.
Expertise in writing Hadoop jobs for analyzing data using HiveQL (queries), Pig Latin
(data-flow language), and custom MapReduce programs in Java.
Solid understanding of NoSQL tools such as HBase, MongoDB and Cassandra.
Involved in the full development lifecycle, from requirements gathering through
development, using Repository Manager, Designer, Workflow Manager, and Workflow
Monitor. Extensively worked with large databases in production environments.
Extending Hive and Pig core functionality by writing custom UDFs.
Good knowledge of Hadoop cluster architecture and of monitoring the cluster.
An excellent team player and self-starter with good communication skills and proven
abilities to finish tasks before target deadlines.
Strong analytical and problem-solving skills.
Experience in ETL processes.
Expertise in Spark-Scala.
Areas of Expertise
• Big Data Ecosystems: Hadoop, MapReduce, HDFS, HBase, Zookeeper, Hive,
Pig, Sqoop, Cassandra, Oozie, Flume, Pentaho, Spark
• Programming Languages: Java, C/C++, Scala
• Scripting Languages: Bash
• Databases: NoSQL, Oracle, MySQL, MongoDB
• Scheduler: Autosys, Oozie
• Tools: Eclipse
• Platform: Windows, Linux, Mac
• Application Servers: Tomcat, Jboss
• Testing Tools: Eclipse, NetBeans
• Methodologies: Agile
• Version Control: Git, SVN
Professional Experience
Hadoop Developer
Isilon BDL product Jan 2017 – present
Client : EMC
Description: Facilitated insightful daily analyses of 60 to 80 GB of phone-home log data collected
from Isilon clusters: per-customer cluster usage, cluster health checks, generated
recommendations, and Tableau visualization.
Responsibilities:
• Interacted with the client during requirement gathering and suggested relevant technologies to
build the solution.
• Provided design recommendations and thought leadership to sponsors/stakeholders that
improved review processes and resolved technical problems.
• Developed MapReduce programs to parse the raw data, populate staging tables and store the
refined data in partitioned tables in the EDW.
• Enabled speedy reviews and first mover advantages by using Oozie to automate data loading
into the Hadoop Distributed File System and PIG to pre-process the data.
• Exported parsed data from HDFS to PostgreSQL with Sqoop.
• Managed and reviewed Hadoop log files.
• Completed testing of integration and tracked and solved defects.
• Tested raw data and executed performance scripts.
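The map/reduce parsing pattern in the responsibilities above can be sketched minimally in plain Python. The log layout ("id|timestamp|bytes") and field names here are invented for illustration; the real Isilon log format is not reproduced:

```python
from collections import defaultdict

def map_phase(lines):
    """Map step: emit (cluster_id, bytes_used) pairs from raw log lines.

    Assumes a hypothetical pipe-delimited "id|timestamp|bytes" layout.
    """
    for line in lines:
        cluster_id, _, used = line.split("|")
        yield cluster_id, int(used)

def reduce_phase(pairs):
    """Reduce step: sum usage per cluster, as the shuffle/reduce would."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

# Invented sample log lines standing in for raw cluster usage data.
logs = ["c1|2017-01-01|100", "c2|2017-01-01|40", "c1|2017-01-02|60"]
usage = reduce_phase(map_phase(logs))
```

In a real MapReduce job the map and reduce functions run distributed over HDFS blocks; this sketch only shows the shape of the computation.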
Hadoop Developer
PX4-Hadoop May 2015 – Dec 2016
Client : Apple
Description: Apple Music is an application developed by Apple to provide music to users. Users can
play, skip, scrub forward, scrub back and pause songs. Apple Music also has Beats 1 Radio to provide
broadcast music for its users. Autosys is the scheduler used to schedule the jobs. The project deals
with analysis of customer usage, payment calculation and royalty bearing, along with exporting data
to Teradata and other sources. At Apple we collect and analyse large amounts of data on a daily,
monthly and quarterly basis.
Responsibilities:
• Involved in requirement gathering, analysis and design document creation
• Introduced a metadata-driven architecture in the RINS aggregate
• Worked on a live 1000+ nodes Hadoop cluster running HDP2.2
• Developed Hive scripts for end user / analyst requirements to perform ad hoc analysis
• Applied a strong understanding of partitioning and bucketing concepts in Hive, and designed
both managed and external tables in Hive to optimize performance
• Solved performance issues in Hive through an understanding of joins, grouping and aggregation
and how they translate to MapReduce jobs.
• Developed UDFs and UDAF in Java as and when necessary to use in HIVE queries
• Developed Oozie workflow for scheduling and orchestrating the ETL process
• Created Autosys jobs to schedule the aggregate runs
• Introduced shell scripts to export the results to Teradata and the XT interface
• Created a testing document incorporating all the relevant scenarios
• Created CR artifacts and deployed them in production clusters.
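The Hive tuning work above hinges on how a GROUP BY translates into map, shuffle/sort and reduce phases. A toy sketch of that translation, with made-up rows and column names:

```python
from itertools import groupby
from operator import itemgetter

# Illustrative input: (user, plays) rows, standing in for a Hive table.
# The data and schema are invented, not from the actual project.
rows = [("user1", 3), ("user2", 1), ("user1", 2), ("user2", 4)]

# Map emits (key, value) pairs; the shuffle sorts by key so that all
# records with the same key land adjacent at one reducer.
shuffled = sorted(rows, key=itemgetter(0))

# Reduce aggregates each key's values -- here SUM(plays) GROUP BY user.
result = {key: sum(v for _, v in grp)
          for key, grp in groupby(shuffled, key=itemgetter(0))}
```

Understanding this mapping explains, for example, why skewed keys make one reducer slow: all rows for a hot key are shuffled to the same reducer.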
Hadoop Developer
ScoringPoA Oct 2014 – April 2015
Client : Dun & Bradstreet
Description: Built one global solution that compares relative levels of risk irrespective of
geographic boundaries. This helps D&B build a single, globally consistent risk score for its
customers with cross-border needs that can go beyond their priority markets.
Responsibilities:
• Developed Hive scripts for end user / analyst requirements to perform ad hoc analysis
• Created workflows and scheduled jobs using Oozie
Hadoop Developer
Global Analytics Platform March 2013 – Sept 2014
Client : Nielsen (Walmart)
Description: The Global Analytical Platform of the Nielsen Retail team deals with customers'
buying insights. The GAP team comprises modules such as ETL, Data Model and Publishing
services. Our project covers the ETL module for use cases such as Pricing Insights and
Pricing Elasticity. The ETL team extracts data from multiple data sources, performs
pre-validation checks and, through source-to-target mappings, applies transformations to create
the stage tables. Finally we generate the target tables and provide them to the model team for
analytics. The entire data flow is triggered using Oozie, which calls the Hive scripts and a Java API.
Responsibilities:
• Involved in requirement gathering, analysis and design document creation
• Worked on a live 100+ nodes Hadoop cluster running CDH5.1
• Developed UDFs in Java in order to meet specific requirements in Hive
• Developed Hive scripts for end user / analyst requirements to perform ad hoc analysis
• Created workflows and scheduled jobs using Oozie
• Introduced shell scripts to extract the raw data in different formats and load the files
to the configured location in HDFS.
Hadoop Developer
Big Data Synchronization Sept 2012 – Feb 2013
Client : CommonWealth Bank of Australia
Description: The tool incorporates HDFS, Hive and HBase data backup. Details of the data to be
synced are kept in MySQL. Hive metastore replication makes use of MySQL replication. The tool
also provides a facility to merge and purge files.
Responsibilities:
• Involved in requirement gathering, design, development and testing document creation
• Implemented HDFS replication using DistCp
• Created an HBase replication solution using the HBase CopyTable utility
• Implemented Hive replication using MySQL replication
• Created workflows and scheduled jobs using Oozie
Hadoop Developer
M&S Performance Benchmarking Jun 2012 – Aug 2012
Client : Marks And Spencer
Description: The application is used to determine product affinity. Source data resides in DB2;
data is transferred to HDFS through Sqoop and the analysis is triggered. Visualisation is done
through Pentaho. The manufacturer gets an overview of the buying patterns of the products sold.
Responsibilities:
• Imported data from DB2 into HDFS using Sqoop
• Implemented Market Basket Analysis algorithm to analyse product affinity
• Designed and created Hive tables to load the resultant data
• Visualized the Hive table data using Pentaho
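The market-basket analysis mentioned above can be illustrated with a minimal pair-co-occurrence count; the baskets below are invented, and the actual algorithm and data are not reproduced here:

```python
from collections import Counter
from itertools import combinations

def pair_counts(baskets):
    """Count how often each pair of products is bought together.

    A simple affinity signal: frequently co-occurring pairs suggest
    products customers tend to buy in the same transaction.
    """
    counts = Counter()
    for basket in baskets:
        # Deduplicate and sort so each unordered pair has one canonical key.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts

# Invented example baskets standing in for transaction data.
baskets = [["bread", "milk"], ["bread", "milk", "jam"], ["milk", "jam"]]
affinity = pair_counts(baskets)
```

At production scale the same counting is typically done as a distributed job (e.g. MapReduce over transaction logs) rather than in memory.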
Hadoop Developer
Foundation Framework Mar 2012 – May 2012
Client : Internal
Description: Acts as a centralized repository of reusable utilities for big-data components.
Different Hadoop-based components can be integrated using this framework.
Responsibilities:
• Involved in requirement gathering, design, development and testing document creation
• Integrated the Hive migration solution into the framework
• Committed all the tested code to Git
Hadoop Developer
NSN Performance Benchmarking Jan 2012 – Mar 2012
Client : Nokia Siemens Network
Description: Developed a near-real-time platform to analyze customer events on large volumes of
data. It mainly incorporates ETL processing using Hive and Pig scripts, along with optimization
recommendations.
Responsibilities:
• Designed and created Hive tables
• Performed transformations using Hive
• Performed transformations using Pig
• Monitored and benchmarked the cluster using Ganglia
Hadoop Developer
Spark-POC
Client : Internal
Description: Log analysis is done using Spark Streaming and Spark SQL
Responsibilities:
• Implemented stream processing using the Spark Streaming API
• Used Spark SQL to analyse the data
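The micro-batch model behind the Spark Streaming log analysis can be sketched in plain Python. This is a conceptual stand-in, not the actual Spark code, and the log lines are invented:

```python
from collections import Counter

def micro_batches(stream, batch_size):
    """Group an incoming stream of lines into small fixed-size batches,
    mimicking how Spark Streaming discretizes a stream into micro-batches."""
    batch = []
    for line in stream:
        batch.append(line)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch

def count_levels(batch):
    """Aggregate one batch: count log levels, assuming the level is the
    first whitespace-separated token (an invented log format)."""
    return Counter(line.split(" ", 1)[0] for line in batch)

# Invented log lines standing in for the streamed input.
stream = ["ERROR disk full", "INFO started", "WARN slow", "ERROR timeout"]
per_batch = [count_levels(b) for b in micro_batches(stream, 2)]
```

In Spark Streaming the batching is driven by a time interval rather than a count, and each batch is processed as an RDD across the cluster; the per-batch aggregation shape is the same.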
Career Profile
Company Accenture
Designation Senior Analyst
Location Bangalore
Duration May 2015 – present
Company Cognizant Technologies
Designation Associate-Project
Location Cochin
Duration November 2014 - May 2015
Company Tata Consultancy Services Ltd
Designation Systems Engineer
Location Chennai, Cochin
Duration September 2011 – October 2014
Personal Details
Name Vipin KP
Date of Birth 30-Dec-1987
Nationality Indian
Sex Male
Marital Status Single
Positive Points Determined, sincere, good listener, leadership qualities
Passport Details K1297963
Educational Qualification
Degree Bachelor of Technology, June 2011
University Kerala University
Specialization Computer Science