Rajesh Kumar CH Email: rchodabathula89@gmail.com
Mobile: 7075616979
Summary:
• Technically accomplished professional with 2 years of Big Data ecosystem experience in the ingestion, storage, querying, processing, and analysis of Big Data.
• In-depth knowledge and hands-on experience with Apache Hadoop ecosystem components such as HDFS, MapReduce, Hive (HiveQL), HBase, Pig, Sqoop, Oozie, Flume, and Spark.
• Hands-on experience in writing MapReduce programs and Pig and Hive scripts.
• Designed and created Hive external tables using a shared metastore (rather than the embedded Derby database) with partitioning, dynamic partitioning, and bucketing; a brief illustrative sketch appears after this summary.
• Extending Hive and Pig core functionality by writing custom UDFs.
• Experience in importing and exporting data between relational database systems and HDFS using Sqoop.
• Experience in building Pig scripts to extract, transform and load data onto HDFS for processing.
• Experience in writing HiveQL queries to store processed data into Hive tables for analysis.
• Excellent understanding and knowledge of NoSQL databases such as HBase and MongoDB.
• Expertise in writing Stored Procedures, Functions, DDL, DML, SQL queries.
• Good knowledge of Spark and its RDD programming model.
• Good exposure to Scala and R programming.
• Worked extensively with Cloudera distributions CDH3 and CDH4.
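The following is a minimal HiveQL sketch of the external-table design described above; the table, columns, and HDFS paths are illustrative placeholders rather than actual project artifacts.

  -- External table over data already landed in HDFS, partitioned and bucketed
  CREATE EXTERNAL TABLE IF NOT EXISTS customer_events (
      customer_id BIGINT,
      event_type  STRING,
      amount      DOUBLE
  )
  PARTITIONED BY (event_date STRING)
  CLUSTERED BY (customer_id) INTO 16 BUCKETS
  ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
  STORED AS TEXTFILE
  LOCATION '/data/raw/customer_events';

  -- Dynamic partitioning: Hive derives event_date from the last column of the SELECT
  SET hive.exec.dynamic.partition = true;
  SET hive.exec.dynamic.partition.mode = nonstrict;
  SET hive.enforce.bucketing = true;
  INSERT OVERWRITE TABLE customer_events PARTITION (event_date)
  SELECT customer_id, event_type, amount, event_date
  FROM staging_customer_events;  -- staging table is assumed to exist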
Technical Skills
Hadoop/Big Data Technologies: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, HBase, Oozie, Zookeeper
Programming Languages: Java, SQL, PL/SQL
Operating Systems: UNIX, Windows, Linux
Application Servers: IBM WebSphere, Tomcat
Databases: Oracle, MySQL
Java IDEs: Eclipse, IBM RAD
Professional Experience:
Hadoop Developer - Cognizant Technology Solutions - Jan 2015 – Present
KeyBank is an American regional bank headquartered in Cleveland, Ohio. As of 2013, it is the 22nd
largest bank in the United States based on total deposits. It is the 29th largest bank in the United States
by total assets. This project focuses on building business-focused Big Data and analytics solutions around large-scale, distributed software systems and data analytics.
Responsibilities:
• Worked on analyzing data in the Hadoop cluster using big data analytics tools including Pig, Hive, and MapReduce.
• Developed data pipelines using Flume and Sqoop to ingest customer behavioral data and purchase histories into HDFS for analysis.
• Used Pig to validate the data ingested via Sqoop and Flume; the cleansed data set was then pushed into HBase.
• Responsible for creating Hive tables, loading the structured output of MapReduce jobs into them, and writing Hive queries to further analyze the data per client requirements (a brief illustrative sketch follows this list).
• Involved in running MapReduce jobs for processing millions of records.
• Developed Hive queries and Pig scripts to analyze large datasets.
• Involved in importing and exporting the data from RDBMS to HDFS and vice versa using Sqoop.
• Involved in generating ad-hoc reports using Pig and Hive queries.
• Used Hive to analyze data ingested into HBase via Hive-HBase integration and computed various metrics for reporting on the dashboard (see the second sketch following this list).
• Loaded the aggregated data from the Hadoop environment into Oracle using Sqoop for reporting on the dashboard.
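As referenced above, a minimal, hypothetical HiveQL sketch of creating a Hive table over MapReduce output and querying it; the table name, columns, and HDFS path are placeholders, not actual project artifacts.

  -- Table holding the structured output of a MapReduce job
  CREATE TABLE IF NOT EXISTS purchase_history (
      customer_id BIGINT,
      product_id  STRING,
      amount      DOUBLE,
      purchase_ts STRING
  )
  ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
  STORED AS TEXTFILE;

  -- Load the MapReduce output directory from HDFS (path is a placeholder)
  LOAD DATA INPATH '/user/hadoop/output/purchases' INTO TABLE purchase_history;

  -- Example analysis query: total spend per customer
  SELECT customer_id, SUM(amount) AS total_spend
  FROM purchase_history
  GROUP BY customer_id;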
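And a hedged sketch of the Hive-HBase integration pattern mentioned above, using Hive's standard HBaseStorageHandler; the table, column family, and metric names are illustrative assumptions.

  -- Hive table mapped onto an HBase table via the HBase storage handler
  CREATE EXTERNAL TABLE customer_metrics_hbase (
      rowkey      STRING,
      total_spend DOUBLE,
      visit_count BIGINT
  )
  STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
  WITH SERDEPROPERTIES (
      'hbase.columns.mapping' = ':key,m:total_spend,m:visit_count'
  )
  TBLPROPERTIES ('hbase.table.name' = 'customer_metrics');

  -- Dashboard metrics can then be computed with ordinary HiveQL
  SELECT rowkey, total_spend
  FROM customer_metrics_hbase
  ORDER BY total_spend DESC
  LIMIT 100;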
Environment: Linux, HDFS, MapReduce, Hive, Java, Pig, Sqoop, Flume, Zookeeper, Oozie, Oracle, HBase.
Hadoop Developer - Cognizant Technology Solutions - July 2014 to Jan 2015
HealthCare POC: This project deals with unstructured patient information such as patient ID, name, age, gender, hospital, type of disease, and patient address. The data is collected and processed with MapReduce and Pig, structured into Hive tables according to age, type of disease, and admission date, and the summarized results are stored in Hive external tables. The resultant datasets are exported to an RDBMS (MySQL) using Sqoop.
• Involved in loading data from the Unix file system into HDFS in different data formats (Avro, Parquet).
• Developed MapReduce code for analyzing XML and JSON data.
• Transformed data from unstructured to structured format using Hive functions, SerDes, and regular expressions (an illustrative sketch follows this list).
• Developed Hive UDFs to extend core functionality.
• Integrated Hive and HBase to perform CRUD operations.
• Loaded the resultant tables into Hive External tables for querying.
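As a hedged illustration of the SerDe / regular-expression approach referenced above, a minimal HiveQL sketch; the record layout, regex pattern, and paths are hypothetical and not the actual POC schema.

  -- Parse pipe-delimited, semi-structured patient records with the built-in RegexSerDe
  CREATE EXTERNAL TABLE IF NOT EXISTS patient_records_raw (
      patient_id STRING,
      name       STRING,
      age        STRING,
      disease    STRING
  )
  ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
  WITH SERDEPROPERTIES (
      'input.regex' = '(\\d+)\\|([^|]*)\\|(\\d+)\\|([^|]*)'
  )
  LOCATION '/data/raw/patients';

  -- Structured external summary table for downstream querying
  CREATE EXTERNAL TABLE IF NOT EXISTS patient_summary (
      disease       STRING,
      patient_count BIGINT
  )
  STORED AS TEXTFILE
  LOCATION '/data/curated/patient_summary';

  INSERT OVERWRITE TABLE patient_summary
  SELECT disease, COUNT(*) FROM patient_records_raw GROUP BY disease;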
Environment: Linux, HDFS, Hive, HBase, Pig, Sqoop, Flume, MySQL.
Education
• Bachelor of Technology, Computer Science & Engineering, Bapatla Engineering College, India, 2010-2014.
