Sam has over 8 years of experience in big data technologies including Hadoop, Spark, Hive, Pig, Kafka and Cassandra. He has expertise in data ingestion, processing, ETL and analytics. Some of his responsibilities include developing MapReduce programs, Spark applications, data pipelines, streaming jobs and visualizations. He is proficient in Java, Python, Scala and various big data tools.
This document provides a detailed summary of Poorna Chandra Rao Kommana's professional experience and technical skills. It outlines his 8 years of experience in big data technologies including Hadoop, Hive, Pig, Spark, Kafka and AWS services. It details his roles and responsibilities in building scalable big data solutions, developing ETL pipelines, performing data analysis, and optimizing performance. His skills include Java, Python, SQL, Pig Latin, HiveQL, and tools like Eclipse and PyCharm.
Sanath Pabba has over 5 years of experience working with big data technologies like Hadoop, Spark, Hive, Pig, Kafka and NoSQL databases. He has expertise in data extraction, transformation and loading processes. Some of his responsibilities include writing Sqoop and Spark jobs to load and prepare data, developing automation scripts to monitor cluster utilization, and implementing validation rules for data quality. He has worked on various projects involving data warehousing, reporting, stream processing and analytics using technologies like SQL Server, Hive and Spark.
Hadoop World 2011: Hadoop and RDBMS with Sqoop and Other Tools - Guy Harrison, Cloudera, Inc.
The document discusses integrating Hadoop with relational databases. It describes scenarios where reference data is stored in an RDBMS and used in Hadoop, Hadoop is used for offline analytics on data stored in an RDBMS, and exporting MapReduce outputs to an RDBMS. It then presents a case study on extending SQOOP for optimized Oracle integration and compares performance with and without the extension. Other tools for Hadoop-RDBMS integration are also briefly outlined.
This document provides an overview of big data architecture, the Hadoop ecosystem, and NoSQL databases. It discusses common big data use cases, characteristics, and tools. It describes the typical 3-tier traditional architecture compared to the big data architecture using Hadoop. Key components of Hadoop like HDFS, MapReduce, Hive, Pig, Avro/Thrift, HBase are explained. The document also discusses stream processing tools like Storm, Spark and real-time query with Impala. It notes how NoSQL databases can integrate with Hadoop/MapReduce for both batch and real-time processing.
The document provides a summary of a senior big data consultant with over 4 years of experience working with technologies such as Apache Spark, Hadoop, Hive, Pig, Kafka and databases including HBase, Cassandra. The consultant has strong skills in building real-time streaming solutions, data pipelines, and implementing Hadoop-based data warehouses. Areas of expertise include Spark, Scala, Java, machine learning, and cloud platforms like AWS.
• Capable of processing large sets of structured, semi-structured and unstructured data and supporting system architecture.
• Implemented proofs of concept on the Hadoop stack and different big data analytic tools, and migrated data from different databases to Hadoop.
• Developed multiple MapReduce jobs in Java for data cleaning and pre-processing according to the business requirements; imported and exported data into HDFS and Hive using Sqoop.
He has experience writing Hive queries and Pig scripts.
M.V. Rama Kumar has 3 years of experience in application development using Java and big data technologies like Hadoop. He has 1.6 years of experience using Hadoop components such as HDFS, MapReduce, Pig, Hive, Sqoop, HBase and Oozie. He has extensive experience setting up Hadoop clusters and processing large, structured and unstructured data.
Apache Hadoop is a framework for distributed storage and processing of large datasets across clusters of commodity hardware. It provides HDFS for distributed file storage and MapReduce as a programming model for distributed computations. Hadoop includes other technologies like YARN for resource management, Spark for fast computation, HBase for NoSQL database, and tools for data analysis, transfer, and security. Hadoop can run on-premise or in cloud environments and supports analytics workloads.
The document provides an overview of Apache Hadoop and related big data technologies. It discusses Hadoop components like HDFS for storage, MapReduce for processing, and HBase for columnar storage. It also covers related projects like Hive for SQL queries, ZooKeeper for coordination, and Hortonworks and Cloudera distributions.
The document provides a professional summary and details for Madhusudhn Reddy.Gujja including 3 years of experience in big data tools like Hadoop, Hive, Pig and Spark. He has extensive experience developing Pig Latin scripts, writing MapReduce programs in Java, and loading/transforming large datasets. He is proficient in technologies such as HDFS, HBase, Kafka, Flume, Impala and has worked on projects involving data analytics, ETL processes and clustering Hadoop.
This document contains a resume for Ramez Rangrez, a Hadoop Administrator. It includes his contact information, skills, work experience configuring and deploying Hadoop clusters for CloudAge in Pune, India, education including a BE in Electronics and Telecommunication, and personal details. His responsibilities at CloudAge involved configuring, deploying, maintaining, and troubleshooting Hadoop clusters on AWS and performing tasks like capacity planning, performance tuning, and documentation. He contributed to a customer churn analysis project using a cloud-based system combining data mining, social network analysis, and statistics analysis on telecom data.
The document provides an overview of big data technologies including Hadoop, MapReduce, HDFS, Hive, Pig, Sqoop, HBase, MongoDB, and Cassandra. It discusses how these technologies enable processing and analyzing very large datasets across commodity hardware. It also outlines the growth and market potential of the big data sector, which is expected to reach $48 billion by 2018.
Prashanth Shankar Kumar has over 8 years of experience in data analytics, Hadoop, Teradata, and mainframes. He currently works as a Hadoop Developer/Tech Lead at Bank of America where he develops Hive queries, Impala queries, MapReduce programs, and Oozie workflows. Previously he worked as a Hadoop Developer at State Farm Insurance where he installed and managed Hadoop clusters and developed solutions using Hive, Pig, Sqoop, and HBase. He has expertise in Teradata, SQL, Java, Linux, and agile methodologies.
The document discusses and compares MapReduce and relational database management systems (RDBMS) for large-scale data processing. It describes several hybrid approaches that attempt to combine the scalability of MapReduce with the query optimization and efficiency of parallel RDBMS. HadoopDB is highlighted as a system that uses Hadoop for communication and data distribution across nodes running PostgreSQL for query execution. Performance evaluations show hybrid systems can outperform pure MapReduce but may still lag specialized parallel databases.
Hadoop is a framework that allows businesses to analyze vast amounts of data quickly and at low cost by distributing processing across commodity servers. It consists of two main components: HDFS for data storage and MapReduce for processing. Learning Hadoop requires familiarity with Java, Linux, and object-oriented programming principles. The document recommends getting hands-on experience by installing a Cloudera Distribution of Hadoop virtual machine or package to become comfortable with the framework.
Mopuru Babu has over 9 years of experience in software development using Java technologies and 3 years experience in Hadoop development. He has extensive experience designing, developing, and deploying multi-tier and enterprise-level distributed applications. He has expertise in technologies like Hadoop, Hive, Pig, Spark, and frameworks like Spring and Struts. He has worked on both small and large projects for clients in various industries.
Introduction to Hadoop, HBase, and NoSQL - Nick Dimiduk
The document is a presentation on NoSQL databases given by Nick Dimiduk. It begins with an introduction of the speaker and their background. The presentation then covers what NoSQL is not, the motivations for NoSQL databases, an overview of Hadoop and its components, and a description of HBase as a structured, distributed database built on Hadoop.
This document contains a resume for Ramez Rangrez, who is a Hadoop Administrator with over 5 years of experience in configuring, deploying, maintaining, and monitoring Hadoop clusters. He has worked at Persistent Systems Limited since 2016 and CloudAge from 2013 to 2016. His skills include SQL, Hadoop ecosystem tools, Linux, AWS, and he is working towards Cloudera certification. His responsibilities have involved Hadoop administration, troubleshooting issues, capacity planning, and performance tuning. He also has experience with a customer churn analysis project using Hadoop and cloud systems.
This document provides an overview and comparison of RDBMS, Hadoop, and Spark. It introduces RDBMS and describes its use cases such as online transaction processing and data warehouses. It then introduces Hadoop and describes its ecosystem including HDFS, YARN, MapReduce, and related sub-modules. Common use cases for Hadoop are also outlined. Spark is then introduced along with its modules like Spark Core, SQL, and MLlib. Use cases for Spark include data enrichment, trigger event detection, and machine learning. The document concludes by comparing RDBMS and Hadoop, as well as Hadoop and Spark, and addressing common misconceptions about Hadoop and Spark.
Fundamentals of Big Data, Hadoop project design and case study or Use case
General planning consideration and most necessaries in Hadoop ecosystem and Hadoop projects
This will provide the basis for choosing the right Hadoop implementation, Hadoop technologies integration, adoption and creating an infrastructure.
Building applications using Apache Hadoop with a use-case of WI-FI log analysis has real life example.
TDWI Accelerate Seattle, Oct 16, 2017: Distributed and In-Database Analytics - Debraj GuhaThakurta
Event: TDWI Accelerate Seattle, October 16, 2017
Topic: Distributed and In-Database Analytics with R
Presenter: Debraj GuhaThakurta
Description: How to develop scalable and in-DB analytics using R in Spark and SQL-Server
Overview of Big Data, Hadoop and Microsoft BI - version 1 - Thanh Nguyen
Big Data and advanced analytics are critical topics for executives today. But many still aren't sure how to turn that promise into value. This presentation provides an overview of 16 examples and use cases that lay out the different ways companies have approached the issue and found value: everything from pricing flexibility to customer preference management to credit risk analysis to fraud protection and discount targeting. For the latest on Big Data & Advanced Analytics: http://mckinseyonmarketingandsales.com/topics/big-data
A quick brief about "What is Hadoop".
I didn't explain Hadoop in detail, but reading these slides will give you insight into Hadoop and core product usage. This document will be most useful for PMs, newcomers and technical architects entering cloud computing.
This document provides an overview of an advanced Big Data hands-on course covering Hadoop, Sqoop, Pig, Hive and enterprise applications. It introduces key concepts like Hadoop and large data processing, demonstrates tools like Sqoop, Pig and Hive for data integration, querying and analysis on Hadoop. It also discusses challenges for enterprises adopting Hadoop technologies and bridging the skills gap.
This document summarizes Andrew Brust's presentation on using the Microsoft platform for big data. It discusses Hadoop and HDInsight, MapReduce, using Hive with ODBC and the BI stack. It also covers Hekaton, NoSQL, SQL Server Parallel Data Warehouse, and PolyBase. The presentation includes demos of HDInsight, MapReduce, and using Hive with the BI stack.
http://bit.ly/1BTaXZP – Hadoop has been a huge success in the data world. It’s disrupted decades of data management practices and technologies by introducing a massively parallel processing framework. The community and the development of all the Open Source components pushed Hadoop to where it is now.
That's why the Hadoop community is excited about Apache Spark. The Spark software stack includes a core data-processing engine, an interface for interactive querying, Spark Streaming for streaming data analysis, and growing libraries for machine learning and graph analysis. Spark is quickly establishing itself as a leading environment for fast, iterative in-memory and streaming analysis.
This talk will give an introduction to the Spark stack, explain how Spark achieves its lightning-fast results, and show how it complements Apache Hadoop.
Keys Botzum - Senior Principal Technologist with MapR Technologies
Keys is Senior Principal Technologist with MapR Technologies, where he wears many hats. His primary responsibility is interacting with customers in the field, but he also teaches classes, contributes to documentation, and works with engineering teams. He has over 15 years of experience in large scale distributed system design. Previously, he was a Senior Technical Staff Member with IBM, and a respected author of many articles on the WebSphere Application Server as well as a book.
Brief Introduction about Hadoop and Core Services - Muthu Natarajan
I have given a quick introduction to Hadoop, Big Data, Business Intelligence and the other core services and programs involved in using Hadoop as a successful tool for Big Data analysis.
My understanding of Big Data:
"Data" becomes "information", but big data takes information to "knowledge", "knowledge" becomes "wisdom", and "wisdom" turns into "business" or "revenue" - all if you use it promptly and in a timely manner.
The report discusses the key components and objectives of HDFS, including data replication for fault tolerance, HDFS architecture with a NameNode and DataNodes, and HDFS properties like large data sets, write once read many model, and commodity hardware. It provides an overview of HDFS and its design to reliably store and retrieve large volumes of distributed data.
Rajesh R is seeking a position as an efficient developer to contribute to organizational progress while enhancing his technical skills. He has over 20 years of experience in roles such as service engineer, customer support engineer, desktop support engineer, hardware and networking engineer, and sales manager. He has skills in hardware installation, operating system installation, networking, and security systems. He holds an I.T.I in Electronics, a Diploma in Hardware & Networking, and a B.C.A.
The document is a resume for a technical writer with over 10 years of experience writing technical documentation for software and hardware products. It summarizes two roles, the current one as a lead engineer at HCL Technologies since 2011 where responsibilities include analyzing requirements, writing documentation, and reviewing documents, and a prior role as a technical writer from 2009-2010 where duties involved documenting engineering patents. It also lists education, training, software skills, and personal details.
Leslie Diaz is a Service Desk Manager with over 10 years of experience in customer service and IT support. She is proficient in ServiceDesk, MS Office, Microsoft Exchange, Lync, Windows Server 2003, and networked applications. As Service Desk Manager since 2012, she has raised service standards, empowered her team, and assisted with two acquisitions. Previously she was an IT Project Manager and Senior Analyst at Dell where she troubleshot issues and identified ways to improve the customer experience.
Santhosh Kumar S.J. is seeking a progressive and responsible position where he can utilize his MBA and experience in HR recruiting. He has over 4 months of experience as an HR recruiter sourcing candidates through online job portals, social media, and references. His skills include end-to-end recruitment processes, types of recruitment, payroll processes, and administrative activities. He is proficient in MS Office and has completed academic projects in customer satisfaction, employee training, and industrial training.
Shivakumara D R has over 7 years of experience as a Senior Software Engineer developing embedded systems and applications using C, C++, and other programming languages. He has expertise in areas such as PCB design, IoT product development, and working with tools like KEIL, MPLAB, and Eclipse. His background includes projects in CNC application development, IoT hotel automation, and power interface unit development for telecom sites.
Sindhujha Gopi has over 7 years of experience working with Ariba Spend Management Suite, including extensive experience with Ariba Buyer 8.2 and 9r1. She has worked as a Production Support Lead and Offshore Support Lead on projects for Cummins Inc., providing level 3 technical support. Her responsibilities have included requirement gathering, design, development, testing, and production implementation. She is proficient in Ariba modules, languages like AML and SQL, and tools like Remedy and Appworx.
This document provides a summary of R.HariKrishna's professional experience and skills. He has over 4 years of experience developing software using technologies like Java, Scala, Hadoop and NoSQL databases. Some of his key projects involved developing real-time analytics platforms using Spark Streaming, Kafka and Cassandra to analyze sensor data, and using Hadoop, Hive and Pig to perform predictive analytics on server logs and calculate production credit reports by analyzing banking transactions. He is proficient in MapReduce, Pig, Hive, HDFS and has skills in machine learning technologies like Mahout.
The document provides a summary of an individual's professional experience working with big data technologies like Hadoop, Spark, Scala, Java, and AWS. It details over 9 years of experience in areas including data engineering, ETL processes, batch and stream data processing, working with technologies such as HDFS, YARN, Hive, Impala, Kafka, and databases like Oracle, MySQL, and MongoDB. Specific experiences are listed from roles at Morgan Stanley, ECA, and BMO Harris Bank involving data ingestion, transformation, analytics and reporting using Hadoop ecosystems.
This document contains Anil Kumar's resume. It summarizes his contact information, professional experience working with Hadoop and related technologies like MapReduce, Pig, and Hive. It also lists his technical skills and qualifications, including being a MapR certified Hadoop Professional. His work experience includes developing MapReduce algorithms, installing and configuring MapR Hadoop clusters, and working on projects for clients like Pfizer and American Express involving data analytics using Hadoop, Spark, and Hive.
Monika Raghuvanshi is seeking a position as a Hadoop Administrator where she can apply her 7 years of experience in Hadoop and Unix administration. She has expertise in installing, configuring, and maintaining Hadoop clusters as well as ensuring security through Kerberos and SSL. She is proficient in Linux, networking, programming languages, and databases. Her experience includes projects with Barclays, GE Healthcare, Ontario Ministry of Transportation, and Nortel where she administered Hadoop and Unix systems.
Rajeev Kumar - Apache Spark & Scala Developer
Rajeev Kumar is an experienced Apache Spark and Scala developer based in Amsterdam, NL. He has over 8 years of experience working with big data technologies like Apache Spark, Scala, Java, Hadoop, and data integration tools. He is proficient in processing large structured and unstructured datasets to identify patterns and gain insights. His experience includes designing and developing Spark applications using Scala, ETL processes, data warehousing, and working with technologies like Hive, HDFS, MapReduce, Sqoop, Kafka and more.
Arindam Sengupta has over 17 years of experience in architecting, developing, implementing, and customizing client-server and web-based applications. He has extensive experience with technologies like Hadoop, HDFS, MapReduce, Pig, Hive, HBase, Spark, Java, Oracle, SQL Server, and IBM DB2. Some of his recent projects involve designing Hadoop-based solutions for data ingestion, analytics, and visualization using technologies like Flume, Sqoop, HBase, MapReduce, Spark, and REST services.
This document provides a curriculum vitae for Mr. Yuvaraj Mani, who has over 13 years of experience in big data, SQL Server, Oracle, and data warehousing technologies. It lists his professional qualifications and certifications, industry experience, and highlights of several projects between 2014-2016 where he designed architectures and implemented solutions using technologies like Hadoop, HDFS, Hive, Pig, Spark, and Sqoop. Responsibilities included ETL processing, distributed storage and computing, performance optimization, and using Agile methodologies.
Spark is a big data processing framework built in Scala that runs on the JVM. It provides speed, generality, ease of use, and accessibility for processing large datasets. Spark features include working directly on memory for speed, supporting MapReduce, lazy evaluation of queries for optimization, and APIs for Scala, R and Python. It includes Spark Streaming for real-time data, Spark SQL for SQL queries, and MLlib for machine learning. Resilient Distributed Datasets (RDDs) are Spark's fundamental data structure, and MapReduce is a programming model used for processing large amounts of data in parallel.
M. Manikyam is a software engineer with over 5 years of experience developing solutions for big data problems using Apache Spark, Hadoop, and related technologies. He has extensive hands-on experience building real-time data streaming pipelines and analytics applications on large datasets. Some of his responsibilities have included developing Spark Streaming applications, integrating Hive and HBase, and administering Hadoop clusters. He is looking for new opportunities to innovate and improve software products.
Owez Mujawar is a Hadoop Administrator with over 5 years of experience in configuring, deploying, and maintaining Hadoop clusters on AWS and private clouds. He has expertise in Apache Hadoop, Cloudera Enterprise, HDFS, YARN, Pig, Hive, Oozie, and Sqoop. Some of his responsibilities include setting up and customizing Hadoop clusters, high availability, performance tuning, and troubleshooting issues. He has worked with Vodafone and CloudAge BigData on large-scale data processing and analytics projects involving customer data.
Senior Systems Engineer at Infosys with 2.4 years of experience in Big Data & Hadoop - Abinash Bindhani
Abinash Bindhani is seeking a position as a Hadoop developer where he can utilize over 2 years of experience with Hadoop and Java technologies. He currently works as a senior systems engineer at Infosys where he has gained experience migrating data from Oracle to Hadoop platforms and collecting/analyzing log data using tools like Flume, Pig, and Hive. His technical skills include MapReduce, HBase, HDFS, Java, Spring, MySQL, and Apache Tomcat. He has expertise in Hadoop architecture, cluster concepts, and each phase of the software development life cycle.
Shubham, 7.5+ years exp, MCP, MapR-Spark-Hive-BI-ETL-Azure Data Engineer-ML - Shubham Mallick
Shubham has over 7 years of experience in data analytics and engineering. He has extensive experience with technologies like MapR-Hadoop, Spark, Python, Hive, Kafka and machine learning algorithms. He is currently a senior data analyst where he builds data pipelines and analytics solutions. Previously he has led teams and taken on roles with responsibilities like requirement gathering, data modeling, ETL development, database administration and cloud migrations. He is pursuing an M.Tech in data science and has received several awards and certifications for his work.
Vijay Muralidharan has over 5 years of experience as a Big Data Engineer working with Hadoop, Spark, Hive, Pig and other big data tools. He has a Master's in Cloud Computing and is a certified Hadoop Administrator. The document provides details of his skills, work experience implementing and managing Hadoop clusters, and education background working on projects related to performance evaluation of distributed file systems.
Cassandra Lunch #89: Semi-Structured Data in Cassandra - Anant Corporation
In Cassandra Lunch #89, we will discuss how to store and parse semi-structured data in Cassandra using Spark
Big Data Hoopla Simplified - TDWI Memphis 2014 - Rajan Kanitkar
The document provides an overview and quick reference guide to big data concepts including Hadoop, MapReduce, HDFS, YARN, Spark, Storm, Hive, Pig, HBase and NoSQL databases. It discusses the evolution of Hadoop from versions 1 to 2, and new frameworks like Tez and YARN that allow different types of processing beyond MapReduce. The document also summarizes common big data challenges around skills, integration and analytics.
This document contains the resume of Hassan Qureshi. He has over 9 years of experience as a Hadoop Lead Developer with expertise in technologies like Hadoop, HDFS, Hive, Pig and HBase. Currently he works as the technical lead of a data engineering team developing insights from data. He has extensive hands-on experience installing, configuring and maintaining Hadoop clusters in different environments.
- Gubendran Lakshmanan has over 13 years of experience in IT with expertise in Java/J2EE, AWS cloud, big data, and distributed systems.
- He has extensive experience designing, developing, and implementing applications using technologies like Java, Spring, Hibernate, AWS, Hadoop, HDFS, HBase, and NoSQL databases.
- He has worked as a senior developer and technical lead on several projects in domains like online advertising, energy, and food automation.
Vishnu has over 5 years of experience in application development using Java and big data technologies like Hadoop. He has worked on projects involving web application development, data analytics using Hadoop components like HDFS, MapReduce, Pig and Hive. His skills include Java, J2EE, databases, version control and he has experience developing applications for both web and mobile. He is currently working as a Hadoop developer at Capgemini.
This document summarizes HadoopDB, a system for building real-world applications on Hadoop. It discusses HadoopDB's architecture and components like the database connector, data loader, and catalog. It then provides two example applications - a semantic web application for biological data analysis and a business data warehousing application. The document demonstrates how to load sample datasets for each application into HadoopDB and execute sample queries on the data, including visualizing the query execution flow and demonstrating fault tolerance.
SAM
Samk.bigdata@gmail.com
614-664-6543
Professional Summary:
• 8+ years of overall experience in financial, marketing and enterprise application development across diverse industries, including hands-on experience with Big Data ecosystem technologies.
• 3+ years of data analytics experience on Apache Hadoop, Cloudera and Hortonworks distributions.
• Expertise in core Hadoop and the Hadoop technology stack, which includes HDFS, MapReduce, Oozie, Hive, Sqoop, Pig, Flume, HBase, Spark, Storm, Kafka and ZooKeeper.
• Experience in the AWS cloud environment, including S3 storage, EC2 instances and deployments on them.
• In-depth knowledge of statistics, machine learning and data mining.
• Well versed in installing, configuring, supporting and managing Big Data platforms and the underlying infrastructure of a Hadoop cluster.
• Experienced in implementing complex algorithms on semi-structured and unstructured data using MapReduce programs.
• Experienced in working with structured data using HiveQL, join operations, Hive UDFs, partitions, bucketing and internal/external tables.
• Experienced in migrating ETL-style operations to Pig transformations, operations and UDFs.
• Good knowledge of Python.
• Designed and developed ETL processes to extract and load data from legacy systems using the Talend data integration tool.
• Built Spark Streaming jobs that collect data from Kafka in near real time, perform the necessary transformations and aggregations on the fly to build the common learner data model, and persist the data in a NoSQL store (HBase).
• Specialization in data ingestion, processing and development from various RDBMS data sources into a Hadoop cluster using MapReduce, Pig, Hive and Sqoop.
• Configured different topologies for the Storm cluster and deployed them on a regular basis.
• Experienced in implementing a unified data platform to collect data from different sources using Apache Kafka brokers, clusters, and Java producers and consumers.
• Excellent working knowledge of Spark Core, Spark SQL and Spark Streaming using Scala.
• Experienced in working with in-memory processing frameworks such as Spark, including Spark transformations, Spark SQL and Spark Streaming.
• Experienced in providing user-based recommendations by implementing collaborative filtering and matrix factorization, along with classification techniques such as random forest, SVM and k-NN, using the Spark MLlib library (a minimal sketch follows this summary).
• Excellent understanding and knowledge of NoSQL databases such as HBase, Cassandra and MongoDB, as well as Teradata and data warehousing.
• Installed and configured Cassandra, with good knowledge of Cassandra architecture, read and write paths, and querying.
• Implemented frameworks using Java and Python to automate the ingestion flow.
• Involved in NoSQL (DataStax Cassandra) database design, integration and implementation, and wrote scripts and invoked them using CQLSH.
• Involved in data modeling in Cassandra and in implementing sharding and replication strategies in MongoDB.
• Developed a fan-out workflow using Flume to ingest data from various data sources such as web servers and REST APIs, using different sources and ingesting data into Hadoop with an HDFS sink.
• Experienced in implementing custom interceptors and serializers in Flume for specific customer requirements.
• Monitored log input from several datacenters via Spark Streaming; the data was analyzed in Apache Storm, then parsed and saved into Cassandra.
• Experience in importing and exporting data using Sqoop between HDFS and relational database systems such as MySQL and SQL Server.
• Involved in database design, including ER diagrams and database normalization (3NF).
• Experience in developing strategies for Extraction, Transformation and Loading (ETL) of data from various sources into data warehouses and data marts using Informatica.
• Excellent understanding and knowledge of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and the MapReduce programming paradigm.
• Good exposure to Apache Hadoop MapReduce programming, Pig scripting, distributed applications and HDFS.
• Experience in managing Hadoop clusters using the Cloudera Manager tool.
• Very good experience with the complete project life cycle (design, development, testing and implementation) of client-server and web applications.
• Developed Tableau dashboards with combination charts for clear understanding.
• Improved report performance in Tableau using extracts and context filters.
• Worked on cluster coordination services through ZooKeeper.
• Actively involved in coding using core Java and Collections APIs such as Lists, Sets and Maps.
• Hands-on experience in application development using Java, RDBMS and Linux shell scripting.
• Experience with different operating systems, including UNIX, Linux and Windows.
• Experience with Java multithreading, collections, interfaces, synchronization and exception handling.
• Involved in writing PL/SQL stored procedures, triggers and complex queries.
• Worked in an Agile environment with active Scrum participation.
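The MLlib recommendation work mentioned above can be illustrated with a minimal, hypothetical sketch of ALS-based collaborative filtering. It is written in Java for consistency with the rest of this document (the resume notes Scala was also used); the input path, rank, iteration count and regularization value are illustrative assumptions, not project values.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.recommendation.ALS;
import org.apache.spark.mllib.recommendation.MatrixFactorizationModel;
import org.apache.spark.mllib.recommendation.Rating;

public class UserRecommendations {
  public static void main(String[] args) {
    JavaSparkContext sc = new JavaSparkContext(new SparkConf().setAppName("user-recommendations"));

    // Ratings file with "userId,itemId,rating" lines; the path is a placeholder.
    JavaRDD<Rating> ratings = sc.textFile("hdfs:///data/ratings.csv").map(line -> {
      String[] parts = line.split(",");
      return new Rating(Integer.parseInt(parts[0]),
                        Integer.parseInt(parts[1]),
                        Double.parseDouble(parts[2]));
    });

    // Matrix factorization: rank 10, 10 iterations, regularization 0.01 (all illustrative).
    MatrixFactorizationModel model = ALS.train(JavaRDD.toRDD(ratings), 10, 10, 0.01);

    // Top 5 item recommendations for one user.
    for (Rating r : model.recommendProducts(42, 5)) {
      System.out.println(r.product() + " -> " + r.rating());
    }
    sc.close();
  }
}
```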
Technical Skills:
Hadoop/Big Data: HDFS, MapReduce, HBase, Pig, Hive, Sqoop, MongoDB, Cassandra, Flume, Oozie, ZooKeeper, AWS, Spark, Kafka, Teradata, Storm, ETL, Informatica, Tableau, Talend, Scala
Java & J2EE Technologies: Core Java, Servlets, JSP, JDBC, JavaBeans, Maven, Gradle, JUnit, TestNG
IDEs: Eclipse, NetBeans, IntelliJ IDEA
Frameworks: MVC, Struts, Hibernate, Spring
Programming Languages: C, C++, Java, Python, Ant scripts, Linux shell scripts
Databases: Oracle 11g/10g/9i, MySQL, DB2, MS SQL Server
Web Servers: WebLogic, WebSphere, Apache Tomcat
Web Technologies: HTML, XML, JavaScript, AJAX, SOAP, WSDL, JAX-RS, RESTful, JAX-WS
Network Protocols: TCP/IP, UDP, HTTP, DNS, DHCP
Version Control: CVS, SVN, Git
Work Experience:
Client: CVS, Greensboro, NC                                   Feb '15 to date
Role: Hadoop Developer
Responsibilities:
• Worked on analyzing the Hadoop cluster and different big data analytic tools, including Pig, the HBase database, Sqoop, Cassandra, ZooKeeper and AWS.
• Evaluated business requirements and prepared detailed specifications that follow project guidelines required to develop written programs.
• Responsible for building scalable distributed data solutions using Hadoop.
• Analyzed large data sets to determine the optimal way to aggregate and report on them using MapReduce programs.
• Implemented MapReduce programs to retrieve top-K results from unstructured data sets.
• Migrated various Hive UDFs and queries into Spark SQL for faster requests as part of a POC implementation using Scala (a sketch of the UDF migration pattern follows this list).
• Involved in database design, including ER diagrams and database normalization (3NF).
• Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
• Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS and extracted the data from HDFS to MySQL and SQL Server using Sqoop.
• Followed Agile methodology (Scrum stand-ups, sprint planning, sprint review, sprint showcase and sprint retrospective meetings).
• Exported the analyzed data to relational databases such as Oracle and MySQL using Sqoop for visualization and to generate reports for the BI team.
• Experience in the AWS cloud environment, including S3 storage and EC2 instances.
• Developed a fan-out workflow using Flume to ingest data from various data sources such as web servers and REST APIs, using different sources and ingesting data into Hadoop with an HDFS sink.
• Involved in migrating MongoDB version 2.4 to 2.6, implementing new security features and designing more efficient groups.
• Installed and configured Cassandra, with good knowledge of Cassandra architecture, read and write paths, and querying.
• Developed Tableau dashboards with combination charts for clear understanding.
• Improved report performance in Tableau using extracts and context filters.
• Implemented various ETL solutions as per the business requirements using Informatica.
• Experience creating ETL jobs to load JSON data and server data into MongoDB and transform MongoDB data into the data warehouse.
• Designed and developed ETL processes to extract and load data from legacy systems using the Talend data integration tool.
• Extensively used components such as tWaitForFile, tIterateToFlow, tFlowToIterate, tHashOutput, tHashInput, tMap, tRunJob, tJava, tNormalize and tFile to create Talend jobs.
• Implemented frameworks using Java and Python to automate the ingestion flow.
• Involved in data modeling in Cassandra and MongoDB, and in choosing indexes and primary keys based on the client requirements.
• Configured Spark Streaming to receive real-time data from Kafka and store the stream data to HDFS using Scala (a sketch of this pipeline follows this list).
• Used Spark for parallel data processing and better performance using Scala.
• Extensively used Pig for data cleansing and for extracting data from the web server output files to load into HDFS.
• Developed a data pipeline using Kafka and Storm to store data into HDFS.
• Implemented Kafka Java producers, created custom partitions, configured brokers and implemented high-level consumers to build the data platform (a producer sketch follows this list).
• Implemented Storm topologies to preprocess data and implemented custom grouping to configure partitions.
• Managed and reviewed Hadoop log files.
• Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
• Used Hive to analyze partitioned and bucketed data and compute various metrics for reporting.
• Installed and configured Pig and wrote Pig Latin scripts.
• Responsible for managing data coming from different sources such as SQL Server.
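To make a few of the items above concrete, here are minimal, hypothetical sketches in Java (the project's Spark work was done in Scala); all table, topic, broker and path names are placeholder assumptions. First, the Hive-UDF-to-Spark-SQL migration pattern: re-register the UDF logic with Spark SQL and run the existing HiveQL on the Spark engine.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.api.java.UDF1;
import org.apache.spark.sql.types.DataTypes;

public class HiveUdfToSparkSql {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder()
        .appName("hive-udf-migration")
        .enableHiveSupport()          // lets the job read the existing Hive tables directly
        .getOrCreate();

    // Re-register the old Hive UDF's logic as a Spark SQL UDF (masking is a made-up example).
    spark.udf().register(
        "mask_account",
        (UDF1<String, String>) account ->
            account == null ? null : account.replaceAll(".(?=.{4})", "*"),
        DataTypes.StringType);

    // The original HiveQL can then run largely unchanged on the Spark engine.
    Dataset<Row> masked = spark.sql(
        "SELECT mask_account(account_id) AS account, SUM(amount) AS total "
        + "FROM transactions GROUP BY mask_account(account_id)");
    masked.show();
  }
}
```

Next, a sketch of the Spark Streaming pipeline that reads from Kafka and lands data on HDFS, written against the spark-streaming-kafka-0-10 integration.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class KafkaToHdfsStream {
  public static void main(String[] args) throws InterruptedException {
    JavaStreamingContext jssc =
        new JavaStreamingContext(new SparkConf().setAppName("kafka-to-hdfs"), Durations.seconds(30));

    Map<String, Object> kafkaParams = new HashMap<>();
    kafkaParams.put("bootstrap.servers", "broker1:9092,broker2:9092"); // placeholder brokers
    kafkaParams.put("key.deserializer", StringDeserializer.class);
    kafkaParams.put("value.deserializer", StringDeserializer.class);
    kafkaParams.put("group.id", "stream-loader");
    Collection<String> topics = Arrays.asList("clickstream");          // placeholder topic

    JavaInputDStream<ConsumerRecord<String, String>> stream = KafkaUtils.createDirectStream(
        jssc, LocationStrategies.PreferConsistent(),
        ConsumerStrategies.<String, String>Subscribe(topics, kafkaParams));

    // Light cleanup on the fly, then write each micro-batch out to HDFS.
    JavaDStream<String> cleaned = stream
        .map(ConsumerRecord::value)
        .filter(line -> line != null && !line.isEmpty());
    cleaned.foreachRDD((rdd, time) ->
        rdd.saveAsTextFile("hdfs:///data/streams/clickstream/" + time.milliseconds()));

    jssc.start();
    jssc.awaitTermination();
  }
}
```

And finally a Kafka producer with a custom partitioner, matching the producer and partitioning work listed above; the routing rule is invented purely for illustration.

```java
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.Cluster;

public class EventProducer {

  /** Hypothetical partitioner: pins "priority" keyed events to partition 0, hashes the rest. */
  public static class PriorityPartitioner implements Partitioner {
    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
      int numPartitions = cluster.partitionsForTopic(topic).size();
      if ("priority".equals(key)) {
        return 0;
      }
      return key == null ? 0 : (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    @Override public void configure(Map<String, ?> configs) { }
    @Override public void close() { }
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.put("bootstrap.servers", "broker1:9092"); // placeholder broker
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    props.put("partitioner.class", PriorityPartitioner.class.getName());

    try (Producer<String, String> producer = new KafkaProducer<>(props)) {
      producer.send(new ProducerRecord<>("events", "priority", "{\"type\":\"alert\"}"));
      producer.send(new ProducerRecord<>("events", "user-42", "{\"type\":\"click\"}"));
    }
  }
}
```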
Environment: Hadoop, MapReduce, Agile, HDFS, Hive, Pig, Java (JDK 1.6), SQL Server, Sqoop, Spark, Kafka, AWS, MongoDB, Storm, Cassandra, ETL, Informatica, Python, Tableau, Talend, Scala.
Client: BB&T Bank, Charlotte, NC                              Feb '14 to Jan '15
Role: Hadoop Developer
Responsibilities:
• Installed and configured Cassandra, with good knowledge of Cassandra architecture, read and write paths, and querying using the Cassandra shell.
• Worked on writing MapReduce jobs to discover trends in data usage by customers.
• Worked on and designed a Big Data analytics platform for processing customer interface preferences and comments using Java, Hadoop, Hive and Pig.
• Involved in Hive-HBase integration by creating Hive external tables and specifying storage as HBase format.
• Imported and exported data into HDFS and Hive using Sqoop from Oracle and vice versa.
• Exported the analyzed data to relational databases such as Oracle using Sqoop for visualization and to generate reports for the BI team.
• Designed and developed ETL processes to extract and load data from legacy systems using the Talend data integration tool.
• Implemented frameworks using Java and Python to automate the ingestion flow.
• Experienced in defining job flows to run multiple MapReduce and Pig jobs using Oozie.
• Installed and configured Hive and wrote HiveQL scripts.
• Experience with loading the data into relational databases for reporting, dashboarding and ad-hoc analyses, which revealed ways to lower operating costs and offset the rising cost of programming.
• Experience creating ETL jobs to load JSON data and server data into MongoDB and transform MongoDB data into the data warehouse.
• Involved in ETL code deployment and performance tuning of mappings in Informatica.
• Created reports and dashboards using structured and unstructured data.
• Experienced with performing analytics on time-series data using HBase.
• Implemented HBase coprocessors (Observers) for event-based analysis.
• Hands-on experience installing and configuring nodes of a CDH4 Hadoop cluster on CentOS.
• Implemented Hive generic UDFs to implement business logic.
• Experienced with accessing Hive tables to perform analytics from Java applications using JDBC (a sketch follows this list).
• Experienced in running batch processes using Pig scripts and developed Pig UDFs for data manipulation according to business requirements.
• Experience with streaming workflow operations and Hadoop jobs using Oozie workflows, scheduled through AutoSys on a regular basis.
• Followed Agile methodology (Scrum stand-ups, sprint planning, sprint review, sprint showcase and sprint retrospective meetings).
• Performed operations using the partitioning pattern in MapReduce to move records into different categories (a custom partitioner sketch follows this list).
• Developed Spark SQL scripts and was involved in converting Hive UDFs to Spark SQL UDFs.
• Responsible for batch processing and real-time processing in HDFS and NoSQL databases.
• Responsible for retrieval of data from Cassandra and ingestion into Pig.
• Experience in customizing the MapReduce framework at various levels by writing custom InputFormats, RecordReaders, Partitioners and data types.
• Experienced with multiple file formats in Hive, including Avro and SequenceFile.
• Created and maintained technical documentation for launching Hadoop clusters and for executing Pig scripts.
• Implemented business logic by writing Pig UDFs in Java and used various UDFs from Piggybank and other sources (a Pig UDF sketch follows this list).
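A few of the items above lend themselves to short, hedged sketches in Java; the table names, fields and connection details below are illustrative assumptions, not details of the actual engagement. First, querying Hive from a Java application over JDBC (HiveServer2):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveJdbcReport {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    String url = "jdbc:hive2://hiveserver2.example.com:10000/default"; // placeholder host
    try (Connection conn = DriverManager.getConnection(url, "hadoop", "");
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery(
             "SELECT region, COUNT(*) AS txn_count "
             + "FROM transactions WHERE dt = '2014-06-01' GROUP BY region")) {
      while (rs.next()) {
        System.out.println(rs.getString("region") + "\t" + rs.getLong("txn_count"));
      }
    }
  }
}
```

Second, the MapReduce partitioning pattern for routing records into separate categories, expressed as a custom Partitioner:

```java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

/** Hypothetical partitioner: each record category is routed to its own reducer (and output file). */
public class CategoryPartitioner extends Partitioner<Text, IntWritable> {
  @Override
  public int getPartition(Text key, IntWritable value, int numPartitions) {
    String category = key.toString();
    if (category.startsWith("deposit")) {
      return 0;
    }
    if (category.startsWith("withdrawal")) {
      return 1 % numPartitions;
    }
    // Everything else is spread across the remaining reducers.
    return (category.hashCode() & Integer.MAX_VALUE) % numPartitions;
  }
}
// Wired into the job with job.setPartitionerClass(CategoryPartitioner.class)
// and a matching job.setNumReduceTasks(...) so each category lands in its own part file.
```

Third, a Pig UDF in Java of the kind described above (the normalization rule is made up for the example):

```java
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

/** Hypothetical EvalFunc: trims a raw text field and upper-cases it for downstream joins. */
public class NormalizeField extends EvalFunc<String> {
  @Override
  public String exec(Tuple input) throws IOException {
    if (input == null || input.size() == 0 || input.get(0) == null) {
      return null;
    }
    return input.get(0).toString().trim().toUpperCase();
  }
}
// Registered in the Pig script with REGISTER and invoked by its fully qualified class name.
```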
Environment: Cassandra, MapReduce, Spark SQL, Agile, ETL, Pig scripts, Flume, Hadoop BI, Pig UDFs, Oozie, Avro, Hive, Java, Eclipse, ZooKeeper, Informatica, Oracle, Python, Talend.
Client: Epsilon, Danbury, CT                                  Aug '12 to Dec '13
Role: Hadoop Developer
Responsibilities:
• Involved in the complete software development life cycle (SDLC) to develop the application.
• Worked on analyzing the Hadoop cluster and different big data analytic tools, including Pig, the HBase database, Sqoop, Cassandra and ZooKeeper.
• Involved in loading data from the Linux file system to HDFS.
• Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
• Followed Agile methodology (Scrum stand-ups, sprint planning, sprint review, sprint showcase and sprint retrospective meetings).
• Imported and exported data into HDFS and Hive using Sqoop from Oracle and vice versa.
• Implemented test scripts to support test-driven development and continuous integration.
• Developed multiple MapReduce jobs in Java for data cleaning.
• Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
• Created Pig Latin scripts to sort, group, join and filter the enterprise-wide data.
• Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
• Supported MapReduce programs running on the cluster.
• Analyzed large data sets by running Hive queries and Pig scripts.
• Implemented frameworks using Java and Python to automate the ingestion flow.
• Worked on tuning the performance of Pig queries.
• Mentored the analyst and test teams in writing Hive queries.
• Installed the Oozie workflow engine to run multiple MapReduce jobs.
• Worked with application teams to install the operating system and Hadoop updates, patches and version upgrades as required.
• Worked on ZooKeeper for coordination between the master node and data nodes.
Environment: Hadoop, HDFS, MapReduce, Agile, Hive, Pig, Sqoop, Linux, Java, Oozie, HBase, ZooKeeper, SQL Server, Python.
Client: Meridian Enterprise, St. Louis, MO                    Nov '10 to Jul '12
Role: Java/J2EE Developer
Responsibilities:
• Worked with business users to determine requirements and technical solutions.
• Followed Agile methodology (Scrum stand-ups, sprint planning, sprint review, sprint showcase and sprint retrospective meetings).
• Developed business components using core Java concepts and features such as inheritance, polymorphism, collections, serialization and multithreading.
• Used the Spring framework to handle application logic and to make calls to business objects configured as Spring beans.
• Implemented and configured data sources and session factories, and used HibernateTemplate to integrate Spring with Hibernate.
• Developed web services to allow communication between applications through SOAP over HTTP with JMS and Mule ESB.
• Actively involved in coding using core Java and Collections APIs such as Lists, Sets and Maps.
• Developed a web service (SOAP, WSDL) that is shared between the front end and the cable bill review system.
• Implemented REST-based web services using JAX-RS annotations and the Jersey implementation for data retrieval with JSON (a sketch of such a resource follows this list).
• Developed Maven scripts to build and deploy the application onto the WebLogic application server, and ran UNIX shell scripts and implemented an auto-deployment process.
• Used Maven as the build tool, scheduled/triggered by Jenkins.
• Developed JUnit test cases for application unit testing.
• Implemented Hibernate for data persistence and management.
• Used the SOAP UI tool for testing web services connectivity.
• Used SVN as version control to check in the code; created branches and tagged the code in SVN.
• Used RESTful services to interact with the client by providing the RESTful URL mapping.
• Used the Log4j framework to log/track the application and for debugging.
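As a hedged illustration of the JAX-RS/Jersey work above, here is a minimal resource class returning JSON; the path, fields and payload are hypothetical and not taken from the actual billing system.

```java
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

/** Hypothetical resource: returns a bill summary as JSON for review. */
@Path("/bills")
public class BillResource {

  @GET
  @Path("/{billId}")
  @Produces(MediaType.APPLICATION_JSON)
  public Response getBill(@PathParam("billId") String billId) {
    // In the real service this would come from the persistence layer.
    String json = "{\"billId\":\"" + billId + "\",\"status\":\"REVIEWED\"}";
    return Response.ok(json).build();
  }
}
```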
Environment: JDK 1.6, Eclipse IDE, Core Java, J2EE, Spring, Hibernate, Unix, Web Services, SOAP UI, Maven, WebLogic Application Server, SQL Developer, Camel, JUnit, SVN, Agile, SONAR, Log4j, REST, JSON, JBPM.
Client: MOBILINK - DSS Mobile Communications, India           Aug '07 to Sep '10
Role: Junior Java Developer
Responsibilities:
• Involved in the analysis, design and development of the Expense Processing system.
• Created user interfaces using JSP.
• Developed the web interface using Servlets, JavaServer Pages, HTML and CSS.
• Developed the DAO objects using JDBC (a sketch follows this list).
• Developed business services using Servlets and Java.
• Designed and developed user interfaces and menus using HTML5, JSP and JavaScript, with client-side and server-side validations.
• Developed the GUI using JSP and the Struts framework.
• Involved in developing the presentation layer using Spring MVC, AngularJS and jQuery.
• Involved in designing the user interfaces using the Struts Tiles framework.
• Used the Spring 2.0 framework for dependency injection, integrated with the Struts framework and Hibernate.
• Used Hibernate 3.0 in the data access layer to access and update information in the database.
• Experience in SOA (Service-Oriented Architecture) by creating web services with SOAP and WSDL.
• Developed JUnit test cases for all the developed modules.
• Used Log4j to capture logs, including runtime exceptions; monitored error logs and fixed the problems.
• Used RESTful services to interact with the client by providing the RESTful URL mapping.
• Used CVS for version control across common source code used by developers.
• Used Ant scripts to build the application and deployed it on WebLogic Application Server 10.0.
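A minimal sketch of the JDBC DAO pattern referenced above, with placeholder connection details, table and column names (it uses the modern try-with-resources idiom rather than the style of the original era):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical DAO for expense records in the Expense Processing system. */
public class ExpenseDao {
  private static final String URL = "jdbc:oracle:thin:@dbhost:1521:orcl"; // placeholder
  private static final String USER = "app";
  private static final String PASSWORD = "secret";

  public List<String> findApprovedExpenses(int employeeId) throws SQLException {
    String sql = "SELECT description FROM expenses WHERE employee_id = ? AND status = 'APPROVED'";
    List<String> results = new ArrayList<>();
    try (Connection conn = DriverManager.getConnection(URL, USER, PASSWORD);
         PreparedStatement ps = conn.prepareStatement(sql)) {
      ps.setInt(1, employeeId);
      try (ResultSet rs = ps.executeQuery()) {
        while (rs.next()) {
          results.add(rs.getString("description"));
        }
      }
    }
    return results;
  }
}
```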
Environment: Struts 1.2, Hibernate 3.0, Spring 2.5, JSP, Servlets, XML, SOAP, WSDL, JDBC, JavaScript, HTML, CVS, Log4j, JUnit, WebLogic Application Server, Eclipse, Oracle, RESTful.
Education:
Bachelor of Technology in Computer Science and Engineering
Jawaharlal Nehru Technological University
References:
Provided upon request