PRASHANT SHARMA
prasha2@ncsu.edu, (984-244-4707), www.linkedin.com/in/prasha3173
EDUCATION
M.S. Operations Research Aug 2015 – May 2017 (expected)
North Carolina State University, Raleigh, NC, USA GPA: 3.63/4
B.Tech. Civil Engineering Aug 2010 – May 2014
National Institute of Technology Silchar, India GPA: 3.7/4
TECHNICAL SKILLS
Languages: Python, SQL, Pig, Hive, Java
Machine Learning & Statistics: Logistic Regression, Decision Trees, Random Forest, SVM, Clustering,
Neural Networks, ANOVA, Text Analytics, Linear Regression
Technologies: Apache Spark, Hadoop, SAS, Graph Database Neo4j, MapReduce
Coursework: Data Mining, Algorithms, Nonlinear Programming, Linear Models, Statistics for
Engineering, Mathematical Modeling, Data-Driven Decision Making
EXPERIENCE
TECHNICAL ANALYST / SUPERVISOR – Saluja Tech.
September 2014 – March 2015
Assisted the Project Manager with budget preparation and bidding documents for a new project.
Worked with a team of 15 people to proof-check the technical design of a structure; the resulting
design modifications reduced the cost by 20%.
Prepared the project timeline and supervised its execution while coordinating with multiple
vendors.
TEACHING ASSISTANT (NON-PROFIT, VOLUNTARY) – GYANDARSHAN
August 2013 – December 2013
Worked with a non-profit club to provide free science education to economically
underprivileged students.
Taught high-school-level mathematics classes.
RELEVANT PROJECTS (https://github.com/prasha3173)
Tag Unsatisfied Customers from Imbalanced Data (Kaggle) [Python, Pandas, SkLearn]
● Built a Logistic Regression model to identify unsatisfied customers based on their transaction history.
● Increased recall, the chosen performance metric, from 23% (baseline) to 65%. Recall was selected because
the dataset is class-imbalanced.
● Used PCA and a tree-based Random Forest method for feature selection, and grid search and random search
for parameter tuning.
Predict Next Product Purchase for a Financial Institution (Kaggle) [Python, Pandas, SkLearn]
● Fitted a Logistic Regression model to predict the next product a customer is likely to purchase, based on
their transaction history.
● Achieved 86% accuracy in predicting among 16 possible products.
● Some features had a large number of categories. To avoid inflating the data size by converting them to dummy
variables, each category was replaced by the mean value of the dependent variable for that category.
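The category-to-mean replacement described in the last bullet can be sketched roughly as follows. This is a minimal plain-Python illustration under stated assumptions, not the project's actual code: the function name `mean_encode` and the toy `channel` / `purchased` data are hypothetical.

```python
from collections import defaultdict


def mean_encode(categories, targets):
    """Replace each category with the mean target value observed for it."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    # One mean per category, e.g. {"branch": 0.5, ...}
    means = {cat: sums[cat] / counts[cat] for cat in sums}
    return [means[cat] for cat in categories], means


# Hypothetical toy data: one categorical feature and a binary target.
channel = ["web", "branch", "web", "phone", "branch", "web"]
purchased = [1, 0, 0, 1, 1, 1]

encoded, mapping = mean_encode(channel, purchased)
```

The feature stays a single numeric column regardless of how many categories it has, whereas dummy variables would add one column per category. In practice the means are usually computed on the training split only, so that target information does not leak into validation data.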
Predict ACT Scores of Students Based on Their Demographic and Education Status [Python, SQL, SkLearn]
● Built a Linear Regression model to predict ACT scores based on students' demographic information.
● Improved R-squared from 63% to 75% by treating the dataset for collinearity and other data-quality issues.
Recommend Wikipedia Articles Based on Their Similarity (Coursera) [Python, SkLearn, Pandas]
● Developed a model to find the next article a user would like to read based on their current choice, using a
collection of 10,000 Wikipedia articles.
● Used cosine distance between two documents as the similarity measure. The model using TF-IDF weighting
performed better than the 'bag of words' model.
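The TF-IDF and cosine-distance approach in the last bullet can be illustrated with a small sketch. It is written in plain Python under stated assumptions and is not the project's code: the helper names, the three-sentence toy corpus, and the unsmoothed `log(N / df)` weighting are all illustrative choices (libraries typically use smoothed variants).

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one {word: tf-idf weight} dict per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each word.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({w: tf[w] * math.log(n_docs / df[w]) for w in tf})
    return vectors


def cosine_distance(u, v):
    """1 - cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # treat an all-zero vector as maximally distant
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical toy corpus standing in for the Wikipedia articles.
articles = [
    "the striker scored a goal in the football match",
    "the keeper saved a goal in the football final",
    "the senate passed the budget bill",
]
vecs = tf_idf_vectors(articles)
d_sports = cosine_distance(vecs[0], vecs[1])    # two football articles
d_politics = cosine_distance(vecs[0], vecs[2])  # football vs. politics
```

Because "the" appears in every document, its TF-IDF weight is zero, so the football and politics articles share no weighted terms and sit at the maximum distance. With raw word counts, the shared "the" would make them look partly similar, which is one reason TF-IDF can outperform a plain bag-of-words representation.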