Parmanand Sahu is seeking a data science role and holds an MS in Computer Science from UT Dallas. He has 2+ years of experience in data science roles at Capital One and VHSS Lab at UT Dallas. His technical skills include Python, SQL, MongoDB, and TensorFlow, and he has experience with machine learning algorithms such as random forest, logistic regression, RNNs, and CNNs. He has built models for tasks such as named entity recognition, fraud detection, and question answering. He also has relevant academic projects on image classification and decision trees.
Resume
PARMANAND SAHU
Dallas, TX | parmanand.sahu@utdallas.edu | 972-730-3967 | linkedin.com/in/parmanandsahu/ | github.com/analystanand
EDUCATION
THE UNIVERSITY OF TEXAS AT DALLAS, Richardson, TX Aug 2019 - May 2021
Master of Science in Computer Science 3.61/4
Related Coursework: Machine Learning, Database Design, Algorithms and Data Structures, CNN
NATIONAL INSTITUTE OF TECHNOLOGY, Raipur, IN Jul 2009 - Jul 2013
Bachelor of Technology (Hons.) in Metallurgical Engineering 8.35/10
TECHNICAL SKILLS
Languages: Python, C, MATLAB, Node.js
Databases: MongoDB, MySQL, DynamoDB, Neptune, Elasticsearch, Neo4j
Libraries: NumPy, Pandas, Matplotlib, NLTK, Gensim, scikit-learn, spaCy, SciPy, TensorFlow, PyTorch
Algorithms: Random Forest, Logistic Regression, SVM, K-means, Decision Tree, RNN, Attention Mechanism, Anomaly Detection
Technologies: Linux, Git, Jupyter Lab, RASA, Docker, MS Excel, MLflow, ELK, Django, REST, Spark, Flask
WORK EXPERIENCE
Capital One Jun 2020 - Aug 2020
Data Science Intern McLean, VA
Mortgage-Backed Security Prepayment Model (Pandas | DNN | AWS S3 | TensorFlow | Matplotlib)
– Analyzed and preprocessed a specific category of Mortgage-Backed Securities (MBS) with 150+ features
– Trained and evaluated a neural network model on the preprocessed MBS data
– Automated report generation for evaluating and comparing models
VHSS Lab at Center for Modeling and Simulation, UTD Sep 2019 - May 2020
Machine Learning Specialist | Student Assistant Richardson, TX
Conversational Emotive Virtual Reality Patient (NSF funded) (Transformer | Virtual Assistant)
– Researched and trained a transformer-based model for virtual patients interacting with medical students using PyTorch.
– Supervised a project for assessing medical students based on their responses to virtual patients.
Huddl Enterprise Communication Pvt. Ltd Apr 2018 - Jun 2019
Artificial Intelligence Engineer Hyderabad, India
Named Entity Recognition for Voice Assistant (Python | Node.js | Regex | DynamoDB | Virtual Assistant)
– Supervised the data preparation team and retrained a custom spaCy model for Named Entity Recognition.
– Built module using Levenshtein Distance and Phonetic similarity to fix incorrect transcription for recognized entities.
– Developed micro-service using Node.js and DynamoDB to use as gazetteer in NER.
– Designed module using regex for extracting entities like Time and Date from voice commands.
Reverse image search for information retrieval (Elasticsearch | K-means | Neptune | OCR | scikit-learn)
– Developed parser for OCR response and utilized K-means to classify content to reduce false positives.
– Built keyword extraction using RAKE and graph-based algorithm for ranking meetings on search results.
Action Item Detection in Meeting Transcript (Python | LSTM | RNN | MLflow)
– Trained LSTM-RNN based model to classify the action items in the meeting transcript, reaching 95%.
– Deployed MLflow for internal use to track experiments with multiple hyper-parameters.
CoArtha Technosolution Pvt. Ltd. Sep 2017 - Apr 2018
Associate Data Science Engineer Hyderabad, India
Semantic Understanding of Job Description for ranking resumes (Python | Naive Bayes | Beautiful Soup)
– Built pipeline to scrape 10+ job boards and pre-processed data using Selenium and Beautiful Soup.
– Trained model using Naive Bayes to classify sentences in job descriptions and their polarity with 90%+ accuracy.
– Assisted in developing scoring logic to match job descriptions with resumes.
Candidate screening from audio interviews (Python | Random Forest | Librosa)
– Employed Random Forest for classifying candidates using interview response audio with 90%+ accuracy.
CoArtha Technosolution Pvt. Ltd. Sep 2016 - Aug 2017
Associate Software Engineer Hyderabad, India
Knowledge Graph from job descriptions for ranking resumes (Python | Neo4j | Elasticsearch | MongoDB)
– Built Knowledge graph using Neo4j with skills, job titles & education entities from 100k+ job descriptions.
– Designed pipeline to semi-automatically update knowledge graph for resume ranking.
Semantic parsing of resumes (Python | Regex | Logistic Regression | TF-IDF)
– Trained ensemble model to identify sections (contact, education, experience and skill).
– Implemented solution using regex for extracting entities and parsed table to correlate extracted entities.
DigiFledged Jul 2015 - Aug 2016
Founder Bhilai, India
– Managed daily operations and gathered technical & functional requirements of projects from new clients.
– Led a team that delivered 5+ web development and 17+ freelancing projects and established a blog with 60K+ page views.
JSW Steel Ltd. Feb 2014 - Apr 2015
Junior Manager Bellary, India
– Analyzed production reports discovering insights through exploratory data analysis using MS Excel and R.
– Assisted in evaluating implementation of Level 2 automation in Secondary Steel Making.
PROJECTS
Question Answering on SQuAD 1.0 (LSTM | RNN | Self Attention | PyTorch)
– Preprocessed and extracted custom features along with pre-trained word embeddings (GloVe).
– Trained QA model (simplified Stanford Attentive Reader) with 70% F1 Score.
Credit Card Transaction Fraud Detection (Random Forest | Logistic Regression | Feature Engineering | scikit-learn)
– Imputed missing data, created custom features, and normalized and encoded features
– Performed exploratory data analysis of features
– Handled imbalanced data using SMOTE and custom loss function.
– Trained and evaluated linear and tree-based classifiers
– Analyzed feature importance with respect to the dependent variable
Named Entity Recognition on CoNLL-2003 (RNN | GRU | PyTorch)
– Preprocessed data and prepared vocabulary for embedding layer
– Trained and compared Vanilla RNN (83%) and GRU RNN (86%)
Image Identification on CIFAR-10 dataset (PyTorch | Convolutional Neural Network)
– Augmented image data using transformation techniques (random crop, vertical flip)
– Implemented and trained ResNet family of architectures for image classification with 86% accuracy
Ensemble Methods and Decision Tree from scratch (Decision Tree | Bagging | AdaBoost | scikit-learn)
– Implemented fixed-depth decision tree (ID3) from scratch for the MONK's classification dataset
– Implemented Bagging and AdaBoost and compared with the scikit-learn implementations for Mushroom bruises
Web Development and Deployment
– Developed MVC-based web application for managing contacts using Django
– Deployed serverless web application using AWS Lambda, Amazon API Gateway, Amazon DynamoDB, Amazon Cognito, and AWS Amplify Console.
CERTIFICATIONS AND ACTIVITIES
– Data Scientist with Python track by DataCamp
– Architecting in AWS by The University of Texas at Dallas
– PyTorch, SQL, Django offered through LinkedIn Learning
– Volunteered for Lone Star Parity Project for technical assistance.
– Linked Data Engineering by Hasso-Plattner Institute: Building Knowledge Graph, 2016.
– M101: MongoDB for Developers by MongoDB University
– Neural Networks and Deep Learning, and Machine Learning, offered through Coursera
– Volunteered as Member of AnalyticsVidhya.com (Data Science Community, Hyderabad Chapter).
– Digital Marketing Certification from Delhi School of Digital Marketing, Delhi, India.