SANTHOSH KUMAR
700 Health Sciences Drive, Chapin B 2022, Stony Brook NY 11790
(631) 974-4880
santhoshkumarml@gmail.com
EMPLOYMENT
Software Engineer, Intern Informatica Summer 2015
Enterprise Information Catalogue (EIC): generic platform using Solr, Spark, and Titan DB
• Developed Spark jobs (Scala) to ingest data into Apache Solr from Titan Graph DB.
• Developed a ranking algorithm to sample representative values from multiple exchange documents (EIC data format).
• Developed a distributed notification service using ZooKeeper and supported faceted search using Apache Solr.
Graduate Researcher DATA lab, Stony Brook Univ Spring 2015 - Current
• Working under Prof. Leman Akoglu on multidimensional time-series anomaly mining techniques (SDAR, AR,
Granger) applied to a variety of behavioral features, and using this information in a graph to find spammers effectively.
Senior Software Developer Informatica Apr 2013 – July 2014
• Developed a Query Capability Framework for Java modeled objects, written using Hibernate, to query
modeled objects instead of querying database rows/columns.
• Developed an analytics service (Phone Home Analytics) to extract information from metadata,
which is leveraged to find the business objects customers use most frequently.
• Led and mentored interns and junior members on optimizing serialization/deserialization using Google's Protocol
Buffers and Apache Thrift, and on enhancing the metadata upgrade framework for destructive model changes.
• Developed a Rule Manager capability to define rules for managing metadata and auto-governance.
Software Developer Informatica July 2011 – Apr 2013
• Developed a modeling framework to generate platform-agnostic code with services such as object traverser/cloner/serializer.
• Developed a bulk-fetch algorithm using locks/transactions to retrieve objects from databases in bulk.
• Used idle nodes in the network to run time-consuming developer builds via a load balancer.
EDUCATION
Stony Brook, NY State Univ of New York, Stony Brook Fall 2014 – Dec 2015*
• Master's in Computer and Information Science. GPA: 3.71
• Graduate Coursework: Data Mining, Machine Learning, Natural Language Processing, Artificial Intelligence,
Operating Systems, Asynchronous Systems, Analysis of Algorithms, and Computational Linguistics (current).
Chennai, India College of Engg., Guindy, Anna Univ Aug 2007 – Apr 2011
• Bachelor of Engineering in Computer Science. GPA: 8.7/10.0
ACADEMIC PROJECTS
• Detecting Fake Reviews in Yelp: Enhanced loopy belief propagation on a Markov network by temporal
separation of the user-business graph to find fake reviews on yelp.com.
• Predicting Literary Success using Stylometry: Predicted the success of literary novels using statistical
stylometry over a variety of syntactic and semantic linguistic features.
• SBUnix: Implemented a 1:1 threading model with a thread library and sync primitives (mutex, futex). Also
implemented memory management, process scheduling, file system operations, and a network driver.
• Implemented “Chain Replication for Supporting High Throughput and Availability” for fault-tolerant bank servers.
• Implemented machine learning algorithms (Naive Bayes, linear and logistic regression, AdaBoost, KNN, Gaussian mixture
models (EM), decision trees, and k-means) and AI algorithms (BFS, DFS, UCS, A*, CSP) for Pac-Man and Sudoku.
ADDITIONAL EXPERIENCE AND AWARDS
• Appreciation Award for prototyping and implementing a solution for cycle elimination in the build architecture.
• Nominated for the INFA STAR award for expertise and contribution as a team player.
LANGUAGES AND TECHNOLOGIES
• Languages: Java; Python; C++; C; Scala
• Technologies: Spark; Solr; Spring; Hibernate; Oracle; SQL Server; DB2; MySQL; H2; HSQLDB; Titan
