Mengling Hettinger
6501 Silver Ln
Plano, TX 75023
menglingzhang@gmail.com
Summary
In applying for this position, I will draw on the knowledge and problem-solving skills that I acquired through my
career in the AT&T big data group and my PhD in physics. I developed in-depth computer programming and
statistical analysis skills through my work and studies. The variety of supervised and unsupervised learning
projects I have worked on has given me hands-on experience in applying machine learning techniques to both
standard and large data sets. My extensive experience analyzing real data with tools like R, Python,
Pig, Hive and H2O, my academic background, and the communication skills I developed from teaching
physics courses to non-physics majors will enable me to bring a full range of skills to the position.
Employment History
August 2014 – Present: Professional Data Scientist at AT&T Big Data
May 2014 – August 2014: Data Science Intern at AT&T Big Data
August 2009 – April 2014: Research assistant/Teaching assistant at Michigan State University
Education
September 2009 – December 2014: PhD in physics at Michigan State University
Thesis: “Fluctuations in superconductors above paramagnetic limit”
Summer School
July 8th – 10th, 2013 VSCSE Data Intensive Summer School
Core Competencies
Strategic Thinking: Able to use rich data sets to shape and implement the strategic direction of the
company, leading to growth in revenue and profits.
Modeling: Design and implement statistical/predictive models using cutting-edge algorithms to predict demand,
risk and price elasticity, find association rules and perform cluster analysis.
Analytics: Use analytical tools such as R, Python or data mining packages to identify trends and
relationships between different pieces of data, draw appropriate conclusions and translate analytical findings into
risk management and marketing strategies that drive value.
Data munging: Experience with collecting, cleaning, augmenting and transforming data using scripting
languages such as Python. Use semantic technologies to discover structure in unstructured data.
Communications and Project Management: Capable of turning dry analysis into plots that are informative
and easy to read. Collaborate with teams to develop and support data platforms and analyses.
Professional Skills
Programming languages: Python, R, MATLAB, Java, SQL
Large Data: NoSQL, Hadoop, Pig, Hive, H2O
Platform: Unix
Statistics: statistical modeling, data analysis, Bayesian statistical methods
Machine learning and data mining: predictive modeling, cluster analysis, association analysis, anomaly
detection
Data Mining Software: H2O, Weka, SVMlight, LibSVM, cluster (R package), igraph (R package)
Mathematical physics: linear algebra, differential equations, Fourier transforms, calculus
Data Mining Experience
A few examples of projects I have worked on (https://github.com/MenglingHettinger):
KDD 2012 Weibo data: I used user profile and item keyword information, calculated the distance between the
user's keywords and the item's keywords, and extracted the user's personal information as features to make
predictions for a given item that is recommended to a user. The training set has 73 million records and the
test set has 1 million records. The correct prediction rate is 64%.
Amazon co-purchasing network data: I used data from the Stanford Large Network Dataset Collection to
reproduce the product groups. I analyzed 548,552 products in the metadata and 403,394 nodes and 33,873,398 edges
in the co-purchase data. Because of the size of the dataset, I used the K-means and CLARA algorithms, both in
pure link analysis and with additional features extracted from the product metadata.
AT&T Fleet Preventative Maintenance: AT&T has one of the largest fleets in the nation, with 75,000 vehicles.
We collect demand repair data (>2 million records/year), weather data and refueling data daily, and sensor data
every 2 minutes (10 Gb/day). Using these data sources, I built a battery failure prediction model
that combines random forest and time series analysis and achieves 82% accuracy. I also built an integrated
model to predict all the subsystem failures for a given vehicle, and helped my coworkers improve the
brake system model, the fuel system model and other parts of the analysis.
Courses Related to Data Science
CSE 881 Data Mining: Predictive Modeling (Classification)/Association Analysis/Cluster Analysis/Anomaly
Detection/Network Mining
CSE 891 Computational Techniques for Large Scale Data Analysis: Programming skills and tools for
collecting, storing, querying and analyzing large-scale data/General concepts and methods for large-scale
computational data analysis
Qualification/Certification
August 2013: Machine Learning
Organization: PROVIDED BY STANFORD UNIVERSITY THROUGH COURSERA INC.
May 2014: The Data Scientist's Toolbox
Organization: PROVIDED BY STANFORD UNIVERSITY THROUGH COURSERA INC.
May 2014: R Programming
Organization: PROVIDED BY STANFORD UNIVERSITY THROUGH COURSERA INC.
May 2014: Getting and Cleaning Data
Organization: PROVIDED BY STANFORD UNIVERSITY THROUGH COURSERA INC.
Immigration / Work Status
Permanent Resident - United States
