Shirley Zang is seeking an entry-level data scientist position and has strong skills in Python, R, MATLAB and SQL. She is a candidate for a Master of Science degree in Biomedical Engineering at Tufts University (expected May 2016) and holds a Bachelor of Engineering degree in Computer Science from Northeastern University in China. Her professional experience includes work as a research assistant in diffusion optical imaging at Tufts and an internship as a radiologic technologist in China. Her projects include using Bayesian probability to reveal local attitudes from Craigslist data and improving classifier performance with the adaptive boosting algorithm.
This infographic tells the story of how leaps in techniques/capabilities net big results for supply chain professionals. For example, did you know that the combination of demand modeling and machine learning could reduce forecast errors by 33%?
How much of a difference can you really make, simply by liking an organization's Facebook page or retweeting something they post? The answer may be more than you think.
Proceedings of the 2015 Industrial and Systems Engineering Research Conference
S. Cetinkaya and J. K. Ryan, eds.
Use of Symbolic Regression for Lean Six Sigma Projects
Daniel Moreno-Sanchez, MSc.
Jacobo Tijerina-Aguilera, MSc.
Universidad de Monterrey
San Pedro Garza Garcia, NL 66238, Mexico
Arlethe Yari Aguilar-Villarreal, MEng.
Universidad Autonoma de Nuevo Leon
San Nicolas de los Garza, NL 66451, Mexico
Abstract
Lean Six Sigma projects and the quality engineering profession have to deal with an extensive selection of tools,
most of them requiring specialized training. The increased availability of standard statistical software motivates the
use of advanced data science techniques to identify relationships between potential causes and project metrics. In
these circumstances, Symbolic Regression has received increased attention from researchers and practitioners to
uncover the intrinsic relationships hidden within complex data without requiring specialized training for its
implementation. The objective of this paper is to evaluate the advantages and drawbacks of using computer assisted
Symbolic Regression within the Analyze phase of a Lean Six Sigma project. An application of this approach in a
service industry project is also presented.
Keywords
Symbolic Regression, Data Science, Lean Six Sigma
1. Introduction
Lean Six Sigma (LSS) has become a well-known hybrid methodology for quality and productivity improvement in
organizations. Its wide adoption in several industries has shaped Process Innovation and Operational Excellence
initiatives, enabling LSS to become a main topic in quality practitioner sites of interest [1], recognized Six Sigma
(SS) certification body of knowledge contents [2], and professional society conferences [3].
However, LSS projects and the quality engineering profession have to deal with an extensive selection of tools, most
of them requiring specialized training. To assist LSS practitioners it is common to categorize tools based on the
traditional DMAIC model which stands for Define, Measure, Analyze, Improve, and Control phases. Table 1
presents an overview of the main tools that are commonly used in each phase of an LSS project, allowing team
members to progressively develop an understanding of each phase’s intent and of how the selected
tools can contribute to that purpose.
This paper focuses on the Analyze phase where tools for statistical model building are most likely to be selected.
The increased availability of standard statistical software motivates the use of advanced data science techniques to
identify relationships between potential causes and project metrics. In these circumstances, Symbolic Regression
(SR) has received increased attention from researchers and practitioners even though SR is still in an early stage of
commercial availability.
The objective of this paper is to evaluate the advantages and drawbacks o ...
Shirley (Xuan) Zang
41 Bonner Ave, Medford, MA 02155 | 732.485.0906 | xuan.zang@tufts.edu
Objective: Dedicated and motivated engineering graduate seeking an entry-level
data scientist position
EDUCATION
Tufts University, Medford, MA, USA Expected May 2016
Candidate for Master of Science, Dept. of Biomedical Engineering, GPA 3.72/4.0
(Focused on medical signal processing and image analysis)
Northeastern University, Shenyang, P.R. China Jun 2014
Bachelor of Engineering, Dept. of Computer Science, GPA 3.9/4.0
SKILLS
Programming Languages: Proficient in Python, R, MATLAB and SQL. Familiar with C and Java
Tools & Platforms: Advanced user of MS Office applications, SAS, PyLab, SQLite and Google Maps API
Related Courses: Machine Learning, Probability and Statistics, Database, Algorithm, Financial Management
PROFESSIONAL EXPERIENCE
Research Assistant Feb 2015 - Present
Diffusion Optical Imaging Laboratory at Tufts University
Applied diffusion theory, the Hilbert transform, the modified Beer-Lambert law and an inverse model based on the
Levenberg-Marquardt algorithm to quantify and differentiate cerebral function in multiple layers of the brain using MATLAB
Validated results through Monte Carlo simulations, ANOVA tests and repeated experiments on phantoms and human subjects
Coauthored two papers to be published at the SPIE and PHOTOPTICS conferences
Intern Radiologic Technologist Dec 2013 - Mar 2014
Department of Imaging and Radio Diagnosis at Huaiyin Hospital (China)
Assisted radiologic technologists in conducting CT and MRI imaging of patients
Analyzed CT/MRI images and worked with experts in imaging and nuclear medicine to identify diseases
PROJECT EXPERIENCE
Revealing Local Attitudes from Craigslist Based on Bayesian Probability Theory Dec 2015
Designed a UI in Python that shows the most descriptive word for the city name the user inputs
Imported RSS feeds from Craigslist and transformed them into a tidy format for further training and testing
Extracted features from the above dataset in Python by estimating the conditional probability of each feature word for
that city based on Bayesian theory
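The feature-extraction step above can be illustrated with a minimal sketch. It is not the project's code: the function names, the add-one smoothing, the scoring rule (log-probability in one city minus its mean in the others) and the toy posts are all assumptions made for illustration.

```python
import math
from collections import Counter


def word_log_probs(docs_by_city):
    """Return {city: {word: log P(word | city)}} using add-one smoothing (an assumption)."""
    counts = {
        city: Counter(word for doc in docs for word in doc.lower().split())
        for city, docs in docs_by_city.items()
    }
    vocab = set().union(*counts.values())
    log_probs = {}
    for city, c in counts.items():
        total = sum(c.values()) + len(vocab)  # smoothed denominator
        log_probs[city] = {w: math.log((c[w] + 1) / total) for w in vocab}
    return log_probs


def most_descriptive(log_probs, city):
    """Word whose log-probability in `city` most exceeds its mean in the other cities."""
    others = [c for c in log_probs if c != city]

    def score(word):
        rest = sum(log_probs[o][word] for o in others) / len(others)
        return log_probs[city][word] - rest

    return max(log_probs[city], key=score)


# Hypothetical toy posts standing in for the Craigslist RSS data.
posts = {
    "boston": ["red sox tickets", "sox game tonight"],
    "austin": ["live music tonight", "music festival tickets"],
}
log_probs = word_log_probs(posts)
print(most_descriptive(log_probs, "boston"))
```

Scoring against the other cities, rather than taking the most frequent word, keeps words that are common everywhere from being reported as descriptive of one city.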
Improving Simple Classifier Performance with the Adaptive Boosting Meta-algorithm Nov 2015
Implemented the adaptive boosting algorithm in Python by iteratively calculating the weighted sum of all simple
classifiers and reassigning weights to the initial training set based on classification accuracy until the error fell below 1%
Trained and predicted on a dataset on the survival of horses with colic; accuracy improved from 66% to 80%
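The boosting loop described above can be sketched as follows. This is a generic AdaBoost with threshold stumps as the simple classifiers, not the project's implementation: the stump form, the function names, the round limit and the toy data are assumptions, and labels are taken to be +1 or -1.

```python
import math


def best_stump(X, y, w):
    """Return (error, feature, threshold, sign, predictions) for the stump with lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                preds = [sign if row[j] <= thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign, preds)
    return best


def predict(ensemble, row):
    """Sign of the weighted sum of all stump votes."""
    score = sum(a * (s if row[j] <= thr else -s) for a, j, thr, s in ensemble)
    return 1 if score >= 0 else -1


def adaboost(X, y, rounds=20, target_error=0.01):
    """Add stumps until the training error drops below target_error or rounds run out."""
    n = len(X)
    w = [1.0 / n] * n  # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        err, j, thr, sign, preds = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, j, thr, sign))
        # Raise the weight of misclassified samples, lower the rest, then renormalize.
        w = [wi * math.exp(-alpha * yi * p) for wi, yi, p in zip(w, y, preds)]
        total = sum(w)
        w = [wi / total for wi in w]
        train_err = sum(predict(ensemble, r) != yi for r, yi in zip(X, y)) / n
        if train_err < target_error:
            break
    return ensemble


# Toy data that no single stump can separate (hypothetical, not the horse colic dataset).
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(X, y)
print([predict(ensemble, row) for row in X])
```

On this toy set the best single stump misclassifies two of the six points, while the boosted ensemble fits all of them, which mirrors the kind of gain the bullet reports.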
Cough Detection Algorithm for Monitoring Patient Recovery from Pulmonary Tuberculosis Aug 2015
Examined acoustic signals in the time and frequency domains and combined two metrics (change in energy at the start of a
cough event and the peak-to-average amplitude ratio) to better discriminate true cough events from noise
Generated ROC curves to gain better insight into performance at different threshold values
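The ROC step above can be sketched by sweeping a decision threshold over detector scores. The scores and labels below are made-up placeholders, not results from the project, and the function names are hypothetical; the sketch assumes both classes are present.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs, one per distinct threshold, from strictest to loosest."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        # An event counts as a detected cough when its score reaches the threshold.
        tp = sum(1 for s, l in zip(scores, labels) if s >= thr and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= thr and l == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical detector scores; label 1 marks a true cough, 0 marks noise.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels)
print(points)
print(auc(points))
```

Each point shows the trade-off at one threshold: lowering the threshold catches more true coughs but also admits more noise events.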
Business Plan for Flower Tea Company (first prize among all business plans) Apr 2015
Analyzed market demand and viable competitors comprehensively and developed a detailed business plan
Created a 5-year pro forma including income statements, balance sheets, cash flows and investor return analysis
Presented the business plan to the venture capitalist and professors from Harvard Business School
HONORS
Tuition Scholarship from Tufts University (merit based) Sep 2014 - May 2016
National Scholarship (the highest level of scholarship in China) Oct 2012
First Prize at National Math Contest for College Students Sep 2012