PETER YUN-SHAO SUNG
112-20 72nd Drive, Apt D13 U.S. Permanent Resident 347.393.9026
Forest Hills, NY, 11375 LinkedIn GitHub GoogleScholar yss265@nyu.edu
EDUCATION
NEW YORK UNIVERSITY, COURANT INSTITUTE 2016
Master of Science, Degree in Computer Science, GPA: 3.74
CORNELL UNIVERSITY 2010
Master of Engineering, Degree in Biomedical Engineering, GPA: 3.5
NATIONAL TAIWAN UNIVERSITY 2007
Bachelor of Science, Degree in Engineering Science/Ocean Engineering, GPA: 3.2
RELATED COURSES
Fundamental Algorithms, Computational Machine Learning, Operating Systems, Heuristic Problem Solving (link),
Real Time Big Data (link), Production Quality Software, Deep Learning, Search Engine Architecture (link)
PROFESSIONAL EXPERIENCE
Bioinformatics Analyst 2010-present
Department of Pathology, Memorial Sloan-Kettering Cancer Center, NYC
• Designed and automated a novel analysis pipeline for genome mutation diagnosis from clinical tumor data (>100M reads)
• Published over 30 papers identifying novel genomic signatures that lead to sarcoma development
Software Engineer Intern 2015-2015 Sep
Orderhood, LLC (link)
• Developed an on-demand delivery service with Node and React; implemented real-time rendering of each runner's best route
• Achieved a 52% improvement in dashboard efficiency through backend refactoring, and developed an API for robust RPC handling
Software Engineer 2014-2015
Massive Bio, LLC (link)
• Designed and developed open-source software modules for various steps in a standard bioinformatics pipeline
• Designed and benchmarked NGS tools for mutation detection, reaching up to 99% sensitivity and 90% specificity
Bioinformatician 2013-2014
Institute of Computational Biomedicine, Weill Cornell Medical College, NYC
• Improved open-source C++ tools to handle >100M reads and developed pipelines with a >50-fold efficiency gain
MACHINE LEARNING PROJECTS
Learning on Music Structure with Spectral Clustering (link)
• Invented a novel model for training a machine to identify batches of musical structures based on Laplacian spectral clustering
Music Genre Classification (link)
• Invented a novel scatter feature extraction with VLAD for efficient learning, achieving 82% accuracy versus the original 60%
Search Engine Based Movie Recommendation System (link)
• Implemented and deployed our MapReduce method on AWS, wrapped with a self-designed scalable file system
• Built a RESTful website that classifies user preferences and makes recommendations accordingly
Big Data for Stock Analysis (link)
• Applied MapReduce to US stock correlation analysis and to sentiment analysis of news from the past 5 years
SELECTED PUBLICATIONS (over 33 publications; complete list)
1. Identification of Recurrent NAB2-STAT6 Gene Fusions in Solitary Fibrous Tumor/Hemangiopericytoma by Clinical
Sequencing. Nature Genetics 45, 180–185 (2013). (Impact factor: 35.5)
2. Monoclonality of Multifocal Epithelioid Hemangioendothelioma of the Liver by Analysis of WWTR1-CAMTA1
Breakpoints. Cancer Genetics. 2012 (Equally Contributing Author)
AWARDS AND ACHIEVEMENTS
Tuition reimbursement, Memorial Sloan-Kettering Cancer Center, NYC 2013
Top 0.97% of over 70,000 test takers in Mathematics on the Department Required Test, Taiwan 2004
Member of the school team for the International Physics Olympiad, Taiwan 2003
LEADERSHIP
Student Leader 2005-2006
Ten Outstanding Young Leaders Foundation, Taipei, Taiwan
• Conducted workshops and reported achievements to the board of directors, including the President of the Legislative Yuan of Taiwan
EXPERIENCE AND SKILLS
Technology: Python, C++, Torch, JavaScript, React, Node, Perl, R, FLUX, Shell, MapReduce, EC2, Beanstalk, S3, Hive, Hadoop
Skills: CNN, RNN, LSTM, GAN, VLAD, SGD, kmeans, kmeans++, Principal Component Analysis
Languages: Mandarin (fluent), English (fluent), Taiwanese (fluent)