Eric Fitzgerald has over 10 years of experience in data science and physics research. He currently works as a Data Scientist for Backcountry.com, where he has developed models and tools to improve customer retention, attribution of marketing spend, and analytics reporting. Previously he held research positions at Brandeis University, MIT, and the University of Oregon, conducting experimental particle physics and publishing his PhD work.
This is my class project, which uses the UCI Mashable dataset to determine what makes news popular. In this project I used (1) multiple regression and model building, and (2) PCA and factor analysis.
Data Analytics Tools: SAS and R
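The two techniques named above can be sketched compactly. The project itself used SAS and R, so the NumPy version below is only an illustrative stand-in, with synthetic data in place of the actual Mashable features.

```python
import numpy as np

rng = np.random.default_rng(0)
# synthetic stand-in for the Mashable features (n articles x p features)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, 0.5, 0.0, -0.3, 2.0]) + rng.normal(scale=0.1, size=200)

# --- multiple regression via ordinary least squares ---
X1 = np.column_stack([np.ones(len(X)), X])   # prepend an intercept column
beta, *_ = np.linalg.lstsq(X1, y, rcond=None)

# --- PCA via SVD of the centered design matrix ---
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / (s**2).sum()              # variance explained per component
```

In a popularity study like this one, the fitted coefficients indicate which features move the share count, while the explained-variance ratios suggest how many latent factors the feature set really contains.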
Jean-Claude Bradley presents "Accelerating Discovery by Sharing: a case for Open Notebook Science" at the National Breast Cancer Coalition Annual Advocacy Conference in Arlington, VA on May 1, 2011.
Jean-Claude Bradley presented at a panel on New Forms of Scholarly Communication in Science at the Special Libraries Association meeting on June 15, 2011. The talk covered the role of trust in science, with a focus on the validation of melting point data. Where the literature was unable to reconcile measurements, Open Notebook Science was used to clarify. The collection of an Open Dataset of melting point measurements for 20,000 compounds was described as well as ongoing curation efforts and corresponding web services. (collaborators Andrew Lang and Antony Williams)
Slides from a journal club on "Meta-Prod2Vec - Product Embeddings Using Side-Information for Recommendation" by F. Vasile et al. from Criteo, presented at RecSys 2016.
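As background for the slides: the paper's core idea, regularizing product embeddings with item metadata, amounts to adding (product, side-information) pairs to the usual skip-gram pairs drawn from browsing sessions. A minimal sketch of that pair generation (function and variable names are mine, not from the paper):

```python
def training_pairs(session, side_info, window=1):
    """Skip-gram training pairs for one browsing session, augmented with
    (product, metadata) pairs in the spirit of Meta-Prod2Vec."""
    pairs = []
    for i, product in enumerate(session):
        # standard Prod2Vec: products co-occurring within the context window
        lo, hi = max(0, i - window), min(len(session), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((product, session[j]))
        # Meta-Prod2Vec addition: pair each product with its side information
        for meta in side_info.get(product, []):
            pairs.append((product, meta))
            pairs.append((meta, product))
    return pairs
```

The resulting pairs would then be fed to an ordinary word2vec-style trainer; the metadata pairs act as a regularizer that pulls items sharing a brand or category closer together.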
Fast top k path-based relevance query on massive graphs (ieeechennai)
Supporting Change Impact Analysis Using a Recommendation System - An Industrial Case Study (Markus Borg)
Journal first presentation at ICSE'17 in Buenos Aires, Argentina.
M. Borg, K. Wnuk, B. Regnell, and P. Runeson. Supporting Change Impact Analysis Using a Recommendation System: An Industrial Case Study in a Safety-Critical Context, IEEE Transactions on Software Engineering, 43(6), pp. 675-700, 2017.
Advancing Foundation and Practice of Software Analytics (Tao Xie)
Vision Statement Presentation on "Advancing Foundation & Practice of Software Analytics" at the 2nd International NSF sponsored Workshop on Realizing Artificial Intelligence Synergies in Software Engineering (RAISE 2013) http://promisedata.org/raise/2013/
Curate Science - Transparency Labels for Science, U of Bordeaux 2019 talk (Etienne LeBel)
Curate Science - Transparency Labels for Science: invited talk given at the Université de Bordeaux workshop "Rethinking Robustness and Reliability in Research: Facing the Reproducibility Crisis" (March 29, 2019). I review the crucial need for, and benefits of, a crowdsourcing platform to label and link the transparency and replication of empirical research.
(Jaume Sala). The initial definition of this project consisted of three questions: How can the city administration connect and combine its own datasets within the existing IT structure to enable multidimensional analysis? How can we (the government of Schiedam) combine these datasets with datasets from several stakeholders? And finally, what kind of new information can become available? The objectives of the project were the following: implement a tool for the visual representation of georeferenced datasets, analyze the possibility of combining multiple datasets in the same graphical representation, and propose a new organization of datasets related to smart-city indicators and geospatial data.
HOBBIT presentation at Apache Big Data Europe 2016.
This work was supported by grants from the EU H2020 Framework Programme provided for the project HOBBIT (GA no. 688227).
A summary of the workshop as presented at the 1st International Workshop on Benchmarking Linked Data (BLINK).
(HOBBIT project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 688227.)
An overview of the workshop as presented at the 1st International Workshop on Benchmarking Linked Data (BLINK).
(HOBBIT project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 688227.)
In this deck from the HPC User Forum, Rick Stevens from Argonne presents: AI for Science.
"Artificial Intelligence (AI) is making strides in transforming how we live. From the tech industry embracing AI as the most important technology for the 21st century to governments around the world growing efforts in AI, initiatives are rapidly emerging in the space. In sync with these emerging initiatives including U.S. Department of Energy efforts, Argonne has launched an “AI for Science” initiative aimed at accelerating the development and adoption of AI approaches in scientific and engineering domains with the goal to accelerate research and development breakthroughs in energy, basic science, medicine, and national security, especially where we have significant volumes of data and relatively less developed theory. AI methods allow us to discover patterns in data that can lead to experimental hypotheses and thus link data driven methods to new experiments and new understanding."
Watch the video: https://wp.me/p3RLHQ-kQi
Learn more: https://www.anl.gov/topic/science-technology/artificial-intelligence and http://hpcuserforum.com
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
AHM 2014: Enterprise Architecture for Transformative Research and Collaborati... (EarthCube)
Ilya Zaslavsky, David Valentine, Amarnath Gupta, Stephen Richard, Tanu Malik
Presentation given in the afternoon Architecture Forum Session on Day 1, June 24 at the EarthCube All-Hands Meeting
Stacked Generalization of Random Forest and Decision Tree Techniques for Libr... (IJEACS)
The amount of library data stored in modern research and statistics centers grows daily. As these databases grow exponentially over time, it becomes exceptionally difficult to understand the behavior of the data and interpret the relationships that exist between attributes. This exponential growth poses new organizational challenges: conventional record-management infrastructure can no longer provide precise and detailed information about the behavior of data over time. Selecting tools that can support and handle multi-dimensional big-data visualization remains a source of confusion and a new concern. Viewing all related data in a database at once is a problem that has attracted the interest of data professionals with machine learning skills, and it lingers in the industry because existing techniques cannot filter noise from relevant data and pad missing values to produce the required information. The aim is to develop a stacked generalization model that combines the functionality of random forest and decision tree techniques to visualize a library database. In this paper, random forest and decision tree techniques were employed to effectively visualize large amounts of school library data. The proposed system was implemented with a few lines of Python code to create visualizations that help users understand and interpret, at a glance, the behavior of data and its relationships. The model was trained and tested to learn and extract hidden patterns in the data with a cross-validation test, and the combined stacked generalization model performed better than the individual techniques.
The stacked model achieved a 95% success rate, as did the RF (95% accuracy, RMSE 0.2236), compared with the DT, which recorded an 80.00% success rate and an RMSE of 0.1599.
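The stacking recipe the abstract describes can be illustrated in a few lines of scikit-learn. The dataset below is synthetic (the library data itself is not included here), and the choice of logistic regression as the level-1 combiner is my assumption, not a detail from the paper.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.metrics import accuracy_score

# synthetic stand-in for the school library dataset
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

base = [DecisionTreeClassifier(random_state=0),
        RandomForestClassifier(n_estimators=100, random_state=0)]

# level 0: out-of-fold predictions become meta-features (avoids leakage)
meta_tr = np.column_stack([cross_val_predict(m, X_tr, y_tr, cv=5) for m in base])
for m in base:
    m.fit(X_tr, y_tr)
meta_te = np.column_stack([m.predict(X_te) for m in base])

# level 1: a simple combiner trained on the base models' predictions
stacker = LogisticRegression().fit(meta_tr, y_tr)
acc = accuracy_score(y_te, stacker.predict(meta_te))
```

Using out-of-fold predictions for the meta-features is the standard guard against the level-1 model merely memorizing the base models' training-set fit.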
High Performance Data Analytics and a Java Grande Run Time (Geoffrey Fox)
There is perhaps a broad consensus as to important issues in practical parallel computing as applied to large scale simulations; this is reflected in supercomputer architectures, algorithms, libraries, languages, compilers and best practice for application development.
However, the same is not so true for data-intensive computing, even though commercial clouds devote many more resources to data analytics than supercomputers devote to simulations.
Here we use a sample of over 50 big data applications to identify characteristics of data intensive applications and to deduce needed runtime and architectures.
We propose a big data version of the famous Berkeley dwarfs and NAS parallel benchmarks.
Our analysis builds on the Apache software stack that is well used in modern cloud computing.
We give some examples including clustering, deep-learning and multi-dimensional scaling.
One suggestion from this work is the value of a high-performance Java (Grande) runtime that supports both simulations and big data.
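Of the examples mentioned, multi-dimensional scaling is compact enough to sketch. This is the classical (Torgerson) formulation in NumPy, shown only to make the technique concrete; it is not the authors' parallel implementation.

```python
import numpy as np

def classical_mds(D, k=2):
    """Classical (Torgerson) MDS: embed n points in k dimensions
    from an n x n matrix of pairwise Euclidean distances D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * J @ (D ** 2) @ J           # double-centered Gram matrix
    w, V = np.linalg.eigh(B)              # eigh: B is symmetric
    order = np.argsort(w)[::-1][:k]       # keep the top-k eigenpairs
    return V[:, order] * np.sqrt(np.maximum(w[order], 0.0))
```

For exact Euclidean distances this recovers the original configuration up to rotation and translation; at the scales discussed in the talk, the eigendecomposition is the part that has to be parallelized.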
Eric Fitzgerald
(503) 819-5893
eric.av.fitzgerald@gmail.com
3474 S 2000 E
Salt Lake City, UT 84109
Summary
Seasoned Data Scientist in the Marketing Department of Backcountry.com, a large online retailer. Cross-team collaborator who improves spending efficiency and key performance metrics by applying advanced statistical techniques to analyze customer behavior. PhD in experimental high energy physics earned searching for new particles at CERN, the world's leading high-energy physics facility. Effective communicator across departments and stakeholders to encourage buy-in and results. Designed and wrote dozens of projects and thousands of lines of analysis code in C++/Python with regular version control. Quick and avid learner of new tools and techniques, whether breaking new ground or optimizing existing projects.
Skills
Programming: Python (SciPy/NumPy), SQL (Oracle), C++ (ROOT), Shell Scripting
Software: Spark, R, Excel, Git, Adobe Analytics (Omniture), OS X, Linux/Unix
Analytics: Statistics, Modeling, Machine Learning, Predictive Analytics, Data Mining
Professional Experience
Data Scientist, Marketing, 6/2015 – Present
Statistical Analyst, Marketing, 9/2014 – 6/2015
Backcountry.com
• Created and implemented unique efficiency algorithms using Bayesian statistics, decision theory, and linear algebra in an analytic hierarchy framework for the Search Engine Marketing (SEM) team to increase customer retention. This helped drive an additional 12% increase above expected growth in the customer base, and a 6% increase in new customer retention (re-order) rates.
• Developed a non-linear customer-level model with per-click behavior to attribute revenue and marketing spend across channels beyond last-click attribution, saving thousands of dollars over third-party consultants.
• Built reporting and analytics tools for the Retention and Acquisition teams, harmonizing myriad data sources and outputting directly actionable numbers for peak retail holiday traffic, resulting in the best holiday peak and overall quarter in Backcountry.com history.
• Collaborated with the SEM and Business Intelligence (database) teams to build a new set of reporting tables in the Oracle EDW, assisting with both engineering inputs and resulting insights.
• Tested collaborative filtering recommendation algorithms for on-site and email personalization projects, with >200% increase in click-through-rate and 10% increase in revenue over the third-party vendor. Continued to refine the features and segmentation of these models.
Research Experience
Research Assistant, Brandeis University, Department of Physics, 6/2011 – 2/2014
• Modeled the large expected backgrounds in the LHC environment using the ROOT software with a multivariate regression to fit over control regions and extrapolate to poorly understood regions. Isolated key features and variables to improve performance of selected events; my recommendations led to a 15% gain in selection efficiency and up to an 8% gain in measurement precision.
• Produced the final data files and Monte Carlo pseudo-data for the analysis team, with appropriate error measurement and propagation. Set world-leading 95% confidence level limits on several hypotheses of Beyond the Standard Model physics. Created the final visualization plots for presentation.
• Developed and managed the muon channel of the flagship analysis of the Exotics group in the ATLAS collaboration. Presented to the relevant working groups and collaboration meetings and incorporated their feedback, with the results approved for publication and collaboration use. PhD work published in a peer-reviewed journal (Phys. Rev. D 90, 052005; arXiv: 1405.4123).
Research Fellow/Assistant, MIT, Department of Physics, 9/2004 – 5/2011
• Calculated energy level shifts in a diatomic molecule due to the Casimir Effect.
• Computed quantum field theory corrections to spin operators in bound states.
• Numerically estimated solutions of supersymmetric gauged linear sigma models.
Undergraduate Researcher, University of Oregon, Department of Physics, 1/2002 – 6/2004
• Designed and prototyped readout circuitry for a silicon wafer detector.
• Constructed and calibrated new detector test stand.
Education
Doctor of Philosophy, Experimental Particle Physics, February 2014
Brandeis University, Waltham, MA 02453
Dissertation: A Search for New Physics in the Dilepton Channel with the ATLAS Detector at the LHC
Master of Science, Theoretical Physics, May 2011
Massachusetts Institute of Technology, Cambridge, MA 02139
Thesis: A Quantum Top in a Casimir-Induced Quadrupole Field
Bachelor of Science, Physics & Mathematics, June 2003
University of Oregon, Eugene, OR 97403
Cum Laude, Departmental Honors in Physics
DeCou Prize for Outstanding Graduating Senior in Mathematics
Thesis: Symmetry Breaking in Scalar QED