UVA Data Science Institute MSDS student Abhimanyu Roy ('18) presented a talk at the 2018 Tom Tom Applied Machine Learning Conference in Charlottesville, Va. His presentation highlights how data science can be used to predict results in sporting events.
Learn more about Abhimanyu at https://dsi.virginia.edu/people/abhimanyu-roy.
How to Beat the House: Predicting Football Results with Hyperparameter Optimization
1. How to Beat the House
Predicting Football Results with Hyperparameter Optimization
Abhimanyu Roy, Data Science Institute
2. “There are no shortcuts to building a team each season. You build the foundation brick by brick.”
- Bill Belichick
TIME SERIES!
Min et al. (2008) used a rule-based approach (NBC) combined with an in-game time series component to predict results from the 2002 soccer World Cup with 70% accuracy
~2014 - Enter Deep Learning
It takes time to build a winning team
3. Neural Networks
Some input variables (30 years of NFL, NCAA, CFL):
1. Number of games played in the season before the observation under consideration
2. Number of wins and losses in the season
3. Win/loss ratio against the opponent
4. Number of wins in the last 5 games
5. Number of players from NFL Fantasy rankings in team
6. Number of top 10 players from NFL Fantasy rankings in team
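As a rough illustration of how the first four inputs above might be derived from a team's past results, here is a minimal Python sketch. The `Game` record, the `build_features` helper, the team labels and the sample history are all hypothetical and do not come from the talk. The sketch reads "games played in the season before the observation" as games already played in the current season, and it takes the win/loss ratio as wins divided by losses; both are assumptions. The two Fantasy-ranking inputs are left out because they need external ranking data.

```python
from dataclasses import dataclass


@dataclass
class Game:
    season: int
    opponent: str
    won: bool


def build_features(history, opponent, season):
    """Summarise games played before the game being predicted.

    `history` is assumed to be in chronological order and to contain
    only games that finished before the observation.
    """
    this_season = [g for g in history if g.season == season]
    wins = sum(g.won for g in this_season)
    losses = len(this_season) - wins

    versus = [g for g in history if g.opponent == opponent]
    versus_wins = sum(g.won for g in versus)
    versus_losses = len(versus) - versus_wins

    return {
        "games_played": len(this_season),
        "wins": wins,
        "losses": losses,
        # Assumption: ratio = wins / losses; with no losses, use the win count.
        "win_loss_ratio_vs_opponent": (
            versus_wins / versus_losses if versus_losses else float(versus_wins)
        ),
        "wins_last_5": sum(g.won for g in history[-5:]),
    }


# Hypothetical history for one team, oldest game first.
history = [
    Game(2016, "A", True),
    Game(2017, "B", False),
    Game(2017, "A", True),
    Game(2017, "C", True),
    Game(2017, "A", False),
    Game(2017, "B", True),
    Game(2017, "C", True),
]
features = build_features(history, opponent="A", season=2017)
print(features)
```

Each dictionary would become one input row for the network, paired with the outcome of the game being predicted.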
4. Recurrent Neural Networks
• Type of neural network where connections between units form a directed graph along a sequence
• Can model the behavior of a time sequence
• RNNs can use their internal state to remember or forget sequences of inputs
• Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), generic RNN cell (Simple RNN): same concept; the only difference is cell design
• Hyperparameter tuning to get the best parameters
• Activation function: defines the output of a node given an input or set of inputs
• Recurrent activation function: defines the output of a node when receiving input from nodes in the same hidden layer
• Learning rate: size of each weight adjustment during training
• Loss function: cost of inaccurate prediction
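The ideas on this slide can be sketched in a few lines of plain Python. This is a toy, not the model from the talk: a one-unit simple RNN carries an internal state across time steps, a mean-squared-error loss scores it, and an exhaustive grid search tries each combination of activation function and learning rate. The function names, the grid values, the finite-difference training loop and the synthetic data are all assumptions made for illustration. A real project would more likely use a deep learning library with LSTM or GRU cells.

```python
import itertools
import math

ACTIVATIONS = {
    "tanh": math.tanh,
    # Logistic sigmoid written via tanh so it cannot overflow.
    "sigmoid": lambda z: 0.5 * (1.0 + math.tanh(z / 2.0)),
}


def rnn_predict(params, sequence, activation):
    """Run a one-unit simple RNN over a sequence; return the final state."""
    w_in, w_rec, bias = params
    act = ACTIVATIONS[activation]
    state = 0.0  # internal state carried from one time step to the next
    for x in sequence:
        state = act(w_in * x + w_rec * state + bias)
    return state


def mse_loss(params, data, activation):
    """Loss function: mean squared error between prediction and label."""
    errors = [(rnn_predict(params, seq, activation) - y) ** 2 for seq, y in data]
    return sum(errors) / len(errors)


def train(data, activation, learning_rate, epochs=200, eps=1e-5):
    """Gradient descent using finite-difference gradient estimates."""
    params = [0.1, 0.1, 0.0]
    for _ in range(epochs):
        base = mse_loss(params, data, activation)
        grads = []
        for i in range(len(params)):
            bumped = list(params)
            bumped[i] += eps
            grads.append((mse_loss(bumped, data, activation) - base) / eps)
        # The learning rate scales how far each weight moves per step.
        params = [p - learning_rate * g for p, g in zip(params, grads)]
    return params


def grid_search(train_data, valid_data, grid):
    """Try every hyperparameter combination; keep the lowest validation loss."""
    results = []
    for activation, learning_rate in itertools.product(
        grid["activation"], grid["learning_rate"]
    ):
        params = train(train_data, activation, learning_rate)
        loss = mse_loss(params, valid_data, activation)
        results.append((loss, activation, learning_rate))
    return min(results), results


# Synthetic toy data (an assumption, not from the talk): every 5-game
# win/loss pattern, labelled 1.0 when at least 3 of the 5 games were wins.
patterns = list(itertools.product([0.0, 1.0], repeat=5))
data = [(seq, 1.0 if sum(seq) >= 3 else 0.0) for seq in patterns]
train_data, valid_data = data[0::2], data[1::2]

grid = {"activation": ["tanh", "sigmoid"], "learning_rate": [0.05, 0.5]}
best, results = grid_search(train_data, valid_data, grid)
print("best (validation loss, activation, learning rate):", best)
```

Grid search is only one tuning strategy. It is used here because it is the simplest to read, and the slides do not say which search method the talk relied on.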