A presentation by UVA Data Science Institute MSDS 2019 students Sakshi Jawarani, Murugesan Ramakrishnan, and Varshini Sriram, advised by MSDS Program Director and professor Rafael Alvarado, at the 2019 Tom Tom Applied Machine Learning Conference in Charlottesville, VA.
Connecting citizens with public data to drive policy change (Melissa Moody)
UVA Data Science Institute Master of Science in Data Science researchers Lucas Beane and Elena Gillis undertook a capstone project to investigate possible reasons for the stagnation of the Charlottesville Open Data Portal.
Data Collection Methods for Building a Free Response Training Simulation (Melissa Moody)
Master of Science in Data Science capstone project researchers Vaibhav Sharma, Beni Shpringer, and Michael Yang, along with UVA School of Engineering M.S. student Martin Bolger and Ph.D. students Sodiq Adewole and Erfaneh Gharavi, sought to develop new methods for collecting, generating, and labeling data to aid in the creation of educational, free-input dialogue simulations.
Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ... (Melissa Moody)
Researchers Navin Kasa, Andrew Dahbura, and Charishma Ravoori undertook a capstone project—part of the UVA Data Science Institute Master of Science in Data Science program—that addresses credit card fraud detection through a semi-supervised approach, in which clusters of account profiles are created and used for modeling classifiers.
Deep Learning Meets Biology: How Does a Protein Helix Know Where to Start and... (Melissa Moody)
UVA Data Science Institute Master of Science in Data Science students Sean Mullane, Ruoyan Chen and Sri Vaishnavi Vemulapalli were motivated to apply data science tools and techniques to the problem, and see if protein structures can be quantitatively described, compared and otherwise analyzed in a more robust, efficient and automated manner. Potential applications include more effectively designed drugs to inhibit disease-related proteins, or even newly engineered ones.
The researchers received the award for Best Paper in the Data Science for Health category at the 2019 Systems and Information Engineering Design Symposium (SIEDS). Their project, "Machine Learning for Classification of Protein Helix Capping Motifs," focused on small segments of a protein called secondary structural elements. These structural elements are the basic molecular-scale building blocks that all proteins, and therefore life, build upon.
Automatic detection of online abuse and analysis of problematic users in wiki... (Melissa Moody)
For their 2019 capstone project, DSI Master of Science in Data Science students Charu Rawat, Arnab Sarkar, and Sameer Singh proposed a framework to understand and detect online abuse in the English Wikipedia community.
Rawat, Sarkar, and Singh received the award for Best Paper in the Data Science for Society category at the 2019 Systems and Information Engineering Design Symposium (SIEDS). In "Automatic Detection of Online Abuse and Analysis of Problematic Users in Wikipedia," the team presented an analysis of user misconduct in Wikipedia and a system for the automated early detection of inappropriate behavior.
Plans for the University of Virginia School of Data Science (Melissa Moody)
The University of Virginia, through the largest gift in the University’s history, has the opportunity to play a national and international leadership role in data science training, research, and service by expanding the already successful Data Science Institute (DSI) to become a School of Data Science (SDS). When the idea was first presented to then President-elect James Ryan, he pointed out that a gift alone does not make a school. Particular concerns were sustainability and the impact on other schools of the University. Throughout 2018 and early 2019, we crafted a proposal for the SDS that is financially and academically sustainable and that works in concert with all schools to enrich every student’s experience at a time when our society is increasingly data-driven.
A presentation by UVA Data Science Institute MSDS 2019 students Charu Rawat, Arnab Sarkar, and Sameer Singh, advised by DSI professor Raf Alvarado and researcher Lane Rasberry, at the 2019 Tom Tom Applied Machine Learning Conference in Charlottesville, VA.
Balanced Datasets Are Not Enough: Estimating and Mitigating Gender Bias in De... (Melissa Moody)
A presentation by UVA Data Science Institute 2019-20 Presidential Fellow in Data Science Tianlu Wang, at the 2019 Tom Tom Applied Machine Learning Conference in Charlottesville, VA. Learn more at datascience.virginia.edu.
Ethical Principles for the All Data Revolution (Melissa Moody)
A presentation by Stephanie Shipp, from the Research Highlights session at the 2019 Women in Data Science Charlottesville Conference. Hosted by the UVA Data Science Institute.
Assessing the reproducibility of DNA microarray studies (Melissa Moody)
A presentation by Eva Lancaster, from the Research Highlights session at the 2019 Women in Data Science Charlottesville Conference. Hosted by the UVA Data Science Institute.
Modeling the Impact of R & Python Packages: Dependency and Contributor Networks (Melissa Moody)
A presentation by Gizem Korkmaz, from the Research Highlights session at the 2019 Women in Data Science Charlottesville Conference. Hosted by the UVA Data Science Institute.
How to Beat the House: Predicting Football Results with Hyperparameter Optimi... (Melissa Moody)
UVA Data Science Institute MSDS student Abhimanyu Roy ('18) presented a talk at the 2018 Tom Tom Applied Machine Learning Conference in Charlottesville, Va. His presentation highlights how data science can be used to predict results in sporting events.
Learn more about Abhimanyu at https://dsi.virginia.edu/people/abhimanyu-roy.
A Modified K-Means Clustering Approach to Redrawing US Congressional Districts (Melissa Moody)
UVA Data Science Institute MSDS student Jack Prominski ('18) presented a talk at the 2018 Tom Tom Applied Machine Learning Conference in Charlottesville, Va. His talk highlights how data science can create a more equitable redistricting process.
Learn more about Jack at https://dsi.virginia.edu/people/jack-prominski.
Joining Separate Paradigms: Text Mining & Deep Neural Networks to Character... (Melissa Moody)
UVA Data Science Institute MSDS students Caitlin Dreisbach ('18), Morgan Wall ('18), and Ali Zaidi ('18) presented a talk based on their capstone research project, part of the MSDS program, at the 2018 Tom Tom Applied Machine Learning Conference in Charlottesville, Va.
Learn more about the project at https://dsi.virginia.edu/projects/connecting-mind-and-body.
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues.
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
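The automated data quality checks described under point 4 can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Rule` class, the `validate` function, and the column names (`campaign_id`, `clicks`, `ctr`) do not come from the presentation, and a production pipeline would more likely rely on a dedicated validation tool.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Rule:
    """A hypothetical per-column quality rule (illustrative only)."""
    column: str
    description: str
    check: Callable[[Any], bool]


def validate(rows, rules):
    """Return (row_index, column, description) for every failed rule.

    A missing column counts as a failure for each rule on that column.
    """
    failures = []
    for i, row in enumerate(rows):
        for rule in rules:
            value = row.get(rule.column)
            if value is None or not rule.check(value):
                failures.append((i, rule.column, rule.description))
    return failures


# Hypothetical rules for an ad-campaign table.
RULES = [
    Rule("campaign_id", "must be a non-empty string",
         lambda v: isinstance(v, str) and v.strip() != ""),
    Rule("clicks", "must be a non-negative integer",
         lambda v: isinstance(v, int) and v >= 0),
    Rule("ctr", "must be a number between 0 and 1",
         lambda v: isinstance(v, (int, float)) and 0 <= v <= 1),
]

# Made-up sample rows: the first is clean, the others contain errors.
sample = [
    {"campaign_id": "A1", "clicks": 120, "ctr": 0.04},
    {"campaign_id": "", "clicks": -5, "ctr": 0.5},
    {"campaign_id": "B2", "clicks": 10},
]

failures = validate(sample, RULES)
for row_index, column, description in failures:
    print(f"row {row_index}: {column} {description}")
```

Running checks like these where data enters the system, rather than in downstream reports, is what lets errors be caught "at the source"; the failure list could also be logged to support lineage tracking.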
Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. For more details, visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
Opendatabay - Open Data Marketplace (Opendatabay)
Opendatabay.com unlocks the power of data for everyone. Open Data Marketplace fosters a collaborative hub for data enthusiasts to explore, share, and contribute to a vast collection of datasets.
Opendatabay is the first-ever open hub for data enthusiasts to collaborate and innovate: a platform to explore, share, and contribute to a vast collection of datasets. Through robust quality control and innovative technologies like blockchain verification, Opendatabay ensures the authenticity and reliability of datasets, empowering users to make data-driven decisions with confidence. Leverage cutting-edge AI technologies to enhance the data exploration, analysis, and discovery experience.
From intelligent search and recommendations to automated data productisation and quotation, Opendatabay's AI-driven features streamline the data workflow. Finding the data you need shouldn't be complex. Opendatabay simplifies the data acquisition process with an intuitive interface and robust search tools. Effortlessly explore, discover, and access the data you need, allowing you to focus on extracting valuable insights. Opendatabay breaks new ground with dedicated, AI-generated synthetic datasets.
Leverage these privacy-preserving datasets for training and testing AI models without compromising sensitive information. Opendatabay prioritizes transparency by providing detailed metadata, provenance information, and usage guidelines for each dataset, ensuring users have a comprehensive understanding of the data they're working with. By leveraging a powerful combination of distributed ledger technology and rigorous third-party audits, Opendatabay ensures the authenticity and reliability of every dataset. Security is at the core of Opendatabay. The marketplace implements stringent security measures, including encryption, access controls, and regular vulnerability assessments, to safeguard your data and protect your privacy.
Collective Biographies of Women: A Deep Learning Approach to Paragraph Annotation
1. Collective Biographies of Women: A Deep Learning Approach to Paragraph Annotation
Members: Murugesan Ramakrishnan, Sakshi Jawarani, Varshini Sriram
Advisor: Prof. Rafael Alvarado
Sponsored by: Institute for Advanced Technology in the Humanities
17. Conclusion
● Clustering reduced the cardinality of the classes dramatically and helped obtain better results.
● We strongly recommend these modified categories for further paragraph annotation for the CBW project.
● Our results will be used to further annotate 13,000 chapters of women's biographies.