1. The document discusses future directions for software engineering research, including tools to support "citizen scientists" and proposed services for next-generation data repositories.
2. It suggests that data mining tools could provide more services beyond data repositories, such as supporting verification, compression, privacy, and streaming of data.
3. The talk outlines several topics, including software tools for citizen scientists, issues around decision software, and lessons learned regarding certification envelopes, goals, locality, and the need for repair and verification tools.
GALE: Geometric active learning for Search-Based Software Engineering CS, NcState
Multi-objective evolutionary algorithms (MOEAs) help software engineers find novel solutions to complex problems. When automatic tools explore too many options, they are slow to use and hard to comprehend. GALE is a near-linear time MOEA that builds a piecewise approximation to the surface of best solutions along the Pareto frontier. For each piece, GALE mutates solutions towards the better end. In numerous case studies, GALE finds comparable solutions to standard methods (NSGA-II, SPEA2) using far fewer evaluations (e.g. 20 evaluations, not 1,000). GALE is recommended when a model is expensive to evaluate, or when some audience needs to browse and understand how an MOEA has made its conclusions.
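To make the term "Pareto frontier" concrete, here is a minimal Python sketch of the non-dominated filter that multi-objective optimizers rely on. This is not GALE itself: the pairwise comparison below is quadratic in the number of candidates, whereas the abstract describes GALE as near-linear. The function names and the sample candidates are hypothetical, and both objectives are assumed to be minimized.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep only the points that no other point dominates (all objectives minimized)."""
    return [p for p in points if not any(dominates(q, p) for q in points if q is not p)]


# Hypothetical candidates scored on two objectives, e.g. (cost, defects).
candidates = [(1.0, 9.0), (2.0, 7.0), (3.0, 8.0), (4.0, 3.0), (5.0, 4.0), (6.0, 1.0)]
front = pareto_front(candidates)
print(front)
```

Here (3.0, 8.0) is dropped because (2.0, 7.0) beats it on both objectives, and (5.0, 4.0) is dropped because (4.0, 3.0) does. The surviving points are the trade-offs an optimizer would present to the user.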
Three Laws of Trusted Data Sharing: (Building a Better Business Case for Dat... CS, NcState
Discussions about sharing
- Too much fear
- Not enough about benefits
Can we learn more from sharing than hoarding?
- Yes (results from SE)
Three laws of trusted data sharing:
- For SE quality prediction..
- Better models from shared privatized data than from all raw data
Q: does this work for other kinds of data?
A: don’t know… yet
Agile Data Science is a lean methodology adapted from Agile Software Development. At its core, it centers on people, interactions, and building minimum viable products that ship fast and often to solicit customer feedback. In this presentation, I describe how this work was done in the past, with examples. Get started today with our help by visiting http://www.alpinenow.com
Is Agile Data Science just two buzzwords put together? I argue that agile is a very practical and applicable methodology that works well in the real world for all sorts of Analytics and Data Science workflows.
http://theinnovationenterprise.com/summits/digital-web-analytics-summit-london-2015/schedule
Data Science in the Real World: Making a Difference Srinath Perera
We use the terms “Big Data” and “Data Science” for the use of data processing to make sense of the world around us. Spanning many fields, Big Data brings together technologies like Distributed Systems, Machine Learning, Statistics, and the Internet of Things. It is a multi-billion-dollar industry with use cases like targeted advertising, fraud detection, product recommendations, and market surveys. With new technologies like the Internet of Things (IoT), these use cases are expanding to scenarios like Smart Cities, Smart Health, and Smart Agriculture.
These use cases rely on basic analytics, advanced statistical methods, and predictive technologies like Machine Learning. However, it is not just about crunching the data. Some use cases, like Urban Planning, are slow-moving, so there is enough time to process the data. With use cases like traffic, patient monitoring, and surveillance, however, the value of results degrades much faster with time, and results are needed within milliseconds to seconds. Collecting data from many sources, cleaning it up, processing it on computation clusters, and doing all of this fast is a major challenge.
This talk will discuss the motivation behind big data and data science and how they can make a difference. Then it will discuss the challenges, systems, and methodologies for implementing and sustaining a data science pipeline.
Uncertainty Quantification in Complex Physical Systems. (An Introduction) Ogechi Onuoha
An introduction to the focus of my research. I presented this to the members of the Pipeline research group, University of Lagos, Nigeria. I will be making subsequent presentations as well as paper reviews on the same topic.
Deep Learning Use Cases - Data Science Pop-up Seattle Domino Data Lab
Companies like Google, Microsoft, Amazon and Facebook are in fierce competition for teams that can build deep-learning applications. Because of deep learning's general usefulness in pattern recognition, those applications are surprisingly diverse, ranging from image recognition to machine translation. This talk will explore deep learning use cases for the major data types -- image, sound, text and time series -- as they're emerging in the private sector. Presented by Chris Nicholson, Co-Founder and CEO at Skymind.
machine learning in the age of big data: new approaches and business applicat... Armando Vieira
Presentation at University of Lisbon on Machine Learning and big data.
Deep learning algorithms and applications to credit risk analysis, churn detection and recommendation algorithms
Curious about Data Science? Self-taught on some aspects, but missing the big picture? Well, you’ve got to start somewhere and this session is the place to do it.
This session will cover, at a layman’s level, some of the basic concepts of Data Science. In a conversational format, we will discuss: What are the differences between Big Data and Data Science – and why aren’t they the same thing? What distinguishes descriptive, predictive, and prescriptive analytics? What purpose do predictive models serve in a practical context? What kinds of models are there and what do they tell us? What is the difference between supervised and unsupervised learning? What are some common pitfalls that turn good ideas into bad science?
During this session, attendees will learn the difference between k-nearest neighbor and k-means clustering, understand why we normalize data and avoid overfitting, and grasp the meaning of No Free Lunch.
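The contrast between k-nearest neighbor and k-means can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not material from the session: the function names, the toy points, the labels, and the starting centroids are all hypothetical. k-nearest neighbor is supervised (it needs labels and predicts one for a new point), while k-means is unsupervised (it sees only the points and looks for group centres).

```python
from collections import Counter
from math import dist


def knn_predict(train, labels, query, k=3):
    """Supervised: majority vote among the k labelled points closest to the query."""
    nearest = sorted(range(len(train)), key=lambda i: dist(train[i], query))[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]


def kmeans(points, centroids, iterations=10):
    """Unsupervised: alternate between assigning points and moving centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            closest = min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            clusters[closest].append(p)
        # Move each centroid to the mean of its cluster; keep it in place if the cluster is empty.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids


points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
labels = ["low", "low", "low", "high", "high", "high"]

predicted = knn_predict(points, labels, (2.0, 2.0))              # uses the labels
centres = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])  # never sees the labels
print(predicted, centres)
```

Both functions measure Euclidean distance, which is one reason normalization matters: a feature on a much larger scale would dominate the distance in either method.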
This talk presents areas of investigation underway at the Rensselaer Institute for Data Exploration and Applications. First presented at Flipkart, Bangalore India, 3/2015.
Data Science Popup Austin: Conflict in Growing Data Science Organizations Domino Data Lab
Watch talk ➟ http://bit.ly/1NKPpQh
Eduardo Arino De La Rubia, VP of Product and Data Scientist in residence at Domino Data Lab talks about how to manage conflict in growing data science teams.
These are slides for a guest talk I gave for course 15.S14: Global Business of Artificial Intelligence and Robotics (GBAIR) taught in Spring 2017. Here is the YouTube video (filmed in 360/VR): https://youtu.be/s3MuSOl1Rog
Data Science Popup Austin: Back to The Future for Data and Analytics Domino Data Lab
Big data and analytics: companies everywhere are talking about them, but what are they really delivering? JD Stanley will share his perspective, exploring how emerging tools can reduce inefficiencies and administration in upfront or post algorithm iterative improvement processes. In this session, expect to hear about practical methods of investigative analysis to drive discoveries of attributes and characteristics of the disparate data to achieve quicker business or science efficiencies. JD will highlight machine learning approaches to deriving the propensity of a connected data landscape (or “propensity of connectedness” through co-occurrence data scoring) and mixing statistics and modeling into a composition approach to improve the upfront part of analysis and analytics.
In this talk I review some of the early visions of the Semantic Web, some of the different views, and I follow through on a thread of how Semantic Web technology has been adopted in search engines (and other companies). I end with a challenge to the research community to keep pursuing this research, rather than letting industry take over the "low end" and keep new work from flourishing.
Towards Mining Software Repositories Research that Matters Tao Xie
Towards Mining Software Repositories Research that Matters. Talk slides at Next Generation of Mining Software Repositories '14 (Pre-FSE 2014 Event), Nov 15–16. HKUST, Hong Kong http://ng2014.msrworld.org/
Andres Rodriguez at AI Frontiers: Catalyzing Deep Learning's Impact in the En... AI Frontiers
Intel Nervana has built a competitive deep learning platform to make it easy for data scientists to start from the iterative, investigatory phase and take models all the way to deployment. Nervana’s platform is designed for speed and scale, and serves as a catalyst for all types of organizations to benefit from the full potential of deep learning. Examples of supported applications include, but are not limited to, automotive speech interfaces, image search, language translation, agricultural robotics and genomics, financial document summarization, and finding anomalies in IoT data.
Presented 2015-08-24 at SF Bay ACM, held at the eBay south campus in San Jose.
http://meetup.com/SF-Bay-ACM/events/221693508/
Project Jupyter https://jupyter.org/ evolved from IPython notebooks, and now supports a wide variety of programming language back-ends. Notebooks have proven to be effective tools in Data Science, providing convenient packages for what Don Knuth called "literate programming" in the 1980s: code plus exposition in markdown. Results of running the code appear in-line as interactive graphics -- all packaged as collaborative, web-based documents. Some have said that the introduction of cloud-based notebooks is nearly as fundamental a change in software practice as the introduction of spreadsheets.
O'Reilly Media has been considering the question, "What comes after books and video?" Or, as one might imagine more pointedly, what comes after Kindle? To that end we have collaborated with Project Jupyter to integrate notebooks into our content management process, allowing authors to generate articles, tutorials, reports, and other media products as notebooks that also incorporate video segments. Code dependencies are containerized using Docker, and all of the content gets managed in Git repositories. We have added another layer, an open source project called Thebe that provides a kind of "media player" for embedding the containerized notebooks into web pages.
How to Become a Data Scientist
SF Data Science Meetup, June 30, 2014
Video of this talk is available here: https://www.youtube.com/watch?v=c52IOlnPw08
More information at: http://www.zipfianacademy.com
Zipfian Academy @ Crowdflower
A Web 2.0 Personal Learning Environment for Classical Chinese Poetry Ralf Klamma
A Web 2.0 Personal Learning Environment for Classical Chinese Poetry
Yiwei Cao 曹怡蔚, Ralf Klamma, Yan Gao 高岩, Rynson W.H. Lau 劉永雄, and Matthias Jarke
Informatik 5, RWTH Aachen University
Department of Computer Science, City University of Hong Kong
Aachen, Germany
ICWL 2009
20.08.2009
Reliability is concerned with decreasing faults and their impact. The earlier the faults are detected the better. That's why this presentation talks about automated techniques using machine learning to detect faults as early as possible.
Concepts, use cases and principles to build big data systems (1) Trieu Nguyen
1) Introduction to the key Big Data concepts
1.1 The Origins of Big Data
1.2 What is Big Data?
1.3 Why is Big Data So Important?
1.4 How Is Big Data Used In Practice?
2) Introduction to the key principles of Big Data Systems
2.1 How to design Data Pipeline in 6 steps
2.2 Using Lambda Architecture for big data processing
3) Practical case study: Chatbot with Video Recommendation Engine
4) FAQ for students
An introduction to streaming data: the difference between batch processing and stream processing, research issues in streaming data processing, performance evaluation metrics, and tools for stream processing.
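As a rough illustration of the batch versus stream distinction, the Python sketch below computes the same statistic both ways. The function names and the sample readings are hypothetical and not taken from the presentation. The batch version needs the complete data set before it starts. The streaming version updates its answer as each event arrives and keeps only constant state.

```python
from typing import Iterable, Iterator, List


def batch_mean(values: List[float]) -> float:
    """Batch: the whole data set is available before processing starts."""
    return sum(values) / len(values)


def streaming_mean(stream: Iterable[float]) -> Iterator[float]:
    """Stream: emit an updated mean per event, storing only a count and the current mean."""
    count, mean = 0, 0.0
    for value in stream:
        count += 1
        mean += (value - mean) / count  # incremental update, no history kept
        yield mean


readings = [4.0, 8.0, 6.0, 2.0]
running = list(streaming_mean(readings))
print(running, batch_mean(readings))
```

The last streamed value agrees with the batch result, but the streaming version had a usable answer after every event, which is what matters when results lose value within seconds.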
Data Engineer's Lunch #85: Designing a Modern Data Stack Anant Corporation
What are the design considerations that go into architecting a modern data warehouse? This presentation will cover some of the requirements analysis, design decisions, and execution challenges of building a modern data lake/data warehouse.
Data Science at Scale - The DevOps Approach Mihai Criveti
DevOps Practices for Data Scientists and Engineers
1 Data Science Landscape
2 Process and Flow
3 The Data
4 Data Science Toolkit
5 Cloud Computing Solutions
6 The rise of DevOps
7 Reusable Assets and Practices
8 Skills Development
DevOps for Data Engineers - Automate Your Data Science Pipeline with Ansible,... Mihai Criveti
Automate your Data Science pipeline with Ansible, Python and Kubernetes - ODSC Talk
What is Data Science and the Data Science Landscape
Process and Flow
Understanding Data
The Data Science Toolkit
The Big Data Challenge
Cloud Computing Solutions
The rise of DevOps in Data Science
Automate your data pipeline with Ansible
Speaker: Philippe Mizrahi - Associate Product Manager - Lyft
Abstract: Philippe Mizrahi works on Lyft’s data discovery and metadata engine, Amundsen. With the help of a Neo4j graph database, Amundsen has improved Lyft’s data discovery by reducing time to discover data by 10x.
During this session, Philippe will dive deep into Amundsen’s use cases, impact, and architecture, which effectively combines a comprehensive knowledge graph based upon Neo4j, centralized metadata and other search ranking optimizations to discover data quickly.
Talk on Data Discovery and Metadata by Mark Grover from July 2019.
Goes into detail of the problem, build/buy/adopt analysis and Lyft's solution - Amundsen, along with thoughts on the future.
Doing Analytics Right - Building the Analytics Environment Tasktop
Implementing analytics for development processes is challenging. As discussed in the previous webinars, the right analytics are determined by the goals of the organization, not by the available data. So implementing your analytics solutions will require an efficient analytics and data architecture, including the ability to combine and stage data from heterogeneous sources. An architecture that excludes the ability to gain access to the necessary data will create a barrier to deploying your newly designed analytics program, and will force you back into the “light is brighter here” anti-pattern.
This webinar will describe the technical considerations of implementing the data architecture for your analytics program, and explain how Tasktop can help.
Fixing data science & Accelerating Artificial Super Intelligence Development ManojKumarR41
This presentation discusses Challenges, Problems, Issues, Measures, Mistakes, Opportunities, Ideas, Technologies, Research and Visions around Data Science
HashGraph, Data Mesh, Data Trajectories, Citrix HDX and Anonos BigPrivacy
A combination of these five and a few other ideas will ultimately lead us to the VGB Platform. Another document will follow soon, explaining the vision and how exactly to work on it to gradually develop this platform, which fixes data science efforts globally.
Easy Analytics on AWS with Amazon Redshift, Amazon QuickSight, and Amazon Mac... Amazon Web Services
AWS has a large and growing portfolio of big data management and analytics services, designed to be integrated into solution architectures that meet the needs of your business. In this session, we look at analytics through the eyes of a business intelligence analyst, a data scientist, and an application developer, and we explore how to quickly leverage Amazon Redshift, Amazon QuickSight, RStudio, and Amazon Machine Learning to create powerful, yet straightforward, business solutions.
172529main ken and_tim_software_assurance_research_at_west_virginia CS, NcState
SA @ WV (software assurance research at West Virginia)
Kenneth McGill
NASA IV&V Facility Research Lead
304.367.8300
Kenneth.McGill@ivv.nasa.gov
Dr. Tim Menzies Ph.D. (WVU)
Software Engineering Research Chair
tim@menzies.us
Next Generation “Treatment Learning” (finding the diamonds in the dust) CS, NcState
Q: How have dummies (like me) managed to gain (some) control over a (seemingly) complex world?
A: The world is simpler than we think.
◆ Models contain clumps
◆ A few collar variables decide which clumps to use.
ICSE’14 Workshop Keynote Address: Emerging Trends in Software Metrics (WeTSOM’14).
Data about software projects is not stored in metric1, metric2, …,
but is shared between them in some shared, underlying shape.
Not every project has the same underlying simple shape; many projects have different,
albeit simple, shapes.
We can exploit that shape to great effect: for better local predictions; for transferring
lessons learned; for privacy-preserving data mining.
In the age of Big Data, what role for Software Engineers? CS, NcState
ABSTRACT:
Consider the premise of Big Data:
better conclusions = same algorithms + more data + more cpu
If this were always true, then there would be no role for human analysts
that reflected over the domain to offer insights that produce better solutions
(since all such insight is now automatically generated from the CPUs).
This talk proposes a marriage of sorts between Big Data and software
engineering. It reviews over a decade of work by the author in exploring
user goals using CPU-intensive methods. It will be shown that analyst-insight was
useful for building “better” tools (where “better” means generating
more succinct recommendations, running faster, and scaling to much larger problems).
The conclusion will be that in the age of big data, human analysis is still
useful and necessary. But a new kind of software engineering analyst is required: one
that knows how to take full advantage of the power of Big Data.
ABOUT THE AUTHOR:
Tim Menzies (Ph.D., UNSW) is a Professor in CS at WVU; the author of
over 230 refereed publications; and is one of the 50 most cited
authors in software engineering (out of 50,000+ researchers, see
http://goo.gl/wqpQl). At WVU, he has been a lead researcher on
projects for NSF, NIJ, DoD, NASA, USDA, as well as joint research work
with private companies. He teaches data mining and artificial
intelligence and programming languages.
Prof. Menzies is the co-founder of the PROMISE conference series
devoted to reproducible experiments in software engineering (see
http://promisedata.googlecode.com). He is an associate editor of IEEE
Transactions on Software Engineering, Empirical Software Engineering
and the Automated Software Engineering Journal. In 2012, he served as
co-chair of the program committee for the IEEE Automated Software
Engineering conference. In 2015, he will serve as co-chair for the
ICSE'15 NIER track. For more information, see his web site
http://menzies.us or his vita at http://goo.gl/8eNhY or his list of
pubs at http://goo.gl/0SWJ2p.
Scalable Product Line Configuration:
A Straw to Break the Camel’s Back
Abdel Salam Sayyad
Joseph Ingram
Tim Menzies
Hany Ammar
IEEE Automated SE,
Palo Alto, CA
Nov 2013
Class Level Fault Prediction using Software Clustering
for
IEEE ASE 2013
by
Giuseppe Scanniello (1) Carmine Gravino (2) Andrian Marcus (3) Tim Menzies (4)
from
1 University of Basilicata, Italy
2 University of Salerno, Italy
3 Wayne State University, USA
4 West Virginia University, USA
On why computer science is DANGEROUS and why we should FORBID our children to study it just in case they become EVIL GENIUSES and try to TAKE OVER THE WORLD.
Warning: includes designs for building hydrogen bombs.
Courier management system project report.pdf Kamal Acharya
Nowadays it is very important for people to send and receive articles such as imported furniture, electronic items, gifts, and business goods. People depend heavily on transport systems that mostly receive and deliver articles manually. There is no way to track articles until they are received, and no way to let customers know what happened in transit once they have booked a shipment. In such a situation, we need a system that completely computerizes cargo activities, including regular tracking of the articles sent. The Courier Management System fulfills this need: it is online software that enables cargo management staff to receive goods from a source, send them to the required destination, and track their status over time.
Cosmetic shop management system project report.pdf Kamal Acharya
Buying new cosmetic products is difficult. It can even be scary for those who have sensitive skin and are prone to skin trouble. The information needed to alleviate this problem is on the back of each product, but it's tough to interpret those ingredient lists unless you have a background in chemistry.
Instead of buying and hoping for the best, we can use data science to help us predict which products may be good fits for us. It includes various function programs to do the above mentioned tasks.
Data file handling has been effectively used in the program.
The automated cosmetic shop management system should automate the shop's general workflow and administration. The main processes of the system focus on customer requests: the system searches for the most appropriate products and delivers them to the customers. It should help employees quickly identify cosmetic products that have reached the minimum quantity, and keep track of the expiry date of each product. It should also help employees find the rack number in which a product is placed. It is also a faster and more efficient way of working.
Saudi Arabia stands as a titan in the global energy landscape, renowned for its abundant oil and gas resources. It's the largest exporter of petroleum and holds some of the world's most significant reserves. Let's delve into the top 10 oil and gas projects shaping Saudi Arabia's energy future in 2024.
Democratizing Fuzzing at Scale by Abhishek Arya abh.arya
Presented at NUS: Fuzzing and Software Security Summer School 2024
This keynote talks about the democratization of fuzzing at scale, highlighting the collaboration between open source communities, academia, and industry to advance the field of fuzzing. It delves into the history of fuzzing, the development of scalable fuzzing platforms, and the empowerment of community-driven research. The talk will further discuss recent advancements leveraging AI/ML and offer insights into the future evolution of the fuzzing landscape.
COLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdfKamal Acharya
The College Bus Management system is completely developed by Visual Basic .NET Version. The application is connect with most secured database language MS SQL Server. The application is develop by using best combination of front-end and back-end languages. The application is totally design like flat user interface. This flat user interface is more attractive user interface in 2017. The application is gives more important to the system functionality. The application is to manage the student’s details, driver’s details, bus details, bus route details, bus fees details and more. The application has only one unit for admin. The admin can manage the entire application. The admin can login into the application by using username and password of the admin. The application is develop for big and small colleges. It is more user friendly for non-computer person. Even they can easily learn how to manage the application within hours. The application is more secure by the admin. The system will give an effective output for the VB.Net and SQL Server given as input to the system. The compiled java program given as input to the system, after scanning the program will generate different reports. The application generates the report for users. The admin can view and download the report of the data. The application deliver the excel format reports. Because, excel formatted reports is very easy to understand the income and expense of the college bus. This application is mainly develop for windows operating system users. In 2017, 73% of people enterprises are using windows operating system. So the application will easily install for all the windows operating system users. The application-developed size is very low. The application consumes very low space in disk. Therefore, the user can allocate very minimum local disk space for this application.
Forklift Classes Overview by Intella PartsIntella Parts
Discover the different forklift classes and their specific applications. Learn how to choose the right forklift for your needs to ensure safety, efficiency, and compliance in your operations.
For more technical information, visit our website https://intellaparts.com
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...Amil Baba Dawood bangali
Contact with Dawood Bhai Just call on +92322-6382012 and we'll help you. We'll solve all your problems within 12 to 24 hours and with 101% guarantee and with astrology systematic. If you want to take any personal or professional advice then also you can call us on +92322-6382012 , ONLINE LOVE PROBLEM & Other all types of Daily Life Problem's.Then CALL or WHATSAPP us on +92322-6382012 and Get all these problems solutions here by Amil Baba DAWOOD BANGALI
#vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore#blackmagicformarriage #aamilbaba #kalajadu #kalailam #taweez #wazifaexpert #jadumantar #vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore #blackmagicforlove #blackmagicformarriage #aamilbaba #kalajadu #kalailam #taweez #wazifaexpert #jadumantar #vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore #Amilbabainuk #amilbabainspain #amilbabaindubai #Amilbabainnorway #amilbabainkrachi #amilbabainlahore #amilbabaingujranwalan #amilbabainislamabad
Student information management system project report ii.pdfKamal Acharya
Our project explains about the student management. This project mainly explains the various actions related to student details. This project shows some ease in adding, editing and deleting the student details. It also provides a less time consuming process for viewing, adding, editing and deleting the marks of the students.
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdffxintegritypublishin
Advancements in technology unveil a myriad of electrical and electronic breakthroughs geared towards efficiently harnessing limited resources to meet human energy demands. The optimization of hybrid solar PV panels and pumped hydro energy supply systems plays a pivotal role in utilizing natural resources effectively. This initiative not only benefits humanity but also fosters environmental sustainability. The study investigated the design optimization of these hybrid systems, focusing on understanding solar radiation patterns, identifying geographical influences on solar radiation, formulating a mathematical model for system optimization, and determining the optimal configuration of PV panels and pumped hydro storage. Through a comparative analysis approach and eight weeks of data collection, the study addressed key research questions related to solar radiation patterns and optimal system design. The findings highlighted regions with heightened solar radiation levels, showcasing substantial potential for power generation and emphasizing the system's efficiency. Optimizing system design significantly boosted power generation, promoted renewable energy utilization, and enhanced energy storage capacity. The study underscored the benefits of optimizing hybrid solar PV panels and pumped hydro energy supply systems for sustainable energy usage. Optimizing the design of solar PV panels and pumped hydro energy supply systems as examined across diverse climatic conditions in a developing country, not only enhances power generation but also improves the integration of renewable energy sources and boosts energy storage capacities, particularly beneficial for less economically prosperous regions. Additionally, the study provides valuable insights for advancing energy research in economically viable areas. 
Recommendations included conducting site-specific assessments, utilizing advanced modeling tools, implementing regular maintenance protocols, and enhancing communication among system components.
Quality defects in TMT Bars, Possible causes and Potential Solutions.PrashantGoswami42
Maintaining high-quality standards in the production of TMT bars is crucial for ensuring structural integrity in construction. Addressing common defects through careful monitoring, standardized processes, and advanced technology can significantly improve the quality of TMT bars. Continuous training and adherence to quality control measures will also play a pivotal role in minimizing these defects.
Immunizing Image Classifiers Against Localized Adversary Attacksgerogepatton
This paper addresses the vulnerability of deep learning models, particularly convolutional neural networks
(CNN)s, to adversarial attacks and presents a proactive training technique designed to counter them. We
introduce a novel volumization algorithm, which transforms 2D images into 3D volumetric representations.
When combined with 3D convolution and deep curriculum learning optimization (CLO), itsignificantly improves
the immunity of models against localized universal attacks by up to 40%. We evaluate our proposed approach
using contemporary CNN architectures and the modified Canadian Institute for Advanced Research (CIFAR-10
and CIFAR-100) and ImageNet Large Scale Visual Recognition Challenge (ILSVRC12) datasets, showcasing
accuracy improvements over previous techniques. The results indicate that the combination of the volumetric
input and curriculum learning holds significant promise for mitigating adversarial attacks without necessitating
adversary training.
2. ai4se.net
slides= tiny.cc/se15
Data mining tools should, and can, do much more
• Operating systems do more than just schedule processes:
– Editors
– Compilers
– File systems
– Network connections
– Memory management
– Etc.
• What services should be standard in data mining tools?
4. Not in this talk: not what everyone else is talking about
• Principles for designing case studies
• Visualizations
• Data mining
• Big Data
• Qualitative methods
See parts 1+2
6.
1. Software tools for “citizen scientists”
2. Beyond mere data repositories
3. What happens when decision software goes wrong?
4. Proposed services for nextgen repositories
5. The Future?
8. Software tools for “citizen scientists”
• Science has escaped the lab
– roaming free in the world.
• When every citizen can be a scientist (making generalizations from data)
– then it should be possible to audit those conclusions.
• We want to mistrust the conclusions of citizen scientists
– just as we mistrust and evaluate, review, explore, evolve the conclusions of any other scientist.
9. Software mediates what we see and how we act in the world
1. Silicon Valley developers view every new feature as an experiment, to be tested within some mash-up.
2. Chemists win Nobel Prize for software sims http://goo.gl/Lwensc
3. Engineers use software for optical tweezers, radiation therapy, remote sensing, chip design http://goo.gl/qBMyIZ
4. Web analysts use software to analyze clickstreams to improve sales and marketing strategies http://goo.gl/b26CfY
5. Stock traders write software to simulate trading strategies http://www.quantopian.com
6. Analysts write software to mine labor statistics data to review proposed gov policies http://goo.gl/X4kgnc
7. Journalists use software to analyze economic data and make visualizations of their news stories http://fivethirtyeight.com
8. In London or New York, ambulances wait for your call at a location determined by a software model http://goo.gl/8SMd1p
9. Etc.
10. Important to understand how software can divide us
See also “Facebook emotion study breached ethical guidelines, researchers say”, June 30, 2014, The Guardian http://goo.gl/gTRkmp
12. Better SE = better data science = better science
• A data scientist isa engineer
– Delivering, under constraints, to acceptable quality standards
• A data scientist isa software developer
– Complex scripts, test-driven development, version control
• A data scientist isa requirements engineer
– Understanding, navigating, and trading off between user goals
• A data scientist isa agile programmer
– Uses feedback from writing and running code, and from query results, to constantly revise goals and code
Data scientist isa software engineer
14. #storeYourData
• URL: http://openscience.us/repo
• Data from 100s of projects
• E.g. EUSE: 250,000K+ spreadsheets
• E.g. Softgoals: 150+ softgoal models
• Oldest continuous repository of SE data (2004)
16. To design those tools, ask:
1. What problems are seen when people try to share data and conclusions?
2. What minimal data structures address those problems?
Let’s talk tools
18. Models have “certification envelopes”
• Columbia ice strike
– Size: 1200 m2
– Speed: 477 mph (relative to vehicle)
• Certified as “safe” by the CRATER micro-meteorite model.
– An experiment in CRATER’s DB:
• Size: 3 cm3
• Speed: under 100 mph
• Columbia, and its crew, were lost on re-entry
• Lesson: conclusions should come with a “certification envelope”
– If new tests fall outside the envelope of the training set, raise an alert
Bad things happen when you stretch the envelope
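One way to picture a certification envelope is as the per-column range of the training data, with an alert raised whenever a new test leaves that range. The sketch below is only an illustration under that assumption: the function names and the sample rows are hypothetical, and a real envelope could be far richer than a bounding box.

```python
def build_envelope(rows):
    """Return one (min, max) pair per column of the training rows."""
    return [(min(col), max(col)) for col in zip(*rows)]

def outside_envelope(envelope, example):
    """Return the indexes of the columns where `example` leaves the training range."""
    return [i for i, ((lo, hi), value) in enumerate(zip(envelope, example))
            if value < lo or value > hi]

# Hypothetical training rows (size, speed); the values are illustrative only.
training = [(1.0, 40.0), (2.0, 75.0), (3.0, 95.0)]
envelope = build_envelope(training)

inside = outside_envelope(envelope, (2.5, 60.0))       # within both ranges
stretched = outside_envelope(envelope, (1200.0, 477.0))  # far beyond both ranges
if stretched:
    print("ALERT: outside certification envelope on columns", stretched)
```

An empty list means the model is being used inside the region it was tested on; anything else is a reason to distrust the conclusion.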
19. Goals matter
• Learners work this way
– Users want it that way
• Waste of time learning models users do not want
– Better to tune learning methods to goals of users
• Enter search-based software engineering
– Multi-goal optimization
Learners learn for X, users want Y
20. Locality matters (what is true there may not be true here)
• Devanbu et al. ASE’11: Ecological Inference
• Bettenburg et al. MSR’12: Think local, act global
• Menzies et al. TSE’13: Local versus Global learning
• Yang et al. IST’13: Handling local bias
• Minku et al. ICSE’14: Best Use of Cross-Company Data
[Chart: error (less is better), using ensemble data vs. using local data]
Not general models, but general methods for local models
21. Sharing matters
• How was the error found so fast?
– Open science
Given enough eyes, all bugs are shallow
When (2013): What
Mar 15: “Better cross-company learning” accepted to MSR’13
Mar 29: Camera-ready submitted
?Apr 10: Pre-prints go on-line
Apr 29: Hyeongmin Jeon, graduate student at Pusan Natl. Univ., emailed us: can’t reproduce result
May 4: Fayola Peters, checking code, found error. Manic week of experiments follows
May 11: We conclude results definitely wrong
May 12: Email MSR organizers. Our penalty? Present paper and its error.
22. Compression and privacy matter
• Facebook, Google, Netflix etc.
• A small X% of all users are subjects in continual experiments: testing new features
• Data from studies, retained indefinitely, warehoused
– Problems with volume (needs compression)
– Problems with confidentiality (needs privacy)
• If I want to challenge the conclusions made by Facebook, Google, Netflix, etc.
– I need to be able to access, privately, that data
– (needs trusted sharing)
Squeezing and secrets
23. Lessons learned
• Certification envelopes (when not to trust conclusions)
• Goals matter (not everything is “classification”)
• Locality matters (when their conclusions do not hold for you)
• Need “streaming tools” (continually stream over a never-ending sequence of new data)
• Need repair tools (to fix broken ideas)
• Verification matters (sooner or later, we all screw up)
• Need to transfer data (get by with a little help from your friends)
• Need compression tools (to save space)
• Need privacy tools (so you can share)
What matters?
25. Digression: WHERE: O(N) top-down divisive clustering
• Fast: works on an approximation to eigenvectors (the FASTMAP heuristic), Faloutsos [1995]. An O(N) generation of an axis of large variability:
• Pick any point X;
• Find E = East = furthest from X;
• Find W = West = furthest from East.
• East, West = “the poles”
• All points have distances a, b to (E, W)
• c = dist(W, E)
• x = (a^2 + c^2 − b^2) / (2c)
• Find median(x), recurse on each half
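The steps above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: it assumes numeric rows and Euclidean distance, and the names `project`, `where` and `leaves`, as well as the `min_size` stopping rule, are hypothetical.

```python
import math
import random

def dist(p, q):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def project(points, rng):
    """Find the poles, then place every point on the East-West axis."""
    anyone = rng.choice(points)
    east = max(points, key=lambda p: dist(p, anyone))
    west = max(points, key=lambda p: dist(p, east))
    c = dist(east, west)
    if c == 0:                       # all points identical: no axis to project on
        return east, west, None
    xs = []
    for p in points:
        a, b = dist(p, east), dist(p, west)
        xs.append((a * a + c * c - b * b) / (2 * c))   # cosine rule
    return east, west, xs

def where(points, min_size, rng):
    """Split at median(x) and recurse until leaves hold at most min_size points."""
    if len(points) <= min_size:
        return {"leaf": points}
    east, west, xs = project(points, rng)
    if xs is None:
        return {"leaf": points}
    order = sorted(range(len(points)), key=lambda i: xs[i])
    mid = len(order) // 2
    halves = [[points[i] for i in order[:mid]], [points[i] for i in order[mid:]]]
    return {"east": east, "west": west,
            "kids": [where(half, min_size, rng) for half in halves]}

def leaves(tree):
    """Collect the leaf clusters of a WHERE tree."""
    if "leaf" in tree:
        return [tree["leaf"]]
    return [leaf for kid in tree["kids"] for leaf in leaves(kid)]

rng = random.Random(1)
data = [(rng.random(), rng.random()) for _ in range(64)]
tree = where(data, min_size=8, rng=rng)
print(len(leaves(tree)), "leaves")
```

Each level of the recursion needs only two passes over the data to find the poles and one pass to project onto them, which is where the speed comes from.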
26. WHERE approximates data as multiple linear models (drawn in eigenspace)
Platt 2005: FASTMAP = Nyström algorithm = approximations to PCA.
Combines similar influences, ignores irrelevancies, outliers
27. Hold that thought
Underlying data structure to much of my current thinking
• If we cluster to leaves of size sqrt(n),
• we only need 2*sqrt(n) − 1 nodes, each with 2 poles
• So 4*sqrt(n) − 2 examples
• Which we can reduce, later (see optimization)
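The arithmetic on this slide can be checked directly. The sketch below assumes a balanced binary tree and an n that is a perfect square; the function name `where_budget` is hypothetical.

```python
import math

def where_budget(n):
    """Leaves, nodes and pole examples for a balanced binary tree with leaves of size sqrt(n)."""
    leaves = math.isqrt(n)      # n / sqrt(n) = sqrt(n) leaves
    nodes = 2 * leaves - 1      # a binary tree with L leaves has 2L - 1 nodes
    poles = 2 * nodes           # two poles per node = 4*sqrt(n) - 2 examples
    return leaves, nodes, poles

print(where_budget(10_000))     # (100, 199, 398)
```

So 10,000 rows can be summarized by fewer than 400 pole examples, before any further reduction.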
28. Is WHERE a multi-objective optimization algorithm?
Mutate towards useful “end”?
Now can reason about combinations of user goals?
Krall (WVU), Menzies et al. TSE 2015, GALE. Orders of magnitude faster than standard optimizers. Just as effective
• Evolutionary optimizers = select, crossover, mutate, repeat
• Select:
• Evaluate each pole as you descend the tree
• Cull the half leading to the worst pole
• Crossover, mutate:
• In the surviving leaves, mutate examples towards the best pole
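The select-and-mutate loop above can be sketched as follows. This is a simplified illustration, not GALE itself: it assumes a single score where lower is better, Euclidean distance, and a hypothetical `nudge` fraction for the mutation. The function name `gale_step` is also an assumption.

```python
import math
import random

def dist(p, q):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def gale_step(population, score, min_size, rng, nudge=0.25):
    """Cull by poles while descending, then push survivors toward the best pole.
    Returns (new_population, number_of_evaluations)."""
    items, evals, best = list(population), 0, None
    while len(items) > min_size:
        anyone = rng.choice(items)
        east = max(items, key=lambda p: dist(p, anyone))
        west = max(items, key=lambda p: dist(p, east))
        c = dist(east, west)
        if c == 0:
            break
        evals += 2                          # only the two poles are evaluated
        if score(west) < score(east):
            east, west = west, east         # make east the better pole

        def along(p, east=east, west=west, c=c):
            a, b = dist(p, east), dist(p, west)
            return (a * a + c * c - b * b) / (2 * c)

        items.sort(key=along)
        items = items[: len(items) // 2]    # cull the half nearest the worse pole
        best = east
    if best is None:
        return items, evals
    mutants = [tuple(v + nudge * (g - v) for v, g in zip(p, best)) for p in items]
    return mutants, evals

# Hypothetical toy problem: minimize the first coordinate.
population = [(float(i), 0.0) for i in range(64)]
survivors, evals = gale_step(population, score=lambda p: p[0],
                             min_size=8, rng=random.Random(1))
print(len(survivors), "survivors after", evals, "evaluations")
```

The point of the sketch is the evaluation count: descending from 64 candidates to 8 costs two evaluations per split, not one per candidate.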
30. Is WHERE a compression algorithm?
Use it for the certification envelope?
Ship models with a summary of their training data?
• Call each leaf one “class”
• Run a decision tree learner to find a model for the “classes”
• Anything lost for (e.g.) prediction?
Vasil Papakroni, WVU masters thesis, 2012: prediction using WHERE’s clusters works just as well as other standard methods (for software effort and defect estimation)
31. Can WHERE support locality?
Deliver specialized lessons for different problems?
• Build one model per cluster using your learner du jour
• O(log(N)) indexing of new data to old models
• Push test data down the tree
Butcher, Menzies et al. Local vs Global. TSE’13. Local models have better medians and less variance
32. Is WHERE a tool for privacy?
• Hide the individuals, preserve the shape of the data
• Don’t share all the data, just the poles.
• 100% privacy on data not in poles
• Don’t share the poles exactly:
• Mutate them slightly, by no more than half the axis length
• Predictions in reduced space work as well as in raw data space
Peters, Menzies, TSE’13, Balancing privacy and utility
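A minimal sketch of the pole mutation follows. The slide only gives the bound (no more than half the axis length); the random direction and uniform step size used here are assumptions made for illustration, and `privatize_poles` is a hypothetical name.

```python
import math
import random

def dist(p, q):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def jitter(pole, limit, rng):
    """Move `pole` in a random direction by a random distance below `limit`."""
    direction = [rng.gauss(0, 1) for _ in pole]
    norm = math.sqrt(sum(d * d for d in direction)) or 1.0
    step = rng.random() * limit
    return tuple(v + step * d / norm for v, d in zip(pole, direction))

def privatize_poles(east, west, rng):
    """Return mutated copies of both poles, each moved by less than half the axis length."""
    limit = dist(east, west) / 2
    return jitter(east, limit, rng), jitter(west, limit, rng)

rng = random.Random(0)
east, west = (0.0, 0.0), (10.0, 0.0)
shared_east, shared_west = privatize_poles(east, west, rng)
print(shared_east, shared_west)
```

Only the mutated poles would leave the data owner's site; every other row stays private.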
33. Is WHERE an anomaly detector?
• WHERE’s trees are an O(log(N)) time index to the leaves
• Test data is “alien” if, after falling to its nearest leaf, it is outside of the poles
Peters, Menzies, ICSE’15, LACE2
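The “outside of the poles” test can be sketched with the same cosine-rule projection WHERE uses for clustering. This is one plausible reading of the slide, not the published LACE2 test: it assumes Euclidean distance and treats a point as alien when its projection falls before East or beyond West. The name `is_alien` is hypothetical.

```python
import math

def dist(p, q):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def is_alien(point, east, west):
    """True if the point projects outside the segment between the two poles."""
    c = dist(east, west)
    if c == 0:
        return point != east
    a, b = dist(point, east), dist(point, west)
    x = (a * a + c * c - b * b) / (2 * c)   # position along the East-West axis
    return x < 0 or x > c

east, west = (0.0, 0.0), (10.0, 0.0)
print(is_alien((5.0, 1.0), east, west))    # between the poles
print(is_alien((12.0, 0.0), east, west))   # beyond West
```

In a full tree the test would be applied at the leaf the point falls into, using that leaf's poles.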
34. WHERE and “the sharing trick”
• Community of N data owners
• Pass around a cache in random order
• Owner “I” just adds anomalous data
• Then privatized as per above
• Cache size: < 5%
• Models learned from cache as good or better than from all raw data
Peters, Menzies, ICSE’15, LACE2
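The cache-passing protocol can be sketched as below. The anomaly test here is a stand-in (a row is anomalous when it is further than a hypothetical `threshold` from everything already cached), and `privatize` defaults to the identity; neither is claimed to match the published method.

```python
import math
import random

def dist(p, q):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def far_from_cache(row, cache, threshold):
    """Stand-in anomaly test: nothing in the cache is within `threshold` of the row."""
    return min(dist(row, old) for old in cache) > threshold

def share(owners, threshold, rng, privatize=lambda row: row):
    """Visit owners in random order; each adds only the rows the cache finds surprising."""
    order = list(owners)
    rng.shuffle(order)
    cache = []
    for rows in order:
        for row in rows:
            if not cache or far_from_cache(row, cache, threshold):
                cache.append(privatize(row))   # privatize before the row leaves its owner
    return cache

# Two hypothetical owners whose data overlaps heavily.
owners = [
    [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)],
    [(0.0, 0.1), (5.0, 5.1), (9.0, 9.0)],
]
cache = share(owners, threshold=1.0, rng=random.Random(3))
print(len(cache), "of", sum(len(rows) for rows in owners), "rows shared")
```

Because each owner contributes only what the cache has not already seen, the cache stays small even as more owners join.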
35. Is WHERE a pollution marking tool?
(here thar be dragons, best not go thar)
• Mark as polluted all sub-trees with more than X% anomalies
• When making conclusions, stay away from the polluted sub-trees
Kocaguneli, Menzies et al, Analogy Estimation, TSE12
36. Is WHERE an incremental learner?
(i.e. data mining for streams)
• Build models per sub-tree, using your learner du jour
• In all sub-trees, keep a sample of data plus any anomalies
• When there are too many pollution markers, recluster just that sub-tree
• Dianne Gordon-Spears (2002): such hierarchical incremental repair is 10,000 times faster than global reorganizations
38. Lessons learned
• Certification envelopes (when not to trust conclusions)
• Goals matter (not everything is “classification”)
• Locality matters (when their conclusions do not hold for you)
• Need “streaming tools” (continually stream over a never-ending sequence of new data)
• Need repair tools (to fix broken ideas)
• Verification matters (sooner or later, we all screw up)
• Need compression tools (to save space)
• Need privacy tools (so you can share)
What matters?
40. Confucius: “Study the past if you would define the future.”
• History of SE
– X is not part of SE
– People are having trouble with X
– Experiments: Extend SE to include X
– Conclusion: “you know what? SE tool support makes X easier”
41. Confucius: “Study the past if you would define the future.”
• Future of SE
– Software mediates what we see and how we act in the world
– Everyone with software is now a scientist
– Software supports communities as they judge conclusions
42. To find the future, extrapolate the past
• Future of SE
– Software mediates how everyone sees and acts on the world
– Everyone with software is now a scientist
– Software supports communities as they judge conclusions
44. Software engineering researchers just studying software is like astronomers just studying telescopes.
• After we grind the lenses, we should look through the scope.
• After we build the software, we should see how people are using it.
46. About me
• Full Prof in CS, NC State. Teaches SE and automated SE.
• Researches synergies of human+AI, with a focus on data mining for SE.
• Assoc. editor for IEEE Transactions on SE, Empirical SE, the Automated SE Journal, Software Quality Journal.
• Was co-PC-chair for ASE’12, ICSE’15 NIER track.
• Will be co-general chair of ICSME’16.
• Author of 230+ refereed pubs.
• One of the 100 most cited authors in SE (of 80,000; http://goo.gl/BnFJs).
• PI for NSF, NIJ, DoD, NASA, USDA, and research work with private companies.
• Co-founder of the PROMISE conference series on reproducible experiments in SE.
• Current curator of the PROMISE web site of SE research data, http://openscience.us/repo
• Vita: http://goo.gl/8eNhYM
• Pubs: https://goo.gl/qNQAIq
• Home page: http://menzies.us
50.
• ECL: a higher-level set-based language (more succinct)
• But if you can write it quick,
– you can write it wrong, quick.
• Implications for
– markets, ambulances, government policies, homeland security, toasters, air safety, Nobel prizes, web-company advertising policies, do we take the family to Cairo for a holiday, etc.
Note: not necessarily solved by higher-level languages
51. Recall the words of Dr. Amy Farrah Fowler, Ph.D.
Sheldon: a grand unified theory, insofar as it explains everything, will ipso facto explain neurobiology.
Amy: Yes, but if I’m successful… I will be able to map and reproduce your thought processes in deriving a grand unified theory, and therefore, subsume your conclusions under my paradigm.
Apologies to fans of the BBT: this conversation occurred in JPL, cafeteria, not Amy’s flat
53. WHERE = fast analog for PCA
(so WHERE is a heuristic spectral learner)
Spectral learners: work on eigenvectors
• combine related influences
• ignore outliers and irrelevancies
55. Transfer matters (and is possible)
B. Turhan, T. Menzies, A. Bener, J. Di Stefano. 2009. On the relative value of cross-company and within-company data for defect prediction. Empirical Softw. Eng. 14(5), 2009.
When not enough local data, ask your friends
57. If it works, try to make it better
• “The following is my valiant attempt to capture the difference (between PROMISE and MSR)”
• “To misquote George Box, I hope my model is more useful than it is wrong:
– For the most part, the MSR community was mostly concerned with the initial collection of data sets from software projects.
– Meanwhile, the PROMISE community emphasized the analysis of the data after it was collected.”
• “The PROMISE people routinely posted all their data on a public repository
– their new papers would re-analyze old data, in an attempt to improve that analysis.
– In fact, I used to joke ‘PROMISE. Australian for repeatability’ (apologies to the Fosters Brewing company).”
Dr. Prem Devanbu, UC Davis, General chair, MSR’14
The PROMISE Project
58. Our summary. And other related books
Perspectives on Data Science for Software Engineering (Tim Menzies, Laurie Williams, Thomas Zimmermann)
2014, 2015, 2016
The PROMISE Project
The MSR community and others