This document discusses mining software engineering data to improve software reliability. It presents the CAR-Miner approach, which mines sequence association rules from exception handling code to detect defects. CAR-Miner constructs exception flow graphs, generates static traces of normal and exception execution paths, and mines the traces to find sequence association rules specifying common exception handling patterns. It can check if applications violate the mined rules to find exception handling defects.
Improving Software Reliability via Mining Software Engineering Data
1. Improving Software Reliability via Mining Software Engineering Data
Tao Xie
Department of Computer Science, North Carolina State University, Raleigh, USA
http://www.csc.ncsu.edu/faculty/xie
Joint work with Suresh Thummalapenta
2. Mining Software Engineering Data
MAIN GOAL
Transform static record-keeping SE data to active data
Make SE data actionable by uncovering hidden patterns and trends
[Figure: SE data sources: Mailings, Bugzilla, Code repository, Execution traces, CVS]
3. Mining Software Engineering Data
[Figure: data mining techniques link software engineering data to software engineering tasks]
software engineering data: code bases, change history, program states, structural entities, bug reports/nl, …
software engineering tasks: programming, defect detection, testing, debugging, maintenance, …
https://sites.google.com/site/asergrp/dmse
5. Motivation
Programmers commonly reuse APIs of existing frameworks or libraries
– Advantages: High productivity of development
– Challenges: Complexity and lack of documentation
– Consequences:
  • Spend more effort in understanding APIs
  • Introduce defects in API client code
– Solution: Mining API properties as common patterns across API client code
9. Exception Handling
APIs throw exceptions during runtime errors
– Example: Session API of Hibernate framework throws HibernateException
APIs expect client applications to implement recovery actions after exceptions occur
– Example: Hibernate Session API expects client application to roll back open uncommitted transactions after HibernateException occurs
Failure to handle exceptions results in
– Fatal issues, e.g., database lock won’t be released if the transaction is not rolled back
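The recovery action described on this slide can be sketched in a few lines. This is a minimal illustration, not Hibernate code: `StoreException` and `FakeTransaction` are hypothetical stand-ins (for an API-level exception such as HibernateException and for a transaction object), introduced only so the example runs without any library.

```java
import java.util.ArrayList;
import java.util.List;

public class RollbackOnFailure {
    /** Hypothetical stand-in for an API-level exception such as HibernateException. */
    static class StoreException extends RuntimeException {
        StoreException(String message) { super(message); }
    }

    /** Hypothetical stand-in for a transaction; it records the calls it receives. */
    static class FakeTransaction {
        final List<String> calls = new ArrayList<>();
        boolean open = true;
        void commit() { calls.add("commit"); open = false; }
        void rollback() { calls.add("rollback"); open = false; }
    }

    /** Runs the work inside the transaction and rolls back if the work throws. */
    static boolean runInTransaction(FakeTransaction tx, Runnable work) {
        try {
            work.run();
            tx.commit();
            return true;
        } catch (StoreException e) {
            // Recovery action the API expects from its client:
            // undo the uncommitted work so that held locks are released.
            if (tx.open) {
                tx.rollback();
            }
            return false;
        }
    }

    public static void main(String[] args) {
        FakeTransaction ok = new FakeTransaction();
        runInTransaction(ok, () -> { });
        System.out.println(ok.calls);      // prints [commit]

        FakeTransaction failed = new FakeTransaction();
        runInTransaction(failed, () -> { throw new StoreException("update failed"); });
        System.out.println(failed.calls);  // prints [rollback]
    }
}
```

A client that omits the `catch` branch still compiles and works on the normal path, which is why this kind of defect tends to surface only when the exception actually occurs.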
10. Problem Addressed by CAR-Miner
Use exception-handling specification to detect violations as defects
– Problem: Often specifications are not documented
– Solution: Mine specifications from existing API client code
Challenges:
– Limited data points: Only from a few code bases
  • searching + mining
– Limited expressiveness: Not sufficient to characterize common exception-handling behaviors: why?
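To make "mine specifications from existing API client code" concrete, here is a deliberately simplified sketch. It is not CAR-Miner's algorithm: it only counts how often each call appears on the exception path of the same API call across several clients and keeps the calls that reach a confidence threshold. The call names are illustrative strings, and `frequentHandlers` and `minConfidence` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class HandlerMiner {
    /**
     * Each element of exceptionPaths lists the calls one client makes after the
     * same API call has thrown. Returns the calls that appear in at least
     * minConfidence (between 0 and 1) of those paths.
     */
    static List<String> frequentHandlers(List<List<String>> exceptionPaths, double minConfidence) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> path : exceptionPaths) {
            // Count each call once per path, however often it repeats there.
            for (String call : new LinkedHashSet<>(path)) {
                counts.merge(call, 1, Integer::sum);
            }
        }
        List<String> frequent = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() >= minConfidence * exceptionPaths.size()) {
                frequent.add(entry.getKey());
            }
        }
        return frequent;
    }

    public static void main(String[] args) {
        List<List<String>> paths = Arrays.asList(
                Arrays.asList("Connection.rollback", "Logger.error"),
                Arrays.asList("Connection.rollback", "Connection.close"),
                Arrays.asList("Connection.rollback"),
                Arrays.asList("Logger.error"));
        System.out.println(frequentHandlers(paths, 0.75)); // prints [Connection.rollback]
    }
}
```

With only four client paths, a single atypical client shifts the result noticeably, which illustrates the "limited data points" challenge listed above.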
11. Example
Scenario 1
1.1: ...
1.2: OracleDataSource ods = null; Session session = null;
     Connection conn = null; Statement statement = null;
1.3: logger.debug("Starting update");
1.4: try {
1.5:   ods = new OracleDataSource();
1.6:   ods.setURL("jdbc:oracle:thin:scott/tiger@192.168.1.2:1521:catfish");
1.7:   conn = ods.getConnection();
1.8:   statement = conn.createStatement();
1.9:   statement.executeUpdate("DELETE FROM table1");
1.10:  conn.commit(); }
1.11: catch (SQLException se) {
1.13:   logger.error("Exception occurred"); }
1.14: finally {
1.15:   if (statement != null) { statement.close(); }
1.16:   if (conn != null) { conn.close(); }
1.17:   if (ods != null) { ods.close(); } }
1.18: }
Defect: No rollback done when SQLException occurs (missing “conn.rollback()” in the catch block)
Requires specification such as “Connection should be rolled back when a connection is created and SQLException occurs”
Q: Does every connection instance have to be rolled back when SQLException occurs?
12. Example (cont.)
Scenario 1
...
OracleDataSource ods = null; Session session = null;
Connection conn = null; Statement statement = null;
logger.debug("Starting update");
try {
    ods = new OracleDataSource();
    ods.setURL("jdbc:oracle:thin:scott/tiger@192.168.1.2:1521:catfish");
    conn = ods.getConnection();
    statement = conn.createStatement();
    statement.executeUpdate("DELETE FROM table1");
    conn.commit();
} catch (SQLException se) {
    if (conn != null) { conn.rollback(); }
    logger.error("Exception occurred");
} finally {
    if (statement != null) { statement.close(); }
    if (conn != null) { conn.close(); }
    if (ods != null) { ods.close(); }
}
}
Scenario 2
Connection conn = null;
Statement stmt = null;
BufferedWriter bw = null; FileWriter fw = null;
try {
    fw = new FileWriter("output.txt");
    bw = new BufferedWriter(fw);
    conn = DriverManager.getConnection("jdbc:pl:db", "ps", "ps");
    stmt = conn.createStatement();
    ResultSet res = stmt.executeQuery("SELECT Path FROM Files");
    while (res.next()) {
        bw.write(res.getString(1));
    }
    res.close();
} catch (IOException ex) {
    logger.error("IOException occurred");
} finally {
    if (stmt != null) stmt.close();
    if (conn != null) conn.close();
    if (bw != null) bw.close();
}
Specification: “Connection creation => Connection rollback”
Satisfied by Scenario 1 but not by Scenario 2
But Scenario 2 has no defect
13. Example (cont.)
Simple association rules of the form “FCa => FCe” are not expressive enough
More general association rules (sequence association rules) are required, such as
(FCc1 FCc2) Λ FCa => FCe1, where
FCc1 -> Connection conn = OracleDataSource.getConnection() //Context
FCc2 -> Statement stmt = conn.createStatement() //Context
FCa -> stmt.executeUpdate() //Triggering Action
FCe1 -> conn.rollback() //Recovery Action
17. CAR-Miner Approach
Input: application to be checked for exception-related defects
Step 1: Extract the classes and functions reused by the input application
Step 2: Issue queries and collect relevant code examples from open source projects on the web. E.g.: “lang:java java.sql.Statement executeUpdate”
Step 3: Construct exception-flow graphs
Step 4: Collect static traces
Step 5: Mine static traces to obtain sequence association rules
Step 6: Detect violations
26. Static Trace Mining
Handle traces of each function call (triggering function call) individually
Input: Two sequence databases with a one-to-one mapping
• normal function-call sequences (context)
• exception function-call sequences (recovery)
Objective: Generate sequence association rules of the form
(FCc1 ... FCcn) Λ FCa => FCe1 ... FCen
where (FCc1 ... FCcn) is the context, FCa the trigger, and FCe1 ... FCen the recovery
27. Mining Problem Definition
Input: Two sequence databases with a one-to-one mapping
Objective: To get association rules of the form
FC1 FC2 ... FCm -> FE1 FE2 ... FEn
where {FC1, FC2, ..., FCm} ∈ SDB1 (context) and {FE1, FE2, ..., FEn} ∈ SDB2 (recovery)
Existing association rule mining algorithms cannot be directly applied to multiple sequence databases
28. Mining Problem Solution
Annotate the sequences to generate a single combined database
Apply a frequent subsequence mining algorithm [Wang and Han, ICDE 04] to get frequent sequences
Transform mined sequences into sequence association rules
Rank rules based on the support assigned by the frequent subsequence mining algorithm
Example rule: (3 10) Λ FCa => (2 8), with context (3 10), trigger FCa, and recovery (2 8)
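The annotation step can be pictured with a small sketch. It assumes that function calls are integer IDs, that context and recovery calls are told apart by a made-up suffix ("c" or "e"), and that the trigger FCa is implicit because traces are handled per triggering call. The class and method names (CombinedDb, annotate, support) are hypothetical, and the code only counts the support of one candidate sequence; it is not the frequent subsequence mining algorithm of [Wang and Han, ICDE 04].

```java
import java.util.ArrayList;
import java.util.List;

class CombinedDb {
    // Merge one (context, recovery) pair into a single annotated sequence.
    // Suffix "c" marks context calls, suffix "e" marks recovery calls (assumed convention).
    static List<String> annotate(List<Integer> context, List<Integer> recovery) {
        List<String> combined = new ArrayList<>();
        for (int id : context) combined.add(id + "c");
        for (int id : recovery) combined.add(id + "e");
        return combined;
    }

    // True if pattern occurs in seq in order, gaps allowed.
    static boolean isSubsequence(List<String> pattern, List<String> seq) {
        int matched = 0;
        for (String item : seq) {
            if (matched < pattern.size() && pattern.get(matched).equals(item)) matched++;
        }
        return matched == pattern.size();
    }

    // Fraction of combined sequences that contain the pattern.
    static double support(List<String> pattern, List<List<String>> db) {
        if (db.isEmpty()) return 0.0;
        int hits = 0;
        for (List<String> seq : db) {
            if (isSubsequence(pattern, seq)) hits++;
        }
        return (double) hits / db.size();
    }

    public static void main(String[] args) {
        List<List<String>> db = new ArrayList<>();
        db.add(annotate(List.of(3, 6, 10), List.of(2, 8)));
        db.add(annotate(List.of(3, 10), List.of(2, 8, 9)));
        db.add(annotate(List.of(3, 10), List.of(7)));
        // Candidate rule "(3 10) Λ FCa => (2 8)" written as one annotated sequence.
        List<String> candidate = List.of("3c", "10c", "2e", "8e");
        System.out.println(support(candidate, db));
    }
}
```

Because the annotation keeps context and recovery items distinguishable, a mined frequent sequence can be split back into a rule at the first recovery item.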
30. Violation Detection
Analyze each call site of triggering call FCa
Step 1: Extract context call sequence “CC1 CC2 ... CCm” from the beginning of the function to the call site of FCa
Step 2: If CC1 CC2 ... CCm is a super-sequence of FCc1 ... FCcn, report any function calls of {FCe1 ... FCen} that are missing in any exception path
API client: (CC1 CC2 ... CCm) Λ FCa => Missing any?
API rule: (FCc1 ... FCcn) Λ FCa => FCe1 ... FCen
The client context is matched against the rule context with isSuperSeqOf
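The two steps can be sketched as follows. This is a minimal illustration under assumptions: calls are plain strings, one exception path is checked at a time, and the names ViolationCheck and missingRecovery are hypothetical. Only isSuperSeqOf is named on the slide, and its signature here is a guess.

```java
import java.util.ArrayList;
import java.util.List;

class ViolationCheck {
    // True if seq contains every call of sub in the same order (gaps allowed).
    static boolean isSuperSeqOf(List<String> seq, List<String> sub) {
        int matched = 0;
        for (String call : seq) {
            if (matched < sub.size() && sub.get(matched).equals(call)) matched++;
        }
        return matched == sub.size();
    }

    // Recovery calls required by the rule but absent from one exception path.
    // Returns an empty list when the rule context does not match the call site.
    static List<String> missingRecovery(List<String> contextCalls,
                                        List<String> exceptionPath,
                                        List<String> ruleContext,
                                        List<String> ruleRecovery) {
        List<String> missing = new ArrayList<>();
        if (!isSuperSeqOf(contextCalls, ruleContext)) return missing;
        for (String call : ruleRecovery) {
            if (!exceptionPath.contains(call)) missing.add(call);
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> ruleContext = List.of(
                "OracleDataSource.getConnection", "Connection.createStatement");
        List<String> ruleRecovery = List.of("Connection.rollback");
        // Context of the Statement.executeUpdate call site in the defective Scenario 1.
        List<String> context = List.of(
                "Logger.debug", "OracleDataSource.setURL",
                "OracleDataSource.getConnection", "Connection.createStatement");
        // Calls on the SQLException path: catch block followed by finally block.
        List<String> exceptionPath = List.of(
                "Logger.error", "Statement.close", "Connection.close", "OracleDataSource.close");
        // Prints [Connection.rollback]
        System.out.println(missingRecovery(context, exceptionPath, ruleContext, ruleRecovery));
    }
}
```

A call site like the one in Scenario 2, whose context lacks the rule's context calls, yields an empty list and is therefore not reported.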
31. Evaluation
Research Questions:
1. Do the mined rules represent real rules?
2. Do the detected violations represent real defects?
3. Does CAR-Miner perform better than WN-miner [Weimer and Necula, TACAS 05]?
4. Do the sequence association rules help detect new defects?
32. Subjects
Internal Info: classes and methods belonging to the app
External Info: classes and methods used by the app
Code examples: number of files collected through a code search engine
33. RQ1: Real Rules
Do the mined rules represent real rules?
Real rules: 55% (Total: 294)
Usage patterns: 3%
False positives: 43%
34. RQ1: Distribution of Real Rules for Axion
[Figure: distribution of rules based on ranks assigned by CAR-Miner]
The number of false positives is quite low among the rules ranked 1 to 60
35. RQ2: Detected Violations
Do the detected violations represent real defects?
Total number of defects: 160
New defects not found by the WN-miner approach: 87
36. RQ2: Status of Detected Violations
HsqlDB developers responded on the first 10 reported defects
Accepted 7 defects
Rejected 3 defects
Reason given by HsqlDB developers for rejected defects: “Although it can throw exceptions in general, it should not throw with HsqlDB, So it is fine”
37. RQ3: Comparison with WN-miner
Does CAR-Miner perform better than WN-miner?
Found 224 new rules and missed 32 rules
CAR-Miner detected most of the rules mined by WN-miner
Two major factors:
Sequence association rules
Increase in the data scope
38. RQ4: New Defects by Sequence Association Rules
Do the sequence association rules detect new defects?
Detected 21 new real defects among all applications
40. Large Number of False Positives
Existing approaches produce a large number of false positives
One major observation:
Programmers often write code in different ways to achieve the same task
Some ways are more frequent than others
[Figure: frequent ways and infrequent ways; mine patterns -> mined patterns -> detect violations]
41. Example: java.util.Iterator.next()
Code Sample 1
PrintEntries1(ArrayList<String> entries)
{
    …
    Iterator it = entries.iterator();
    if (it.hasNext()) {
        String last = (String) it.next();
    }
    …
}
Code Sample 2
PrintEntries2(ArrayList<String> entries)
{
    …
    if (entries.size() > 0) {
        Iterator it = entries.iterator();
        String last = (String) it.next();
    }
    …
}
java.util.Iterator.next() throws NoSuchElementException when the iteration has no more elements, e.g., when invoked on a list without any elements
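A runnable sketch of the two guarding styles, next to an unguarded call, may make the example concrete. The class and method names are illustrative and not taken from the slides; the methods return the first element (or null) so the behavior can be observed.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class IteratorGuards {
    // Style of Code Sample 1: check Iterator.hasNext() before Iterator.next().
    static String firstViaHasNext(List<String> entries) {
        Iterator<String> it = entries.iterator();
        if (it.hasNext()) {
            return it.next();
        }
        return null;
    }

    // Style of Code Sample 2: check the list size before Iterator.next().
    static String firstViaSize(List<String> entries) {
        if (entries.size() > 0) {
            Iterator<String> it = entries.iterator();
            return it.next();
        }
        return null;
    }

    // No guard: throws NoSuchElementException on an empty list.
    static String firstUnguarded(List<String> entries) {
        return entries.iterator().next();
    }

    public static void main(String[] args) {
        List<String> empty = new ArrayList<>();
        System.out.println(firstViaHasNext(empty));
        System.out.println(firstViaSize(empty));
        try {
            firstUnguarded(empty);
        } catch (NoSuchElementException e) {
            System.out.println("unguarded call threw " + e.getClass().getSimpleName());
        }
    }
}
```

Both guarded variants are safe, which is why flagging the size-check style as a violation would be a false positive.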
42. Example: java.util.Iterator.next()
Code Sample 1 and Code Sample 2 as on the previous slide
1243 code examples
Sample 1 (1218 / 1243)
Sample 2 (6 / 1243)
Mined pattern from existing approaches:
“boolean check on return of Iterator.hasNext before Iterator.next”
43. Example: java.util.Iterator.next()
Require more general patterns (alternative patterns): P1 or P2
P1: boolean check on return of Iterator.hasNext before Iterator.next
P2: boolean check on return of ArrayList.size before Iterator.next
Cannot be mined by existing approaches, since alternative P2 is infrequent
Code Sample 1 follows P1 and Code Sample 2 follows P2
44. Our Solution: ImMiner Algorithm
Mines alternative patterns of the form P1 or P2
Based on the observation that infrequent alternatives such as P2 are frequent among code examples that do not support P1
1243 code examples
Sample 1 (1218 / 1243)
Sample 2 (6 / 1243)
P2 is infrequent among the entire 1243 code examples
P2 is frequent among code examples not supporting P1
45. Alternative Patterns
ImMiner mines three kinds of alternative patterns of the general form “P1 or P2”
Balanced: all alternatives (both P1 and P2) are frequent
Imbalanced: some alternatives (P1) are frequent and others (P2) are infrequent. Represented as “P1 or P̂2”
Single: only one alternative
46. ImMiner Algorithm
Uses frequent-itemset mining [Burdick et al. ICDE 01] iteratively
Example: an input database with the following APIs for Iterator.next()
[Tables: input database; mapping of IDs to APIs]
47. ImMiner Algorithm: Frequent Alternatives
Input database -> frequent itemset mining (min_sup 0.5) -> frequent item: 1
P1: boolean check on the return of Iterator.hasNext() before Iterator.next()
48. ImMiner: Infrequent Alternatives of P1
Split input database into two databases: positive database (PSD) and negative database (NSD)
Mine patterns that are frequent in NSD and infrequent in PSD
Reason: Only such patterns serve as alternatives for P1
Alternative pattern P2: “const check on the return of ArrayList.size() before Iterator.next()”
Alattin applies the ImMiner algorithm to detect neglected conditions
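The split-and-mine idea can be sketched in a few lines. Several simplifications are assumed here: each code example is a set of integer item IDs, only single items are mined rather than itemsets, the same min_sup is used for "frequent in NSD" and "infrequent in PSD", and the positive database is taken to hold the examples that support P1. The class and method names are hypothetical; this is not the ImMiner implementation and does not use the miner of [Burdick et al. ICDE 01].

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class ImMinerSketch {
    // Relative support of one item in a database of item sets.
    static double support(List<Set<Integer>> db, int item) {
        if (db.isEmpty()) return 0.0;
        int hits = 0;
        for (Set<Integer> t : db) {
            if (t.contains(item)) hits++;
        }
        return (double) hits / db.size();
    }

    // Items whose relative support is at least minSup.
    static Set<Integer> frequentItems(List<Set<Integer>> db, double minSup) {
        Set<Integer> all = new TreeSet<>();
        for (Set<Integer> t : db) all.addAll(t);
        Set<Integer> frequent = new TreeSet<>();
        for (int item : all) {
            if (support(db, item) >= minSup) frequent.add(item);
        }
        return frequent;
    }

    // Alternatives of p1: frequent in the negative database, infrequent in the positive one.
    static Set<Integer> infrequentAlternatives(List<Set<Integer>> db, int p1, double minSup) {
        List<Set<Integer>> psd = new ArrayList<>(); // examples supporting p1 (assumed meaning of PSD)
        List<Set<Integer>> nsd = new ArrayList<>(); // examples not supporting p1
        for (Set<Integer> t : db) {
            if (t.contains(p1)) psd.add(t); else nsd.add(t);
        }
        Set<Integer> alternatives = new TreeSet<>();
        for (int item : frequentItems(nsd, minSup)) {
            if (support(psd, item) < minSup) alternatives.add(item);
        }
        return alternatives;
    }

    public static void main(String[] args) {
        // Made-up IDs: 1 = hasNext check, 2 = size check, 3 = unrelated check.
        List<Set<Integer>> db = List.of(
                Set.of(1), Set.of(1), Set.of(1, 3), Set.of(1), Set.of(1), Set.of(1),
                Set.of(2), Set.of(2), Set.of(2, 3));
        System.out.println(frequentItems(db, 0.5));             // prints [1]
        System.out.println(infrequentAlternatives(db, 1, 0.5)); // prints [2]
    }
}
```

In the toy database, item 2 has support 3/9 overall and so is missed by plain frequent mining, yet it appears in every example of the negative database.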
49. Neglected Conditions
Neglected conditions refer to
Missing conditions that check the arguments or receiver of the API call before the API call
Missing conditions that check the return or receiver of the API call after the API call
One of the primary reasons for many fatal issues, such as security or buffer-overflow vulnerabilities [Chang et al. ISSTA 07]
50. Evaluation
Research Questions:
1. Do alternative patterns exist in real applications?
2. What percentage of false positives is eliminated (with low or no increase in false negatives) in detected violations?
51. Subjects
Two categories of subjects:
3 Java default API libraries
3 popular open source libraries
#Samples: number of code examples collected from Google code search
52. RQ1: Balanced and Imbalanced Patterns
What percentage of balanced and imbalanced patterns exists in real apps?
Balanced patterns: 0% to 30% (average: 9.69%)
Imbalanced patterns:
30% to 100% (average: 65%) for Java default API libraries
0% to 9.5% (average: 5%) for open source libraries
Explanation: Java default API libraries allow more different ways of writing code than open source libraries
53. RQ2: False Positives and False Negatives
What percentage of false positives is eliminated (with low or no increase in false negatives)?
Applied mined patterns (“P1 or P2 or ... or Pi or Â1 or Â2 or ... or Âj”) in three modes:
Existing mode: P1, P2, ..., Pi
Balanced mode: “P1 or P2 or ... or Pi”
Imbalanced mode: “P1 or P2 or ... or Pi or Â1 or Â2 or ... or Âj”
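One way to read the three modes is as three different violation checks on the same call site. The sketch below encodes that reading and is an assumption, not the evaluated tool: existing mode checks every frequent pattern on its own, balanced mode accepts any frequent alternative, and imbalanced mode also accepts the infrequent alternatives. Patterns are plain strings and all names are hypothetical.

```java
import java.util.List;
import java.util.Set;

class DetectionModes {
    // Existing mode (assumed reading): each frequent pattern is checked separately,
    // so the call site is flagged if it lacks any one of them.
    static boolean violatesExistingMode(Set<String> checksAtCallSite, List<String> frequent) {
        return frequent.stream().anyMatch(p -> !checksAtCallSite.contains(p));
    }

    // Balanced mode: the frequent alternatives form one "or" pattern;
    // any one of them satisfies it.
    static boolean violatesBalancedMode(Set<String> checksAtCallSite, List<String> frequent) {
        return frequent.stream().noneMatch(checksAtCallSite::contains);
    }

    // Imbalanced mode: infrequent alternatives also satisfy the pattern.
    static boolean violatesImbalancedMode(Set<String> checksAtCallSite,
                                          List<String> frequent,
                                          List<String> infrequent) {
        return violatesBalancedMode(checksAtCallSite, frequent)
                && infrequent.stream().noneMatch(checksAtCallSite::contains);
    }

    public static void main(String[] args) {
        List<String> frequent = List.of("hasNext-check");
        List<String> infrequent = List.of("size-check");
        // A call site written in the size-check style of Code Sample 2.
        Set<String> callSite = Set.of("size-check");
        System.out.println(violatesExistingMode(callSite, frequent));
        System.out.println(violatesBalancedMode(callSite, frequent));
        System.out.println(violatesImbalancedMode(callSite, frequent, infrequent));
    }
}
```

Under this reading, the size-check call site is flagged in the first two modes and accepted only in imbalanced mode, which is where the false positive disappears.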
56. Conclusion
Problem-driven methodology by identifying
• new problems, patterns
• mining algorithms, defects
CAR-Miner [ICSE 09]: mining sequence association rules of the form
(FCc1 ... FCcn) Λ FCa => (FCe1 ... FCen)
with context (FCc1 ... FCcn), trigger FCa, and recovery (FCe1 ... FCen)
reduce false negatives
Alattin [ASE 09]: mining alternative patterns classified into three categories: balanced, imbalanced, and single
P1 or P2 or ... or Pi or Â1 or Â2 or ... or Âj
reduce false positives
57. Other Selected Work on Mining SE Data
API/Trace mining
• MAPO: mining call sequences for code reuse [ECOOP 09]
• MSeqGen: mining call seqs for test gen [ESEC/FSE 09]
• MAM: mining API mapping for lang migration [ICSE 10]
• Iterative mining of resource-releasing specs [ASE 11]
• StackMine: mining callstack traces [ICSE 12]
• INDICATOR: mining parameters dependency [WWW 13]
Text mining
• Mining bug reports@Cisco for security ones [MSR 10]
• Mining bug reports+exec traces for duplicates [ICSE 08]
• Mining API docs for defect detection [ASE 09, ICSE 12]
• Mining requirements for policy extraction [FSE 12]
T. Xie, S. Thummalapenta, D. Lo, and C. Liu. Data Mining for Software Engineering. IEEE Computer, August 2009.