This document discusses intelligent software engineering and the synergy between artificial intelligence and software engineering. It provides an overview of Tao Xie's research group on software analytics, which uses data analysis to provide insights for software tasks. Example research topics discussed include program synthesis, mining software data, and improving software testing using machine learning. The document also discusses challenges for AI systems like neural machine translation and self-driving cars, and the need for new techniques in adversarial testing and quality assurance when applying AI to software.
Intelligent Software Engineering: Synergy between AI and Software Engineering...Tao Xie
2018 Distinguished Speaker, the UC Irvine Institute for Software Research (ISR) Distinguished Speaker Series 2018-2019. "Intelligent Software Engineering: Synergy between AI and Software Engineering" http://isr.uci.edu/content/isr-distinguished-speaker-series-2018-2019
SETTA'18 Keynote: Intelligent Software Engineering: Synergy between AI and So...Tao Xie
2018 Keynote Speaker, Symposium on Dependable Software Engineering - Theories, Tools and Applications (SETTA 2018). "Intelligent Software Engineering: Synergy between AI and Software Engineering" http://confesta2018.csp.escience.cn/dct/page/65581
ACM Chicago March 2019 meeting: Software Engineering and AI - Prof. Tao Xie, ...ACM Chicago
Join us as Tao Xie, Professor and Willett Faculty Scholar in the Department of Computer Science at the University of Illinois at Urbana-Champaign and ACM Distinguished Speaker, talks about Intelligent Software Engineering: Synergy between AI and Software Engineering. This is a joint meeting hosted by Chicago Chapter ACM / Loyola University Computer Science Department.
Intelligent Software Engineering: Synergy between AI and Software Engineering...Tao Xie
2018 Distinguished Speaker, the UC Irvine Institute for Software Research (ISR) Distinguished Speaker Series 2018-2019. "Intelligent Software Engineering: Synergy between AI and Software Engineering" http://isr.uci.edu/content/isr-distinguished-speaker-series-2018-2019
SETTA'18 Keynote: Intelligent Software Engineering: Synergy between AI and So...Tao Xie
2018 Keynote Speaker, Symposium on Dependable Software Engineering - Theories, Tools and Applications (SETTA 2018). "Intelligent Software Engineering: Synergy between AI and Software Engineering" http://confesta2018.csp.escience.cn/dct/page/65581
ACM Chicago March 2019 meeting: Software Engineering and AI - Prof. Tao Xie, ...ACM Chicago
Join us as Tao Xie, Professor and Willett Faculty Scholar in the Department of Computer Science at the University of Illinois at Urbana-Champaign and ACM Distinguished Speaker, talks about Intelligent Software Engineering: Synergy between AI and Software Engineering. This is a joint meeting hosted by Chicago Chapter ACM / Loyola University Computer Science Department.
Keynote presentation from ECBS conference. The talk is about how to use machine learning and AI in improving software engineering. Experiences from our project in Software Center (www.software-center.se).
The complexity of current software-based systems has led the software engineering community to look for inspiration in diverse related fields (e.g., robotics, artificial intelligence) as well as other areas (e.g., biology) to find new ways of designing and managing systems and services.
Software Engineering for ML/AI, keynote at FAS*/ICAC/SASO 2019Patrizio Pelliccione
ML and AI are increasingly dominating the high-tech industry. Organizations and technology companies are leveraging their big data to create new products or improve their processes to reach the next level in their market. However, ML and AI are not a silver bullet and Software 2.0 is not the end of software developers or software engineering.
In this talk I will argument on how software engineering can help ML and AI to become the key technology for (autonomous) systems of the near future. Software engineering best practices and achievements reached in the last decades might help, e.g., (i) democratising the use of ML/AI, (ii) composing, reusing, chaining ML/AI models to solve more complex problems, and (iii) supporting for reasoning about correctness, repeatability, explainability, traceability, fairness, ethics, while building an ML/AI pipeline.
Publish or Perish: Questioning the Impact of Our Research on the Software Dev...Margaret-Anne Storey
A Video for this talk can be found here: https://www.youtube.com/watch?v=DvRdBb9TEUI
Abstract: How often do we pause to consider how we, as a community, decide which developer problems we address, or how well we are doing at evaluating our solutions within real development contexts? Many of our research contributions in software engineering can be considered as purely technical. Yet somewhere, at some time, a software developer may be impacted by our research. In this talk, I invite the community to question the impact of our research on software developer productivity. To guide the discussion, I first paint a picture of the modern-day developer and the challenges they experience. I then present 4+1 views of software engineering research --- views that concern research context, method choice, research paradigms, theoretical knowledge and real-world impact. I demonstrate how these views can be used to design, communicate and distinguish individual studies, but also how they can be used to compose a critical perspective of our research at a community level. To conclude, I propose structural changes to our collective research and publishing activities --- changes to provoke a more expeditious consideration of the many challenges facing today's software developer.
(Thanks to Brynn Hawker for slide design and proposed new badges. brynn@hawker.me)
In 2003 Dave et al. have coined the term “opinion mining” to refer to “processing a set of search results for a given item, generating a list of product attributes (quality, features, etc.) and aggregating opinions about each of them (poor, mixed, good)”. Nine years later, in 2012 Brooks and Swigger have applied sentiment analysis in the context of software engineering. Today another nine years have passed and it is time to look back: what have we achieved as a research community and where should we go next?
To answer this question we conducted a systematic literature review involving 185 papers. Based on the literature review we present 1) well-defined categories of opinion mining-related software development activities, 2) available opinion mining approaches, whether they are evaluated when adopted in other studies, and how their performance is compared, 3) available datasets for performance evaluation and tool customization, and 4) concerns or limitations SE researchers might need to take into account when applying/customizing these opinion mining techniques. The results of our study serve as references to choose suitable opinion mining tools for SE tasks, and provide critical insights for the further development of opinion mining techniques in the SE domain.
This work has been done together with Bin Lin, Gabriele Bavota and Michele Lanza from Università della Svizzera italiana, Switzerland, Nathan Cassee from Eindhoven University of Technology, The Netherlands and Nicole Novielli from University of Bari, Italy.
Analyzing Big Data's Weakest Link (hint: it might be you)HPCC Systems
Tim Menzies, NC State University, presents at the 2015 HPCC Systems Engineering Summit Community Day.
For Big Data applications, there is a lack of any gold standards for "good analysis" or methods to assess our certification programs. Hence, we are still in the dark about whether or not our human analysts are making the best use possible of the tools of Big Data. While much progress has been made in the systems aspects of Big Data, certain critical human-centered aspects remain an open issue. Regardless of the sophistication of the analysis tools and environment, all that architecture can still be used incorrectly by users. If this issue was confined to a small number of inexperienced users, then it could be addressed via process improvements such as better training. But is it? What do we know about our analysts? Where are the studies that mine the people doing the data miners?
This presentation offers some preliminary results on tools that combine ECL with other methods that recognize the code generated by experienced or inexperienced developers. While the results are preliminary, they do raise the possibility that we can better characterize what it means to be experienced (or inexperienced) at Big Data applications.
Implications of Open Source Software Use (or Let's Talk Open Source)Gail Murphy
A talk given to the UBC Computer Science Alumni group discussing a number of implications of the use of open source as part of the global software supply chain.
Why is TDD so hard for Data Engineering and Analytics Projects?Phil Watt
This slide show describes the difficulties in implementing Test-Driven Development (TDD) in the context of analytics and data engineering in development and maintenance phases. If we assumes that the objective of TDD is to reduce cycle time, improve developer productivity and improve production quality. It identifies 7 challenges from the analytics literature and a further 10 from interviews (n=14) and survey respondents (n=20) selected from analytics leaders. A key theme emerging as an output is that many of the challenges can be addressed through education and coaching, notably around data literacy for key stakeholders and executives
Advancing Foundation and Practice of Software AnalyticsTao Xie
Vision Statement Presentation on "Advancing Foundation & Practice of Software Analytics" at the 2nd International NSF sponsored Workshop on Realizing Artificial Intelligence Synergies in Software Engineering (RAISE 2013) http://promisedata.org/raise/2013/
Keynote presentation from ECBS conference. The talk is about how to use machine learning and AI in improving software engineering. Experiences from our project in Software Center (www.software-center.se).
The complexity of current software-based systems has led the software engineering community to look for inspiration in diverse related fields (e.g., robotics, artificial intelligence) as well as other areas (e.g., biology) to find new ways of designing and managing systems and services.
Software Engineering for ML/AI, keynote at FAS*/ICAC/SASO 2019Patrizio Pelliccione
ML and AI are increasingly dominating the high-tech industry. Organizations and technology companies are leveraging their big data to create new products or improve their processes to reach the next level in their market. However, ML and AI are not a silver bullet and Software 2.0 is not the end of software developers or software engineering.
In this talk I will argument on how software engineering can help ML and AI to become the key technology for (autonomous) systems of the near future. Software engineering best practices and achievements reached in the last decades might help, e.g., (i) democratising the use of ML/AI, (ii) composing, reusing, chaining ML/AI models to solve more complex problems, and (iii) supporting for reasoning about correctness, repeatability, explainability, traceability, fairness, ethics, while building an ML/AI pipeline.
Publish or Perish: Questioning the Impact of Our Research on the Software Dev...Margaret-Anne Storey
A Video for this talk can be found here: https://www.youtube.com/watch?v=DvRdBb9TEUI
Abstract: How often do we pause to consider how we, as a community, decide which developer problems we address, or how well we are doing at evaluating our solutions within real development contexts? Many of our research contributions in software engineering can be considered as purely technical. Yet somewhere, at some time, a software developer may be impacted by our research. In this talk, I invite the community to question the impact of our research on software developer productivity. To guide the discussion, I first paint a picture of the modern-day developer and the challenges they experience. I then present 4+1 views of software engineering research --- views that concern research context, method choice, research paradigms, theoretical knowledge and real-world impact. I demonstrate how these views can be used to design, communicate and distinguish individual studies, but also how they can be used to compose a critical perspective of our research at a community level. To conclude, I propose structural changes to our collective research and publishing activities --- changes to provoke a more expeditious consideration of the many challenges facing today's software developer.
(Thanks to Brynn Hawker for slide design and proposed new badges. brynn@hawker.me)
In 2003 Dave et al. have coined the term “opinion mining” to refer to “processing a set of search results for a given item, generating a list of product attributes (quality, features, etc.) and aggregating opinions about each of them (poor, mixed, good)”. Nine years later, in 2012 Brooks and Swigger have applied sentiment analysis in the context of software engineering. Today another nine years have passed and it is time to look back: what have we achieved as a research community and where should we go next?
To answer this question we conducted a systematic literature review involving 185 papers. Based on the literature review we present 1) well-defined categories of opinion mining-related software development activities, 2) available opinion mining approaches, whether they are evaluated when adopted in other studies, and how their performance is compared, 3) available datasets for performance evaluation and tool customization, and 4) concerns or limitations SE researchers might need to take into account when applying/customizing these opinion mining techniques. The results of our study serve as references to choose suitable opinion mining tools for SE tasks, and provide critical insights for the further development of opinion mining techniques in the SE domain.
This work has been done together with Bin Lin, Gabriele Bavota and Michele Lanza from Università della Svizzera italiana, Switzerland, Nathan Cassee from Eindhoven University of Technology, The Netherlands and Nicole Novielli from University of Bari, Italy.
Analyzing Big Data's Weakest Link (hint: it might be you)HPCC Systems
Tim Menzies, NC State University, presents at the 2015 HPCC Systems Engineering Summit Community Day.
For Big Data applications, there is a lack of any gold standards for "good analysis" or methods to assess our certification programs. Hence, we are still in the dark about whether or not our human analysts are making the best use possible of the tools of Big Data. While much progress has been made in the systems aspects of Big Data, certain critical human-centered aspects remain an open issue. Regardless of the sophistication of the analysis tools and environment, all that architecture can still be used incorrectly by users. If this issue was confined to a small number of inexperienced users, then it could be addressed via process improvements such as better training. But is it? What do we know about our analysts? Where are the studies that mine the people doing the data miners?
This presentation offers some preliminary results on tools that combine ECL with other methods that recognize the code generated by experienced or inexperienced developers. While the results are preliminary, they do raise the possibility that we can better characterize what it means to be experienced (or inexperienced) at Big Data applications.
Implications of Open Source Software Use (or Let's Talk Open Source)Gail Murphy
A talk given to the UBC Computer Science Alumni group discussing a number of implications of the use of open source as part of the global software supply chain.
Why is TDD so hard for Data Engineering and Analytics Projects?Phil Watt
This slide show describes the difficulties in implementing Test-Driven Development (TDD) in the context of analytics and data engineering in development and maintenance phases. If we assumes that the objective of TDD is to reduce cycle time, improve developer productivity and improve production quality. It identifies 7 challenges from the analytics literature and a further 10 from interviews (n=14) and survey respondents (n=20) selected from analytics leaders. A key theme emerging as an output is that many of the challenges can be addressed through education and coaching, notably around data literacy for key stakeholders and executives
Advancing Foundation and Practice of Software AnalyticsTao Xie
Vision Statement Presentation on "Advancing Foundation & Practice of Software Analytics" at the 2nd International NSF sponsored Workshop on Realizing Artificial Intelligence Synergies in Software Engineering (RAISE 2013) http://promisedata.org/raise/2013/
Generation of Search Based Test Data on Acceptability Testing Principleiosrjce
IOSR Journal of Computer Engineering (IOSR-JCE) is a double blind peer reviewed International Journal that provides rapid publication (within a month) of articles in all areas of computer engineering and its applications. The journal welcomes publications of high quality papers on theoretical developments and practical applications in computer technology. Original research papers, state-of-the-art reviews, and high quality technical notes are invited for publications.
Graphs are a perfect solution to organize information and to determine the relatedness of content. Neo4j Developer Evangelist Kenny Bastani will discuss using Neo4j to perform document classification and text classification using a graph database. Kenny will demonstrate how to build a scalable architecture for classifying natural language text using a graph-based algorithm called Hierarchical Pattern Recognition. This approach encompasses a set of techniques familiar to Deep Learning practitioners.
Many times what your machine learning algorithm is optimizing for is not what you really want. Worse, many times you do not really know what you want. In interactive machine learning the human is an integral part of the learning process leading to cooperative human-machine solutions. In this talk, we will see a few such problems from the field of Natural Language Processing (NLP), their challenges, common pitfalls and some solutions.
This presentation is about a lecture I gave within the "Software systems and services" immigration course at the Gran Sasso Science Institute, L'Aquila (Italy): http://cs.gssi.it/.
http://www.ivanomalavolta.com
K anonymity for crowdsourcing database
In crowdsourcing database, human operators are embedded into the database engine and collaborate with other conventional database operators to process the queries. Each human operator publishes small HITs (Human Intelligent Task) to the crowdsourcing platform, which consists of a set of database records and corresponding questions for human workers.
Synergy of Human and Artificial Intelligence in Software EngineeringTao Xie
Keynote Talk by Tao Xie at International NSF sponsored Workshop on Realizing Artificial Intelligence Synergies in Software Engineering (RAISE 2013) http://promisedata.org/raise/2013/
MSR 2022 Foundational Contribution Award Talk: Software Analytics: Reflection...Tao Xie
MSR 2022 Foundational Contribution Award Talk on "Software Analytics: Reflection and Path Forward" by Dongmei Zhang and Tao Xie
https://conf.researchr.org/info/msr-2022/awards
Diversity and Computing/Engineering: Perspectives from AlliesTao Xie
Slides from the invited talk given on Feb 13, 2019 being part of a diversity and inclusion week - Infusion 2019. Infusion is a diversity focused week for the Illinois College of Engineering, hosted by the Dean's Student Advisory Committee of Engineering Council. This invited talk was co-hosted by the NSBE - UIUC chapter.
Transferring Software Testing Tools to PracticeTao Xie
ACM SIGSOFT Webinar co-presented by Nikolai Tillmann (Microsoft), Judith Bishop (Microsoft Research), Pratap Lakshman (Microsoft), Tao Xie (University of Illinois at Urbana-Champaign) http://www.sigsoft.org/resources/webinars.html
Transferring Software Testing and Analytics Tools to PracticeTao Xie
Keynote Talk in the Workshop on Testing: Academia-Industry Collaboration, Practice and Research Techniques (TAIC PART 2016) http://www2016.taicpart.org/
Towards Mining Software Repositories Research that MattersTao Xie
Towards Mining Software Repositories Research that Matters. Talk slides at Next Generation of Mining Software Repositories '14 (Pre-FSE 2014 Event), Nov 15–16. HKUST, Hong Kong http://ng2014.msrworld.org/
Teaching and Learning Programming and Software Engineering via Interactive Ga...Tao Xie
Pex4Fun (http://www.pex4fun.com/) is a web-based educational gaming environment for teaching and learning programming and software engineering. Pex4Fun can be used to teach and learn programming and software engineering at many levels, from high school all the way through graduate courses. With Pex4Fun, a student edits code in any browser – with Intellisense – and Pex4Fun executes it and analyzes it in the cloud. Pex4Fun connects teachers, curriculum authors, and students in a unique social experience, tracking and streaming progress updates in real time. In particular, Pex4Fun finds interesting and unexpected input values (with Pex, an advanced test-generation tool) that help students understand what their code is actually doing. The real fun starts with coding duels where a student writes code to implement a teacher's secret specification (in the form of sample-solution code not visible to the student). Pex4Fun finds any discrepancies in behavior between the student’s code and the secret specification. Such discrepancies are given as feedback to the student to guide how to fix the student’s code to match the behavior of the secret specification. In early 2014, Code Hunt (https://www.codehunt.com/) has been released as a redesign of Pex4Fun as game.
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxrickgrimesss22
Discover the essential features to incorporate in your Winzo clone app to boost business growth, enhance user engagement, and drive revenue. Learn how to create a compelling gaming experience that stands out in the competitive market.
Large Language Models and the End of ProgrammingMatt Welsh
Talk by Matt Welsh at Craft Conference 2024 on the impact that Large Language Models will have on the future of software development. In this talk, I discuss the ways in which LLMs will impact the software industry, from replacing human software developers with AI, to replacing conventional software with models that perform reasoning, computation, and problem-solving.
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTier1 app
Even though at surface level ‘java.lang.OutOfMemoryError’ appears as one single error; underlyingly there are 9 types of OutOfMemoryError. Each type of OutOfMemoryError has different causes, diagnosis approaches and solutions. This session equips you with the knowledge, tools, and techniques needed to troubleshoot and conquer OutOfMemoryError in all its forms, ensuring smoother, more efficient Java applications.
Listen to the keynote address and hear about the latest developments from Rachana Ananthakrishnan and Ian Foster who review the updates to the Globus Platform and Service, and the relevance of Globus to the scientific community as an automation platform to accelerate scientific discovery.
Enhancing Research Orchestration Capabilities at ORNL.pdfGlobus
Cross-facility research orchestration comes with ever-changing constraints regarding the availability and suitability of various compute and data resources. In short, a flexible data and processing fabric is needed to enable the dynamic redirection of data and compute tasks throughout the lifecycle of an experiment. In this talk, we illustrate how we easily leveraged Globus services to instrument the ACE research testbed at the Oak Ridge Leadership Computing Facility with flexible data and task orchestration capabilities.
First Steps with Globus Compute Multi-User EndpointsGlobus
In this presentation we will share our experiences around getting started with the Globus Compute multi-user endpoint. Working with the Pharmacology group at the University of Auckland, we have previously written an application using Globus Compute that can offload computationally expensive steps in the researcher's workflows, which they wish to manage from their familiar Windows environments, onto the NeSI (New Zealand eScience Infrastructure) cluster. Some of the challenges we have encountered were that each researcher had to set up and manage their own single-user globus compute endpoint and that the workloads had varying resource requirements (CPUs, memory and wall time) between different runs. We hope that the multi-user endpoint will help to address these challenges and share an update on our progress here.
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar
The European Union Agency for Law Enforcement Cooperation (Europol) has suffered an alleged data breach after a notorious threat actor claimed to have exfiltrated data from its systems. Infamous data leaker IntelBroker posted on the even more infamous BreachForums hacking forum, saying that Europol suffered a data breach this month.
The alleged breach affected Europol agencies CCSE, EC3, Europol Platform for Experts, Law Enforcement Forum, and SIRIUS. Infiltration of these entities can disrupt ongoing investigations and compromise sensitive intelligence shared among international law enforcement agencies.
However, this is neither the first nor the last activity of IntekBroker. We have compiled for you what happened in the last few days. To track such hacker activities on dark web sources like hacker forums, private Telegram channels, and other hidden platforms where cyber threats often originate, you can check SOCRadar’s Dark Web News.
Stay Informed on Threat Actors’ Activity on the Dark Web with SOCRadar!
Enterprise Resource Planning System includes various modules that reduce any business's workload. Additionally, it organizes the workflows, which drives towards enhancing productivity. Here are a detailed explanation of the ERP modules. Going through the points will help you understand how the software is changing the work dynamics.
To know more details here: https://blogs.nyggs.com/nyggs/enterprise-resource-planning-erp-system-modules/
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...informapgpstrackings
Keep tabs on your field staff effortlessly with Informap Technology Centre LLC. Real-time tracking, task assignment, and smart features for efficient management. Request a live demo today!
For more details, visit us : https://informapuae.com/field-staff-tracking/
A Comprehensive Look at Generative AI in Retail App Testing.pdfkalichargn70th171
Traditional software testing methods are being challenged in retail, where customer expectations and technological advancements continually shape the landscape. Enter generative AI—a transformative subset of artificial intelligence technologies poised to revolutionize software testing.
We describe the deployment and use of Globus Compute for remote computation. This content is aimed at researchers who wish to compute on remote resources using a unified programming interface, as well as system administrators who will deploy and operate Globus Compute services on their research computing infrastructure.
Developing Distributed High-performance Computing Capabilities of an Open Sci...Globus
COVID-19 had an unprecedented impact on scientific collaboration. The pandemic and its broad response from the scientific community has forged new relationships among public health practitioners, mathematical modelers, and scientific computing specialists, while revealing critical gaps in exploiting advanced computing systems to support urgent decision making. Informed by our team’s work in applying high-performance computing in support of public health decision makers during the COVID-19 pandemic, we present how Globus technologies are enabling the development of an open science platform for robust epidemic analysis, with the goal of collaborative, secure, distributed, on-demand, and fast time-to-solution analyses to support public health.
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamtakuyayamamoto1800
In this slide, we show the simulation example and the way to compile this solver.
In this solver, the Helmholtz equation can be solved by helmholtzFoam. Also, the Helmholtz equation with uniformly dispersed bubbles can be simulated by helmholtzBubbleFoam.
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus
As part of the DOE Integrated Research Infrastructure (IRI) program, NERSC at Lawrence Berkeley National Lab and ALCF at Argonne National Lab are working closely with General Atomics on accelerating the computing requirements of the DIII-D experiment. As part of the work the team is investigating ways to speedup the time to solution for many different parts of the DIII-D workflow including how they run jobs on HPC systems. One of these routes is looking at Globus Compute as a way to replace the current method for managing tasks and we describe a brief proof of concept showing how Globus Compute could help to schedule jobs and be a tool to connect compute at different facilities.
Software Engineering, Software Consulting, Tech Lead.
Spring Boot, Spring Cloud, Spring Core, Spring JDBC, Spring Security,
Spring Transaction, Spring MVC,
Log4j, REST/SOAP WEB-SERVICES.
How Recreation Management Software Can Streamline Your Operations.pptxwottaspaceseo
Recreation management software streamlines operations by automating key tasks such as scheduling, registration, and payment processing, reducing manual workload and errors. It provides centralized management of facilities, classes, and events, ensuring efficient resource allocation and facility usage. The software offers user-friendly online portals for easy access to bookings and program information, enhancing customer experience. Real-time reporting and data analytics deliver insights into attendance and preferences, aiding in strategic decision-making. Additionally, effective communication tools keep participants and staff informed with timely updates. Overall, recreation management software enhances efficiency, improves service delivery, and boosts customer satisfaction.
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...Shahin Sheidaei
Games are powerful teaching tools, fostering hands-on engagement and fun. But they require careful consideration to succeed. Join me to explore factors in running and selecting games, ensuring they serve as effective teaching tools. Learn to maintain focus on learning objectives while playing, and how to measure the ROI of gaming in education. Discover strategies for pitching gaming to leadership. This session offers insights, tips, and examples for coaches, team leads, and enterprise leaders seeking to teach from simple to complex concepts.
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Globus
Large Language Models (LLMs) are currently the center of attention in the tech world, particularly for their potential to advance research. In this presentation, we'll explore a straightforward and effective method for quickly initiating inference runs on supercomputers using the vLLM tool with Globus Compute, specifically on the Polaris system at ALCF. We'll begin by briefly discussing the popularity and applications of LLMs in various fields. Following this, we will introduce the vLLM tool, and explain how it integrates with Globus Compute to efficiently manage LLM operations on Polaris. Attendees will learn the practical aspects of setting up and remotely triggering LLMs from local machines, focusing on ease of use and efficiency. This talk is ideal for researchers and practitioners looking to leverage the power of LLMs in their work, offering a clear guide to harnessing supercomputing resources for quick and effective LLM inference.
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
MSRA 2018: Intelligent Software Engineering: Synergy between AI and Software Engineering
1. Intelligent Software Engineering:
Synergy between AI and Software Engineering
Tao Xie
University of Illinois at Urbana-Champaign
taoxie@illinois.edu
http://taoxie.cs.illinois.edu/
4. 4
Software Analytics
Software analytics is to enable software practitioners to
perform data exploration and analysis in order to obtain
insightful and actionable information for data-driven
tasks around software and services.
Dongmei Zhang, Yingnong Dang, Jian-Guang Lou, Shi Han, Haidong Zhang, and Tao Xie. Software Analytics as a
Learning Case in Practice: Approaches and Experiences. In MALETS 2011
Software Analytics
Group
5. 5
Software Analytics
Software analytics is to enable software practitioners to
perform data exploration and analysis in order to obtain
insightful and actionable information for data-driven
tasks around software and services.
Dongmei Zhang, Yingnong Dang, Jian-Guang Lou, Shi Han, Haidong Zhang, Tao Xie. Software Analytics as a
Learning Case in Practice: Approaches and Experiences. In MALETS 2011
Software Analytics
Group
7. Data Sources
Runtime traces
Program logs
System events
Perf counters
…
Usage log
User surveys
Online forum posts
Blog & Twitter
…
Source code
Bug history
Check-in history
Test cases
Eye tracking
MRI/EMG
…
9. Mining and Understanding Software Enclaves (MUSE)
http://materials.dagstuhl.de/files/15/15472/15472.SureshJagannathan1.Slides.pdf
DARPA
Started in 2014
10. Pliny: Mining Big Code
to help programmers
(Rice U., UT Austin,
Wisconsin, Grammatech)
http://pliny.rice.edu/ http://news.rice.edu/2014/11/05/next-for-darpa-autocomplete-for-programmers-2/
$11 million (4 years)
DARPA MUSE Project
11. Organization Committee
Student volunteers!!
General Chair:
General Chair:Hong Mei, Peking University /
Beijing Institute of Technology
~40 researchers from
China, US, UK, Japan,
Singapore, Australia, etc.
http://www.oscar-lab.org/YQLSA2018/
13. Current: Automated Software Testing
• 10 years of collaboration with Microsoft Research on Pex [ASE’14 Ex]
• .NET Test Generation Tool based on Dynamic Symbolic Execution
• Tackle challenges of
• Path explosion via fitness function [DSN’09]
• Method sequence explosion via program synthesis [OOPSLA’11]
• …
• Shipped in Visual Studio 2015/2017 Enterprise Edition
• As IntelliTest
Tillmann, de Halleux, Xie. Transferring an Automated Test Generation Tool to Practice: From Pex to Fakes and Code
Digger. ASE’14 Experience Papers
14. Next: Intelligent Software Testing
• Learning from others working on the same things
• Our work on mining API usage method sequences to test the API
[ESEC/FSE’09: MSeqGen]
• Visser et al. Green: Reducing, reusing and recycling constraints in program analysis.
[FSE’12]
• Learning from others working on similar things
• Jia et al. Enhancing reuse of constraint solutions to improve symbolic execution.
[ISSTA’15]
• Aquino et al. Heuristically Matching Solution Spaces of Arithmetic Formulas to
Efficiently Reuse Solutions. [ICSE’17]
[Jia et al. ISSTA’15]
Continuous Learning
17. Translation of NL to Regular Expressions/SQL
● Program Aliasing: a semantically equivalent program may
have many syntactically different forms
NL Sentences
18. NL Regex: Sequence-to-sequence Model
● Encoder/Decoder: 2-layer stacked LSTM architectures [Locascio et al. EMNLP’16]
19. Training Objective: Maximum Likelihood Estimation (MLE)
Maximizing Semantic Correctness
● Standard seq-to-seq maximizes likelihood mapping NL to ground truth
● MLE penalizes syntactically different but semantically equivalent regex
● Reward : semantic correctness
● Alternative objective: Maximize the expected
Leveraging REINFORCE technique of policy gradient [William’92] to maximize Expected Semantic Correctness
Zhong, Guo, Yang, Peng, Xie, Lou, Liu, Zhang. SemRegex: A Semantics-Based Approach for Generating Regular Expressions
from Natural Language Specifications. EMNLP’18.
20. Measurements of Semantic Correctness
● Minimal DFAs
● Test Cases (pos/neg string examples)
([ABab]&[A-Z]).*X
21. Evaluation Results of NLRegex Approaches
Zhong, Guo, Yang, Peng, Xie, Lou, Liu, Zhang. SemRegex: A Semantics-Based Approach for Generating Regular Expressions
from Natural Language Specifications. EMNLP’18.
DFA-equivalence Accuracy
24. Self-Driving Tesla Involved in Fatal Crash (2016 June 30)
http://www.nytimes.com/2016/07/01/business/self-driving-tesla-fatal-crash-investigation.html
“A Tesla car in autopilot crashed into a trailer
because the autopilot system failed to recognize
the trailer as an obstacle due to its “white color
against a brightly lit sky” and the “high ride
height”
http://www.cs.columbia.edu/~suman/docs/deepxplore.pdf
26. Amazon’s AI Tool Discriminating Against Women
https://www.businessinsider.com/amazon-built-ai-to-hire-people-discriminated-against-women-2018-10
27. Adversarial Machine Learning/Testing
● Adversarial testing [Szegedy et al. ICLR’14]: find corner-case inputs imperceptible
to human but induce errors
27
School bus OstrichCarefully crafted noise
Pei et al. DeepXplore: Automated Whitebox Testing of Deep Learning Systems. SOSP 2017. Slide adapted from SOSP’17 slides
28. Example Detected Erroneous Behaviors
2828
Turn rightGo straight Go straight Turn left
Tian et al. DeepTest: Automated Testing of Deep-Neural-Network-driven Autonomous Cars. ICSE 2018. Slide adapted from SOSP’17 slides
Pei et al. DeepXplore: Automated Whitebox Testing of Deep Learning Systems. SOSP 2017.
29. Example Detected Erroneous Behaviors
2929
Turn rightGo straight Go straight Turn left
Tian et al. DeepTest: Automated Testing of Deep-Neural-Network-driven Autonomous Cars. ICSE 2018. Slide adapted from SOSP’17 slides
Lu et al. NO Need to Worry about Adversarial Examples in
Object Detection in Autonomous Vehicles. CVPR’17.
Pei et al. DeepXplore: Automated Whitebox Testing of Deep Learning Systems. SOSP 2017.
30. Issues in Neural Machine Translation
Screen snapshots captured on April 5, 2018
• Overall better than statistical machine
translation
• But worse controllability
• Existing translation quality assurance
Need reference translation, not
applicable online
Cannot precisely locate problem
types
31. Translation Quality Assurance
● Key idea:black-box algorithms specialized for common problems
○ No need for reference translation; need only the original sentence and
generated translation
○ Precise problem localization
● Common problems
○ Under-translation
○ Over-translation
Zheng, Wang, Liu, Zhang, Zeng, Deng, Yang, He, Xie. Testing Untestable Neural Machine Translation: An Industrial Case. arXiv:1807.02340, July 2018.