How dynamic precondition checking can increase the correctness of the Extract Local Variable and Extract and Move Method refactorings. Master's presentation from the SATToSE conference.
Model-based testing of executable statecharts (Tom Mens)
This talk was presented at the international research seminar SATTOSE 2016 in Bergen, Norway (2016-07-12). We present various ways to test executable statechart models: using unit testing, behaviour-driven development, design by contract, property statecharts and regression testing. We illustrate these ideas on the basis of a running example of a microwave oven controller. We also present a proof-of-concept implementation of a statechart interpreter supporting all these ideas. The interpreter, called Sismic, is provided as an open source Python library. This research was carried out by Tom Mens and Alexandre Decan, Software Engineering Lab, University of Mons, Belgium.
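The unit-testing idea from the talk can be sketched in plain Python. This deliberately does not use Sismic's actual API; the controller, its states, and the event names are illustrative assumptions based on the microwave-oven running example:

```python
# A plain-Python sketch of unit-testing a statechart-like controller
# (NOT Sismic's API; states and event names are illustrative).

import unittest

class MicrowaveController:
    """Toy state machine with states 'closed', 'open', 'cooking'."""

    def __init__(self):
        self.state = 'closed'

    def send(self, event):
        # Transition table: (state, event) -> next state; unknown
        # combinations leave the state unchanged.
        transitions = {
            ('closed', 'door_opened'): 'open',
            ('open', 'door_closed'): 'closed',
            ('closed', 'start'): 'cooking',
            ('cooking', 'door_opened'): 'open',   # safety requirement
            ('cooking', 'timer_done'): 'closed',
        }
        self.state = transitions.get((self.state, event), self.state)
        return self.state

class TestMicrowave(unittest.TestCase):
    def test_opening_door_interrupts_cooking(self):
        oven = MicrowaveController()
        oven.send('start')
        self.assertEqual(oven.send('door_opened'), 'open')

    def test_start_is_ignored_while_door_open(self):
        oven = MicrowaveController()
        oven.send('door_opened')
        self.assertEqual(oven.send('start'), 'open')
```

Run with `python -m unittest`. The other techniques in the talk (design by contract, property statecharts) can be seen as layering additional assertions over such transition sequences.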
IUST advanced software engineering course by Dr. Saeed Parsa. Slide credits belong to Dr. Saeed Parsa and the IUST Reverse Engineering Research Laboratory. All slides are publicly available due to the COVID-19 pandemic.
Formal methods help improve the quality and reliability of software by providing proofs of correctness. However, ensuring the correctness of the verification tools that apply these formal methods is itself a much harder problem. A typical way to justify correctness is to provide soundness proofs based on semantic models. For program verifiers, these soundness proofs are quite large and complex. In this thesis, we introduce certified reasoning to provide machine-checked proofs of various components of an automated verification system. We develop new certified decision procedures (Omega++) and certified proofs (for compatible sharing), and integrate them with an existing automated verification system (HIP/SLEEK). We show that certified reasoning improves the correctness and expressivity of automated verification without sacrificing performance.
Keynote on software sustainability given at the 2nd Annual Netherlands eScience Symposium, November 2014.
Based on the article:
Carole Goble, "Better Software, Better Research," IEEE Internet Computing, vol. 18, no. 5, pp. 4-8, Sept.-Oct. 2014. IEEE Computer Society.
http://www.computer.org/csdl/mags/ic/2014/05/mic2014050004.pdf
http://doi.ieeecomputersociety.org/10.1109/MIC.2014.88
http://www.software.ac.uk/resources/publications/better-software-better-research
A Delft3D model was implemented to study the hydrodynamics of San Quintin Bay. Calibration and validation were successfully executed in previous research, but the uncertainties propagated through simulations of future conditions are mostly unknown and have not been tested in this region. Data Assimilation (DA) techniques play an important role here: their mathematical methods provide algorithms for combining observations of a dynamical system with computational models describing its evolution and any relevant prior information. The aim of this study was to make a comparative analysis of calibration methods versus DA, and to evaluate the long-term predictive capability of a model using sea surface height and current measurements taken within the bay. Delft3D-OpenDA is considered an effective tool for delivering real-time forecasting via the ensemble Kalman filter algorithm, and this automatic procedure is expected to yield an improved model forecast. We anticipate that an ensemble size of between 40 and 60 will provide the most accurately predicted water levels for San Quintin Bay by assimilating a single observation point located at the bay's entrance. New computational challenges will also be addressed, as well as means of reducing the computational costs of these implementations.
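The ensemble Kalman filter analysis step at the heart of such a setup can be illustrated with a textbook scalar sketch. This is not Delft3D-OpenDA's implementation; the observation value, noise level, and ensemble size below are illustrative assumptions:

```python
# Textbook scalar EnKF analysis step (a sketch, not Delft3D-OpenDA's
# implementation; observation value, noise, and ensemble size are
# illustrative assumptions).

import random

def enkf_analysis(ensemble, obs, obs_var, rng):
    """Nudge each forecast member toward a perturbed observation,
    weighted by the Kalman gain (observation operator H = 1)."""
    n = len(ensemble)
    mean = sum(ensemble) / n
    var = sum((x - mean) ** 2 for x in ensemble) / (n - 1)
    gain = var / (var + obs_var)   # ensemble variance vs. observation error
    return [x + gain * (obs + rng.gauss(0, obs_var ** 0.5) - x)
            for x in ensemble]

# A 50-member forecast of water level (m) at a single observation point,
# corrected with one tide-gauge measurement.
rng = random.Random(42)
forecast = [1.0 + rng.gauss(0, 0.3) for _ in range(50)]
analysis = enkf_analysis(forecast, obs=1.4, obs_var=0.05 ** 2, rng=rng)
```

The ensemble-size question in the study corresponds to choosing `n`: too few members underestimate the forecast variance and hence the gain.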
Ultra Fast Deep Learning in Hybrid Cloud Using Intel Analytics Zoo & Alluxio (Alluxio, Inc.)
Alluxio Global Online Meetup
Apr 23, 2020
For more Alluxio events: https://www.alluxio.io/events/
Speakers:
Jiao (Jennie) Wang, Intel
Tsai Louie, Intel
Bin Fan, Alluxio
Today, many people run deep learning applications with training data in separate storage, such as object storage or remote data centers. This presentation will demo the Intel Analytics Zoo + Alluxio stack, an architecture that enables high performance while keeping cost and resource efficiency balanced, without the network becoming an I/O bottleneck.
Intel Analytics Zoo is a unified data analytics and AI platform open-sourced by Intel. It seamlessly unites TensorFlow, Keras, PyTorch, Spark, Flink, and Ray programs into an integrated pipeline, which can transparently scale from a laptop to large clusters to process production big data. Alluxio, as an open-source data orchestration layer, accelerates data loading and processing in Analytics Zoo deep learning applications.
In this talk, we will go over:
- What is Analytics Zoo and how it works
- How to run Analytics Zoo with Alluxio in deep learning applications
- Initial performance benchmark results using the Analytics Zoo + Alluxio stack
A science-gateway for workflow executions: online and non-clairvoyant self-h... (Rafael Ferreira da Silva)
PhD Thesis presented on November 29th 2013 at INSA-Lyon
Abstract - Science gateways, such as the Virtual Imaging Platform (VIP), enable transparent access to distributed computing and storage resources for scientific computations. However, their large scale and the number of middleware systems involved lead to many errors and faults. In practice, science gateways are often backed by substantial support staff who monitor running experiments by performing simple yet crucial actions such as rescheduling tasks, restarting services, killing misbehaving runs, or replicating data files to reliable storage facilities. Fair quality of service (QoS) can then be delivered, yet with significant human intervention. Automating such operations is challenging for two reasons. First, the problem is online by nature because no reliable user activity prediction can be assumed, and new workloads may arrive at any time. Therefore, the considered metrics, decisions and actions have to remain simple and to yield results while the application is still executing. Second, it is non-clairvoyant due to the lack of information about applications and resources in production conditions. Computing resources are usually dynamically provisioned from heterogeneous clusters, clouds or desktop grids without any reliable estimate of their availability and characteristics. Models of application execution times are hardly available either, in particular on heterogeneous computing resources. In this thesis, we propose a general healing process for autonomous detection and handling of operational incidents in workflow executions. Instances are modeled as Fuzzy Finite State Machines (FuSM) whose state degrees of membership are determined by an external healing process. Degrees of membership are computed from metrics assuming that incidents have outlier performance, e.g. a site or a particular invocation behaving differently from the others.
Based on incident degrees, the healing process identifies incident levels using thresholds determined from the platform history. A specific set of actions is then selected from association rules among incident levels.
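The outlier-based incident degrees described above can be illustrated with a hypothetical sketch. The robust z-score metric and the saturation threshold are assumptions for illustration, not the thesis's exact formulation:

```python
# Hypothetical sketch of outlier-based incident degrees: a site whose
# metric deviates strongly from its peers gets a fuzzy membership
# degree near 1 (the z-score metric and scale are assumptions).

import statistics

def incident_degrees(metric_by_site, scale=3.0):
    """Map each site's metric to a degree in [0, 1] using a robust
    z-score against the other sites (median / MAD)."""
    values = list(metric_by_site.values())
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values) or 1e-9
    degrees = {}
    for site, v in metric_by_site.items():
        z = abs(v - med) / (1.4826 * mad)    # robust z-score
        degrees[site] = min(1.0, z / scale)  # saturate at 1
    return degrees

# Example: task error rates per site; siteC behaves like an outlier.
rates = {'siteA': 0.02, 'siteB': 0.03, 'siteC': 0.40, 'siteD': 0.025}
deg = incident_degrees(rates)
```

Thresholding such degrees per incident type, as the thesis describes, then yields discrete incident levels from which actions can be selected.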
For more information visit http://www.rafaelsilva.com
Scaling Deep Learning Algorithms on Extreme Scale Architectures (inside-BigData.com)
In this video from the MVAPICH User Group, Abhinav Vishnu from PNNL presents: Scaling Deep Learning Algorithms on Extreme Scale Architectures.
"Deep Learning (DL) is ubiquitous. Yet leveraging distributed memory systems for DL algorithms is incredibly hard. In this talk, we will present approaches to bridge this critical gap. We will start by scaling DL algorithms on large scale systems such as leadership class facilities (LCFs). Specifically, we will: 1) present our TensorFlow and Keras runtime extensions which require negligible changes in user-code for scaling DL implementations, 2) present communication-reducing/avoiding techniques for scaling DL implementations, 3) present approaches on fault tolerant DL implementations, and 4) present research on semi-automatic pruning of DNN topologies. Our results will include validation on several US supercomputer sites such as Berkeley's NERSC, Oak Ridge Leadership Class Facility, and PNNL Institutional Computing. We will provide pointers and discussion on the general availability of our research under the umbrella of Machine Learning Toolkit on Extreme Scale (MaTEx) available at http://github.com/matex-org/matex."
Watch the video: https://wp.me/p3RLHQ-hnZ
Introduction to Model-Based Machine Learning (Daniel Emaasit)
The field of machine learning has seen the development of thousands of learning algorithms. Typically, scientists choose from these algorithms to solve specific problems, their choices often limited by their familiarity with them. In this classical/traditional framework of machine learning, scientists are constrained to making certain assumptions so as to use an existing algorithm. This is in contrast to the model-based machine learning approach, which seeks to create a bespoke solution tailored to each new problem.
Design and Development of a Provenance Capture Platform for Data Science (Paolo Missier)
A talk given at the DATAPLAT workshop, co-located with the IEEE ICDE conference (May 2024, Utrecht, NL).
Data Provenance for Data Science is our attempt to provide a foundation to add explainability to data-centric AI.
It is a prototype, with lots of work still to do.
Big Data Europe is an EU-funded Horizon 2020 project that will undertake the foundational work for enabling European companies to build innovative multilingual products and services based on semantically interoperable, large-scale, multilingual data assets and knowledge, available under a variety of licenses and business models.
The Open PHACTS Discovery Platform is bringing together pharmacological data resources in an integrated, interoperable infrastructure, and has been developed to reduce barriers to drug discovery in industry, academia and for small businesses.
The first round of pilots for the Big Data Europe project is about to enter the evaluation phase. This also holds for Societal Challenge 1: Health. For this challenge, the Open PHACTS Foundation, Manchester University, and the VU Amsterdam are working on the Open PHACTS Docker image and its integration with the Big Data Europe infrastructure.
This presentation will give you:
- a general overview of the infrastructure and the status of the generic components that are being developed
- an outline of the Societal Challenge and the rationale for the pilot
- a look into the future pilot options
The intended audience is people acquainted with basic development tools like Docker and GitHub, with an interest in Big Data and drug discovery.
The OKE Challenge was hosted by the European Semantic Web Conference (ESWC), 3-7 June 2018, in Heraklion, Crete, Greece (Aldemar Knossos Royal & Royal Villa).
This work was supported by grants from the EU H2020 Framework Programme provided for the project HOBBIT (GA no. 688227).
An Empirical Study of Identical Function Clones in CRAN (Tom Mens)
Presentation of joint research by Maelick Claes at the IWSC 2015 workshop of SANER 2015 (Montreal, Canada), reporting on an empirical analysis of software code clones across R packages in CRAN, the reasons for these clones, and whether and how they could be removed.
Automating Environmental Computing Applications with Scientific Workflows (Rafael Ferreira da Silva)
Presentation held at the Environmental Computing Workshop on October 23, 2016
Abstract - Computational environmental science applications have evolved and become more complex over the last decade. In order to cope with the needs of such applications, computational methods and technologies have emerged to support the execution of these applications on heterogeneous, distributed systems. Among them are workflow management systems such as Pegasus. Pegasus is being used by researchers to model seismic wave propagation, to discover new celestial objects, to study RNA critical to human brain development, and to investigate other important research questions. This paper provides an introduction to scientific workflows and describes Pegasus and its main features. The paper highlights how the environmental science community has used Pegasus to automate their scientific workflow executions on high performance and high throughput computing systems by presenting three use cases: two Earth science workflows, and a climate science workflow.
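The core workflow idea can be sketched in a few lines of Python: tasks with declared dependencies, executed in topological order. This is a minimal sketch, not Pegasus's API; the task names and trivial task bodies are illustrative:

```python
# Minimal sketch of a scientific workflow (NOT Pegasus's API): a DAG of
# named tasks executed in dependency order.

from graphlib import TopologicalSorter  # stdlib, Python 3.9+

def run_workflow(tasks, deps):
    """tasks: dict name -> callable; deps: dict name -> set of prerequisites."""
    order = list(TopologicalSorter(deps).static_order())
    results = {}
    for name in order:
        # A real workflow system would dispatch each task to HPC/HTC
        # resources, stage data, and retry on failure.
        results[name] = tasks[name]()
    return order, results

# Example: fetch two inputs, then combine them.
tasks = {
    'fetch_a': lambda: 1,
    'fetch_b': lambda: 2,
    'combine': lambda: 'done',
}
deps = {'combine': {'fetch_a', 'fetch_b'}, 'fetch_a': set(), 'fetch_b': set()}
order, results = run_workflow(tasks, deps)
```

Systems like Pegasus add what this sketch omits: data staging, resource mapping, fault tolerance, and provenance tracking.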
H2O Deep Learning through Examples, Silicon Valley Big Data Science Meetup, Mountain View, 2/12/15
http://www.meetup.com/Silicon-Valley-Big-Data-Science/events/219790984/?a=md1_grp&rv=md1&_af_eid=219790984&_af=event
Live Stream: http://new.livestream.com/accounts/10932136/events/3806139
H2O.ai's basic components and model deployment pipeline are presented, along with a benchmark of the scalability, speed, and accuracy of machine learning libraries for classification, from https://github.com/szilard/benchm-ml.
Machine learning key to your formulation challenges (Marc Borowczak)
You develop pharmaceutical, cosmetic, food, industrial, or civil engineered products, and are often confronted with the challenge of blending and formulating to meet process or performance properties. While traditional research and development does approach the problem with experimentation, it generally involves design, time, and resource constraints, and can be slow, expensive, and often redundant, quickly forgotten, or obsolete.
Consider the alternative that machine learning tools offer today. We will show that this approach is not only quick and efficient but, we will argue, the way the front end of innovation should proceed, and that it is particularly well suited to formulation and classification.
Today, we will explain how machine learning can shed new light on this generic and very persistent formulation challenge. We will discuss the other important aspect of classification and clustering often associated with these formulation challenges in a forthcoming communication.
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23... (John Andrews)
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Techniques to optimize the PageRank algorithm usually fall into two categories: reducing the work per iteration, and reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices that have already converged has the potential to save iteration time. Skipping in-identical vertices (those with the same in-links) helps avoid duplicate computations and can thus also reduce iteration time. Road networks often have chains that can be short-circuited before the PageRank computation to improve performance, since the final ranks of chain nodes are easy to calculate; this can reduce both the iteration time and the number of iterations. If a graph has no dangling nodes, the PageRank of each strongly connected component can be computed in topological order, which can reduce the iteration time and the number of iterations, and also enables multi-iteration concurrency in the PageRank computation. The combination of all of the above methods is the STICD algorithm [sticd]. For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
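The first of these optimizations, skipping already-converged vertices, can be sketched with a plain power-iteration PageRank. This is a minimal sketch under stated assumptions (no dangling nodes, a per-vertex tolerance; freezing converged vertices is a heuristic, as the description above notes):

```python
# Power-iteration PageRank that skips recomputation of vertices whose
# rank change has fallen below a per-vertex tolerance (a heuristic:
# frozen vertices are no longer updated even if neighbours change).

def pagerank_skip_converged(graph, d=0.85, tol=1e-10, max_iter=100):
    """graph: dict node -> list of out-neighbours (no dangling nodes)."""
    nodes = list(graph)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    # Precompute in-links once.
    in_links = {u: [] for u in nodes}
    for u, outs in graph.items():
        for v in outs:
            in_links[v].append(u)
    converged = set()
    for _ in range(max_iter):
        new_rank = {}
        for u in nodes:
            if u in converged:
                new_rank[u] = rank[u]          # skip converged vertex
                continue
            s = sum(rank[v] / len(graph[v]) for v in in_links[u])
            new_rank[u] = (1 - d) / n + d * s
            if abs(new_rank[u] - rank[u]) < tol:
                converged.add(u)
        rank = new_rank
        if len(converged) == n:
            break
    return rank

# Example: a 3-node cycle converges to the uniform distribution.
g = {'a': ['b'], 'b': ['c'], 'c': ['a']}
pr = pagerank_skip_converged(g)
```

The other optimizations (in-identical vertices, chain short-circuiting, SCC ordering) would layer preprocessing steps on top of this same iteration core.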
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues.
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
As Europe's leading economic powerhouse and the fourth-largest #economy globally, Germany stands at the forefront of innovation and industrial might. Renowned for its precision engineering and high-tech sectors, Germany's economic structure is heavily supported by a robust service industry, accounting for approximately 68% of its GDP. This economic clout and strategic geopolitical stance position Germany as a focal point in the global cyber threat landscape.
In the face of escalating global tensions, particularly those emanating from geopolitical disputes with nations like #Russia and #China, #Germany has witnessed a significant uptick in targeted cyber operations. Our analysis indicates a marked increase in #cyberattack sophistication aimed at critical infrastructure and key industrial sectors. These attacks range from ransomware campaigns to #AdvancedPersistentThreats (#APTs), threatening national security and business integrity.
🔑 Key findings include:
🔍 Increased frequency and complexity of cyber threats.
🔍 Escalation of state-sponsored and criminally motivated cyber operations.
🔍 Active dark web exchanges of malicious tools and tactics.
Our comprehensive report delves into these challenges, using a blend of open-source and proprietary data collection techniques. By monitoring activity on critical networks and analyzing attack patterns, our team provides a detailed overview of the threats facing German entities.
This report aims to equip stakeholders across public and private sectors with the knowledge to enhance their defensive strategies, reduce exposure to cyber risks, and reinforce Germany's resilience against cyber threats.
1. Making Software Refactorings Safer
Anna Maria Eilertsen
@sowhow
Supervised by: Anya Bagge (Inst. for Informatikk, Universitetet i Bergen) and Volker Stolz (Inst. for Data- og Realfag, Høgskolen i Bergen), Norway
July 12, 2016
The results of this thesis have been accepted as a paper at the ISoLA 2016 conference (http://www.isola-conference.org/isola2016/).
Anna Maria Eilertsen @sowhow (Inst. for Informatikk, Universitetet i Bergen; Inst. for Data- og Realfag, Høgskolen i Bergen), Making Software Refactorings Safer, July 12, 2016
2. Software Refactorings
“Behaviour preserving program transformation”
3. Software Refactoring Tools
4. Unsafe Refactorings
“The primary risk is regression, mostly from misunderstanding subtle corner cases in the original code and not accounting for them in the refactored code.”
– interviewee, Microsoft developer, Kim et al., 2012
5. Unsafe Refactoring Example
Extract Local Variable
In Java/Eclipse:
Before:

public void f() {
    x.n();
    setX();
    x.n();
}

After:

public void f() {
    X temp = x;
    temp.n();
    setX();
    temp.n();
}
6. An analysis problem
x = new X();
setX(); // x = new X();

public void f() {
    X temp = x;
    temp.n();
    setX();
    temp.n();
}

Solution:

assert temp == x;
* Dog art from Hyperbole and a Half
7. Extract Local Variable
Simplified example:
public class C {
    public X x = new X();
    { // initializer
        x.myC = this;
    }

    public void f() {
        x.n();
        x.m();
        x.n();
    }
}

public class X {
    public C myC;

    public void m() {
        myC.x = new X();
    }

    public void n() {
        System.out.println(this.hashCode());
    }
}

Output:
1735600054
21685669
8. Extract Local Variable
Refactored:
public class C {
    public X x = new X();
    { // initializer
        x.myC = this;
    }

    public void f() {
        X temp = x;
        temp.n();
        temp.m();
        temp.n();
    }
}

public class X {
    public C myC;

    public void m() {
        myC.x = new X();
    }

    public void n() {
        System.out.println(this.hashCode());
    }
}

Output:
1735600054
1735600054
9. Extract Local Variable
With dynamic checks:
public class C {
    public X x = new X();
    { // initializer
        x.myC = this;
    }
    public void f() {
        X temp = x;
        assert temp == x;
        temp.n();
        assert temp == x;
        temp.m();
        assert temp == x;
        temp.n();
    }
}

public class X {
    public C myC;

    public void m() {
        myC.x = new X();
    }

    public void n() {
        System.out.println(this.hashCode());
    }
}

Output:
1735600054
Exception in thread "main" java.lang.AssertionError
10. Extract And Move Method
A similar problem:
public class C {
    public X x = new X();
    { // initializer
        x.myC = this;
    }
    public void f() {
        x.bar(this);
    }
}

public class X {
    ...
    void bar(C c) {
        this.n();
        assert this == c.x;
        this.m();
        assert this == c.x;
        this.n();
    }
}

Similar how?
Evaluate x once.
Refer to that value by this.
Substitute every occurrence of x with this.
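The dynamic precondition for Extract And Move Method can be exercised in a standalone sketch. This mirrors the slide's classes, but replaces the assert statements with explicit checks (so the violation is reported even without the JVM's -ea flag) and leaves n() empty instead of printing a hash code:

```java
// m() reassigns the field the caller evaluated, so the second dynamic
// precondition check inside the moved method fires.
public class ExtractMoveDemo {
    static class C {
        X x = new X();
        { x.myC = this; } // initializer, as on the slides
        void f() { x.bar(this); }
    }

    static class X {
        C myC;
        void m() { myC.x = new X(); } // reassigns the field `this` was read from
        void n() { }
        void bar(C c) {
            n();
            if (this != c.x) throw new IllegalStateException("precondition violated");
            m();
            if (this != c.x) throw new IllegalStateException("precondition violated after m()");
            n();
        }
    }

    public static void main(String[] args) {
        try {
            new C().f();
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // prints: precondition violated after m()
        }
    }
}
```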
11. Experiment: Case study
Case: Eclipse JDT UI source code
Experiment:
Execute our modified refactorings on Eclipse JDT UI project
Run Eclipse test suite
Look for triggered asserts
Profit!!
Need custom automated refactoring tool.
12. Experiment: Results
                          Extract Local Variable   Extract And Move Method
Executed refactorings               4538                    755
Total number of asserts             7665                    610
Resulting compile errors               0                     14
Successful Tests                    2392                   2151
Unsuccessful Tests                     4                    245
Asserts triggered                2 / 136*                     0

* 136 instances of the same 2 assert statements
13. Discussion
Take-away and questions:
                          Extract Local Variable   Extract And Move Method
Executed refactorings               4538                    755
Total number of asserts             7665                    610
Resulting compile errors               0                     14
Successful Tests                    2392                   2151
Unsuccessful Tests                     4                    245
Asserts triggered                  2 / 136                    0
Dynamic preconditions can be useful!
Assert statements are incomplete.
Show or hide the asserts from the programmer?
Is reference equivalence too strict?
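A hypothetical illustration of that last question: if setX() installs a new but state-identical object, the refactored code still behaves the same even though the reference precondition (temp == x) fires. The String-based example below is invented for illustration; it is not from the slides.

```java
// Reference equality flags a change of identity even when the observable
// value, and hence the program's behaviour, is unchanged.
public class EqualityDemo {
    static String x = "a";
    static void setX() { x = new String("a"); } // new identity, same value

    public static void main(String[] args) {
        String temp = x;
        setX();
        System.out.println(temp == x);      // false: reference check fires
        System.out.println(temp.equals(x)); // true: behaviour preserved
    }
}
```

So a reference-equality assert is sound but conservative: it can reject refactorings that are behaviour-preserving in practice.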
Thank you!
* Art with face is from Hyperbole and a Half
14. Experiment: Development
Eclipse refactoring plug-in
Modify Eclipse’s refactorings to introduce asserts
Extract Local Variable
Extract And Move Method
Automate refactoring process
Execute on Java project
One refactoring per method
Custom heuristic for finding refactoring targets