In this talk we will discuss modern challenges and research directions in bioinformatics. First, we will discuss computational proteomics, which is dedicated to designing algorithms that identify proteins and peptides in mass-spectrometry data from biological samples. We will also discuss how machine learning techniques can improve the validation of peptide identifications. Second, we will discuss computational immunology and how data science helps analyze data from transplantation patients; again, we use machine learning to learn models that distinguish between normal and critical progressions. Third, we will discuss machine learning approaches for building cancer-prediction models and virtual tumor markers on the basis of clinical data from the General Hospital Linz.
Software engineering research often requires analyzing
multiple revisions of several software projects, be it to make and
test predictions or to observe and identify patterns in how software evolves. However, code analysis tools are almost exclusively designed for the analysis of one specific version of the code, and the time and resource requirements grow linearly with each additional revision to be analyzed. Thus, code studies often observe a relatively small number of revisions and projects. Furthermore, each programming ecosystem provides dedicated tools, so researchers typically analyze code in only one language, even when researching topics that should generalize
to other ecosystems. To alleviate these issues, frameworks and models have been developed to combine analysis tools or automate the analysis of multiple revisions, but little research has gone into actually removing redundancies in multi-revision, multi-language code analysis. We present a novel end-to-end approach that systematically avoids redundancies every step of the way: when reading sources from version control, during parsing, in the internal code representation, and during the actual analysis. We evaluate our open-source implementation, LISA, on the full
history of 300 projects, written in 3 different programming languages, computing basic code metrics for over 1.1 million program revisions. When analyzing many revisions, LISA requires less than a second on average to compute basic code metrics for all files in a single revision, even for projects consisting of millions of lines of code.
Leveraging NLP and Deep Learning for Document Recommendations in the Cloud (Databricks)
Efficient recommender systems are critical to the success of many industries, such as job recommendation, news recommendation, and e-commerce. This talk will illustrate how to build an efficient document recommender system by leveraging Natural Language Processing (NLP) and Deep Neural Networks (DNNs). The end-to-end flow of the document recommender system is built on AWS at scale, using Analytics Zoo for Spark and BigDL. The system first turns text-rich documents into embeddings using Global Vectors (GloVe), then trains a K-means model with native Spark APIs to cluster users into several groups. The system further trains a recommender model for each group and produces an ensemble prediction for each test record. By adopting the end-to-end Analytics Zoo pipeline, we saw roughly a 10% improvement in mean reciprocal rank and a 6% improvement in precision compared to the search-based recommendations in a job recommendation study.
Speaker: Guoqiong Song
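The embed-cluster-ensemble pipeline described above can be sketched in a few lines of framework-free Python. This is only an illustration of the idea, not the talk's implementation: the tiny word-vector table and the documents are made up, and the real system uses pretrained GloVe vectors and Spark's distributed KMeans.

```python
# Sketch: GloVe-style word vectors -> document embeddings -> k-means user groups.
import random

WORD_VECS = {  # toy 2-d "GloVe" vectors (illustrative values only)
    "data": [0.9, 0.1], "spark": [0.8, 0.2],
    "cook": [0.1, 0.9], "recipe": [0.2, 0.8],
}

def embed(doc):
    """Average the word vectors of known words (GloVe-style doc embedding)."""
    vecs = [WORD_VECS[w] for w in doc.lower().split() if w in WORD_VECS]
    n = max(len(vecs), 1)
    return [sum(v[i] for v in vecs) / n for i in range(2)]

def assign(p, centers):
    """Index of the nearest center (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda c: sum((p[i] - centers[c][i]) ** 2 for i in range(len(p))))

def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm; Spark's KMeans plays this role at scale."""
    random.seed(0)
    centers = [list(p) for p in random.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[assign(p, centers)].append(p)
        centers = [[sum(p[i] for p in g) / len(g) for i in range(len(p))]
                   if g else centers[j] for j, g in enumerate(groups)]
    return centers

docs = ["spark data data", "data spark", "cook recipe", "recipe cook cook"]
embs = [embed(d) for d in docs]
centers = kmeans(embs, k=2)
labels = [assign(e, centers) for e in embs]  # one recommender model per group
```

In the full system, a separate recommender model would then be trained per cluster and the per-model scores combined into an ensemble prediction.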
Apache Hadoop has emerged as the storage and processing platform of choice for Big Data. In this tutorial, I will give an overview of Apache Hadoop and its ecosystem, with specific use cases. I will explain the MapReduce programming framework in detail and outline how it interacts with the Hadoop Distributed File System (HDFS). While Hadoop is written in Java, MapReduce applications can be written in a variety of languages via a framework called Hadoop Streaming. I will give several examples of MapReduce applications using Hadoop Streaming.
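A minimal Hadoop Streaming word-count pair, in the spirit of the examples the tutorial describes, looks like this. With Streaming, the mapper and reducer are ordinary programs that read stdin and write stdout; Hadoop sorts the mapper output by key before the reducer runs, which is what the reducer below relies on.

```python
#!/usr/bin/env python
# Word count as a Hadoop Streaming mapper/reducer pair.
import sys

def mapper(lines):
    """Emit 'word\t1' for every word (the map phase)."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"

def reducer(lines):
    """Sum counts per word; input must be sorted by key, as Hadoop guarantees."""
    current, total = None, 0
    for line in lines:
        word, count = line.rsplit("\t", 1)
        if word != current:
            if current is not None:
                yield f"{current}\t{total}"
            current, total = word, 0
        total += int(count)
    if current is not None:
        yield f"{current}\t{total}"

if __name__ == "__main__" and sys.argv[1:] in (["map"], ["reduce"]):
    # Run as either stage depending on the command-line argument.
    stage = mapper if sys.argv[1] == "map" else reducer
    sys.stdout.writelines(l + "\n" for l in stage(sys.stdin))
```

The job would be submitted with the hadoop-streaming jar (its path varies by distribution), passing this script as both `-mapper "wordcount.py map"` and `-reducer "wordcount.py reduce"`.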
Introduction to Mahout and Machine Learning (Varad Meru)
This presentation gives an introduction to Apache Mahout and machine learning. It presents some of the important machine learning algorithms implemented in Mahout. Machine learning is a vast subject; this presentation is only an introductory guide to Mahout and does not go into lower-level implementation details.
In this deck from the Argonne Training Program on Extreme-Scale Computing 2019, Howard Pritchard from LANL and Simon Hammond from Sandia present: NNSA Explorations: ARM for Supercomputing.
"The Arm-based Astra system at Sandia will be used by the National Nuclear Security Administration (NNSA) to run advanced modeling and simulation workloads for addressing areas such as national security, energy and science.
"By introducing Arm processors with the HPE Apollo 70, a purpose-built HPC architecture, we are bringing powerful elements, like optimal memory performance and greater density, to supercomputers that existing technologies in the market cannot match,” said Mike Vildibill, vice president, Advanced Technologies Group, HPE. “Sandia National Laboratories has been an active partner in leveraging our Arm-based platform since its early design, and featuring it in the deployment of the world’s largest Arm-based supercomputer, is a strategic investment for the DOE and the industry as a whole as we race toward achieving exascale computing.”
Watch the video: https://wp.me/p3RLHQ-l29
Learn more: https://insidehpc.com/2018/06/arm-goes-big-hpe-builds-petaflop-supercomputer-sandia/
and
https://extremecomputingtraining.anl.gov/agenda-2019/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Provenance for Data Munging Environments (Paul Groth)
Data munging is a crucial task across domains ranging from drug discovery and policy studies to data science. Indeed, it has been reported that data munging accounts for 60% of the time spent in data analysis. Because data munging involves a wide variety of tasks using data from multiple sources, it often becomes difficult to understand how a cleaned dataset was actually produced (i.e. its provenance). In this talk, I discuss our recent work on tracking data provenance within desktop systems, which addresses the problems of efficient and fine-grained capture. I also describe our work on scalable provenance tracking within a triple store/graph database that supports messy web data. Finally, I briefly touch on whether we will move from ad hoc data munging approaches to more declarative knowledge representation languages such as Probabilistic Soft Logic.
Presented at Information Sciences Institute - August 13, 2015
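The core idea of fine-grained provenance capture can be illustrated with a small sketch: wrap each munging step so that every call is logged with its inputs and output, allowing the lineage of the cleaned dataset to be reconstructed afterwards. The names and the log format here are purely illustrative, not the speaker's system.

```python
# Sketch: record provenance for each data-munging transformation.
import functools

PROVENANCE = []  # append-only log of {step, inputs, output} records

def tracked(fn):
    """Wrap a munging step so every call is logged with its arguments."""
    @functools.wraps(fn)
    def wrapper(*args):
        out = fn(*args)
        PROVENANCE.append({"step": fn.__name__, "inputs": args, "output": out})
        return out
    return wrapper

@tracked
def strip_blanks(rows):
    """Drop empty rows."""
    return [r for r in rows if r.strip()]

@tracked
def lowercase(rows):
    """Normalize case."""
    return [r.lower() for r in rows]

cleaned = lowercase(strip_blanks(["Foo", "", "BAR"]))
lineage = [rec["step"] for rec in PROVENANCE]  # how `cleaned` was produced
```

A real capture system must also handle steps performed outside any instrumented API, which is where the efficient desktop-level capture discussed in the talk comes in.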
Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ... (Gruter)
Apache Tajo is an open-source big data warehouse system on Hadoop. These slides present two efforts to improve performance in the Tajo project. The first is query optimization, including cost-based join ordering and progressive optimization. The second is JIT-based vectorized processing.
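The difference between classic tuple-at-a-time execution and the vectorized model Tajo moves toward can be illustrated with a plain-Python sketch (Tajo's actual implementation JIT-compiles these loops): operating on column batches amortizes per-tuple interpretation overhead across many values.

```python
# Sketch: tuple-at-a-time vs. vectorized (columnar-batch) filtering.
def filter_tuple_at_a_time(rows, threshold):
    """Volcano-style iteration: one predicate evaluation per row."""
    out = []
    for row in rows:              # each row is a (price, qty) tuple
        if row[0] > threshold:
            out.append(row)
    return out

def filter_vectorized(prices, qtys, threshold):
    """Columnar batch: evaluate the predicate over a whole column at once."""
    keep = [p > threshold for p in prices]   # selection vector
    return ([p for p, k in zip(prices, keep) if k],
            [q for q, k in zip(qtys, keep) if k])

rows = [(5, 2), (12, 1), (9, 3)]
prices, qtys = [5, 12, 9], [2, 1, 3]
```

Both produce the same result; the vectorized form is what a JIT can compile into a tight loop over a contiguous column.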
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech... (Databricks)
In this session, you will learn how CERN easily applied end-to-end deep learning and analytics pipelines on Apache Spark at scale for High Energy Physics using BigDL and Analytics Zoo open source software running on Intel Xeon-based distributed clusters.
Technical details and development learnings will be shared using an example of topology classification to improve real-time event selection at the Large Hadron Collider experiments. The classifier has demonstrated very good performance figures for efficiency, while also reducing the false positive rate compared to the existing methods. It could be used as a filter to improve the online event selection infrastructure of the LHC experiments, where one could benefit from a more flexible and inclusive selection strategy while reducing the amount of downstream resources wasted in processing false positives.
This is part of CERN’s research on applying deep learning and analytics using open-source and industry-standard technologies as an alternative to the existing customized rule-based methods. We show how we could quickly build and implement distributed deep learning solutions and data pipelines at scale on Apache Spark using Analytics Zoo and BigDL, open-source frameworks that unify analytics and AI on Spark with easy-to-use APIs and development interfaces seamlessly integrated with big data platforms.
Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic... (Databricks)
The physicists at CERN are increasingly turning to Spark to process large physics datasets in a distributed fashion, with the aim of reducing time-to-physics through increased interactivity. The physics data itself is stored in CERN’s mass storage system, EOS, and CERN’s IT department runs an on-premise private cloud based on OpenStack to provide on-demand compute resources to physicists. This presents both an opportunity and challenges for the Big Data team at CERN in providing an elastic, scalable, reliable Spark-as-a-service on OpenStack.
The talk focuses on the design choices made and challenges faced while developing Spark-as-a-service on Kubernetes on OpenStack to simplify provisioning, automate management, and minimize the operating burden of managing Spark clusters. In addition, the service tooling simplifies submitting applications on behalf of users, mounting user-specified ConfigMaps, copying application logs to S3 buckets for troubleshooting, performance analysis and accounting of Spark applications, and supporting stateful Spark streaming applications. We will also share results from running large-scale sustained workloads over terabytes of physics data.
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018 (Codemotion)
Monitoring an entire application is not a simple task, but with the right tools it is not a hard one either. However, events like Black Friday can push your application to the limit and even cause crashes. As the system is stressed, it generates many more logs, which may crash the monitoring system as well. In this talk I will walk through best practices for using the Elastic Stack to centralize and monitor your logs. I will also share some tricks to help you cope with the huge increase in traffic typical of Black Friday.
In this video from the DDN User Group at SC16, Pamela Hill, Computational & Information Systems Laboratory, NCAR/UCAR, presents: NCAR’s Evolving Infrastructure for Weather and Climate Research.
Watch the video presentation: http://wp.me/p3RLHQ-g4a
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Hardware Acceleration of Apache Spark on Energy-Efficient FPGAs with Christof... (Spark Summit)
In this talk, we will present the SPynq framework: a framework for the efficient mapping and acceleration of Spark applications on heterogeneous all-programmable MPSoC-based platforms, such as Zynq. Spark has been mapped to the Pynq platform, and the proposed framework allows the seamless utilization of the programmable logic for the hardware acceleration of computationally intensive Spark kernels. We have also developed the required libraries in Spark, by extending the MLlib library, that hide the accelerator’s details to minimize the design effort of utilizing the accelerators. A cluster of 4 nodes (workers) based on the all-programmable MPSoCs has been implemented, and the proposed platform is evaluated on a typical machine learning application based on logistic regression. The logistic regression kernel has been developed as an accelerator and integrated into Spark. The developed system is compared to a high-performance Xeon cluster of the kind typically used in cloud computing. The performance evaluation shows that the heterogeneous accelerator-based MPSoC can achieve up to 2.3x system speedup compared with a Xeon system (at 90% accuracy) and 20x better energy efficiency. For embedded applications, the proposed system can achieve up to 40x speedup compared to a software-only implementation on low-power embedded processors, with 30x lower energy consumption.
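For readers unfamiliar with the offloaded kernel: logistic regression training is an iterative gradient computation, which is exactly the kind of dense arithmetic that maps well onto FPGA fabric. A minimal pure-Python version of the computation (the talk's version lives in Spark MLlib and runs partly in programmable logic; this sketch only shows what is being accelerated):

```python
# Sketch: logistic regression trained by gradient descent on the log-loss.
import math

def train_logreg(X, y, lr=0.5, epochs=500):
    """Per-sample gradient descent; the inner dot products are the hot loop."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # sigmoid
            err = p - yi                     # gradient of log-loss w.r.t. z
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict(w, b, xi):
    """Classify by the sign of the linear score."""
    z = sum(wj * xj for wj, xj in zip(w, xi)) + b
    return 1 if z > 0 else 0

# Linearly separable toy data: label 1 roughly iff x0 + x1 is large.
X = [[0.0, 0.0], [1.0, 1.0], [0.2, 0.1], [0.9, 0.8]]
y = [0, 1, 0, 1]
w, b = train_logreg(X, y)
```

The dot products and the per-element weight updates in the inner loop are what the SPynq accelerator offloads to the programmable logic.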
Using Big Data techniques to query and store OpenStreetMap data. Stephen Knox... (huguk)
This talk will describe the speaker’s research into using Hadoop to query and manage big geographic datasets, specifically OpenStreetMap (OSM). OSM is an “open-source” map of the world, growing rapidly and currently around 5TB of data. The talk will introduce OSM, detail some aspects of the research, and also discuss his experiences with using the SpatialHadoop stack on Azure and Google Cloud.
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su... (Linaro)
Event: Arm Architecture HPC Workshop by Linaro and HiSilicon
Location: Santa Clara, CA
Speaker: Andrew J Younge
Talk Title: Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Supercomputing
Talk Desc: The Vanguard program looks to expand the potential technology choices for leadership-class High Performance Computing (HPC) platforms, not only for the National Nuclear Security Administration (NNSA) but for the Department of Energy (DOE) and the wider HPC community. Specifically, there is a need to expand the supercomputing ecosystem by investing in and developing emerging, yet-to-be-proven technologies, addressing hardware and software challenges together, and proving out the viability of such novel platforms for production HPC workloads.
The first deployment of the Vanguard program will be Astra, a prototype petascale Arm supercomputer to be sited at Sandia National Laboratories during 2018. This talk will focus on the architectural details of Astra and the significant investments being made toward maturing the Arm software ecosystem. Furthermore, we will share initial performance results from our pre-general-availability testbed system and outline several planned research activities for the machine.
Bio: Andrew Younge is an R&D Computer Scientist at Sandia National Laboratories in the Scalable System Software group. His research interests include cloud computing, virtualization, distributed systems, and energy-efficient computing. Andrew has a Ph.D. in Computer Science from Indiana University, where he was the Persistent Systems fellow and a member of the FutureGrid project, an NSF-funded experimental cyberinfrastructure testbed. Over the years, Andrew has held visiting positions at the MITRE Corporation, the University of Southern California / Information Sciences Institute, and the University of Maryland, College Park. He received his Bachelor’s and Master’s of Science from the Computer Science Department at the Rochester Institute of Technology (RIT) in 2008 and 2010, respectively.
Linear algebra is a fundamental tool in many fields of science and technology. It is particularly important in physics, engineering, computer science, and statistics. The ability to efficiently manipulate large amounts of data and complex matrices is essential in these areas for problem solving and decision making.
At first glance, it may seem that linear algebra is far removed from our daily lives. However, techniques such as singular value decomposition and linear regression, used to train models and make accurate predictions, underpin artificial intelligence and machine learning. Does ChatGPT ring a bell? It may not look like it, but linear algebra is also behind some of its processes. For this reason, we must keep working in this field, as its importance will only grow as ever larger amounts of data are generated and analyzed in today's world.
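As a concrete instance of the linear algebra techniques mentioned above, here is a simple linear regression fit with the closed-form least-squares solution in pure Python (for the general multivariate case one would typically use an SVD-based solver such as `numpy.linalg.lstsq`):

```python
# Sketch: least-squares fit of y = a*x + b for a single predictor.
def fit_line(xs, ys):
    """Closed-form least-squares slope and intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)            # variance term
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))  # covariance term
    a = sxy / sxx
    b = my - a * mx
    return a, b

a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])   # data lies exactly on y = 2x + 1
```

This is the same normal-equations machinery that, scaled up to huge matrices, sits underneath the model training pipelines referred to in the talk.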
The COVID-19 pandemic has brought a proliferation of maps and counter-maps. Civil-society organizations and social movements have produced their own interpretations and representations of the data on the crisis, and these have helped make visible aspects, subjects, and topics that were neglected or underrepresented in the hegemonic, dominant visualizations. In this context, this talk focuses on the analysis of the social imaginaries involved in map-making during the pandemic; that is, it explores the importance of maps for digital activism, the potential drawn from this technology, and the values associated with the visualizations created with them. The ultimate goal is to reflect on the emerging avenue of data activism, as well as on the intersection between social imaginaries and digital geography.
Similar to Proteomics, Computational Immunology and Machine Learning - Bioinformatics Research at FH Upper Austria, Hagenberg
Designing RISC-V-based Accelerators for next generation Computers (DRAC) is a 3-year project (2019-2022) funded by the ERDF Operational Program of Catalonia 2014-2020. DRAC will design, verify, implement and fabricate a high-performance general-purpose processor that will incorporate different accelerators based on RISC-V technology, with specific applications in the fields of post-quantum security, genomics and autonomous navigation. In this talk, we will provide an overview of the main achievements of the DRAC project, including the fabrication of Lagarto, the first RISC-V processor developed in Spain.
This talk will begin by introducing the uElectronics section of ESA at ESTEC and the general activities the group is responsible for. It will then go through some of the ongoing R+D activities that the group is involved in, hand in hand with universities and/or companies. One of the major ones concerns the European rad-hard FPGAs that have been partially funded by ESA for several years and that will play a major role in the sector in the upcoming years. Also worth discussing are the RTL soft IPs currently under development, which will allow us to keep providing the European ecosystem with some key capabilities. The last part will be an overview of ongoing RISC-V space-hardened activities that might replace the current SPARC-based processors available for our missions.
The aim of this talk is to present the latest additions to the ARM architecture and to describe trends in the microarchitecture of ARM processors. ARM is a relatively small company compared with other giants of the technology sector. However, the wide adoption of its architecture, clearly dominant in some sectors, and of its microarchitectures places ARM technology at the center of today's technological development. ARM technology is present across practically the entire technology spectrum, from the simplest devices to HPC and cloud computing, by way of smartphones, automotive, consumer electronics, etc.
Formal verification has been used by computer scientists for decades to prevent software bugs. However, with a few exceptions, it has not been used by researchers working in most areas of mathematics (geometry, algebra, analysis, etc.). In this talk, we will discuss how this has changed in the past few years, and the possible implications for the future of mathematical research, teaching and communication.
We will focus on the theorem prover Lean and its mathematical library mathlib, since this is currently the system most widely used by mathematicians. Lean is a functional programming language and interactive theorem prover based on dependent type theory, with proof irrelevance and non-cumulative universes. The mathlib library, open-source and designed as a basis for research-level mathematics, is one of the largest collections of formalized mathematics. It allows classical reasoning, uses large- and small-scale automation, and is characterized by its decentralized nature, with over 200 contributors including both computer scientists and mathematicians.
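As a small taste of what working in Lean looks like, here is a minimal illustrative example (not taken from the talk): a commutativity statement about natural numbers, proved once as a term and once with a tactic, in Lean 4 syntax.

```lean
-- Commutativity of natural-number addition, as a term-mode proof:
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- The same statement discharged by a decision procedure (tactic mode):
example (a b : Nat) : a + b = b + a := by
  omega
```

In mathlib, statements like this are already available as library lemmas; the interesting work lies in formalizing research-level mathematics on top of such building blocks.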
Part of the research community thinks that it is still too early to tackle the development of quantum software engineering techniques, the reason being that what the quantum computers of the future will look like is still unknown. However, there are some facts we can affirm today: 1) quantum and classical computers will coexist, each dedicated to the tasks at which it is most efficient; 2) quantum computers will be part of the cloud infrastructure and will be accessible through the Internet; 3) complex software systems will be made up of smaller pieces that collaborate with each other; 4) some of those pieces will be quantum, so the systems of the future will be hybrid; 5) the coexistence and interaction between the components of such hybrid systems will be supported by service composition: quantum services.
This talk analyzes the challenges that the integration of quantum services poses to Service-Oriented Computing.
In this talk, after a brief overview of AI concepts, in particular Machine Learning (ML) techniques, some of the well-known computer design concepts for high performance and power efficiency are presented. Subsequently, the techniques that have had a promising impact on computing ML algorithms are discussed. Deep learning has emerged as a game changer for many applications in various fields of engineering and the medical sciences. Although the primary computation is matrix-vector multiplication, many competing efficient implementations of this primary function have been proposed and put into practice. This talk will review and compare some of the techniques used in ML computer design.
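Since the abstract identifies matrix-vector multiplication as the primary kernel, here is a minimal sketch (with invented toy values) of the inner loop that such hardware techniques pipeline and parallelize:

```python
import numpy as np

# The dominant kernel in many ML workloads is the matrix-vector product
# y = W @ x; accelerator designs target exactly this nested loop.
W = np.array([[1.0, 2.0], [3.0, 4.0]])  # weight matrix (2x2, illustrative)
x = np.array([0.5, -1.0])               # input activation vector

# Explicit loop form (what an accelerator unrolls across its PEs):
y = np.zeros(2)
for i in range(2):          # one output element per row
    for j in range(2):      # multiply-accumulate along the row
        y[i] += W[i, j] * x[j]

assert np.allclose(y, W @ x)  # matches the vectorized library call
print(y)
```

Each output element is an independent dot product, which is why spatial architectures can compute the rows in parallel.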
After a brief introduction to medical informatics and a few practical notes on Artificial Intelligence (a possible consensus definition, strong vs. weak AI, and commonly used techniques and methods), the central block of the talk presents practical examples (in the form of success stories) of developments carried out by the Next Generation Computer Systems group (SING: http://sing-group.org/) in the areas of (i) clinical informatics (InNoCBR, PolyDeep), (ii) informatics for clinical research (PathJam, WhichGenes), (iii) translational bioinformatics (genomics: ALTER; proteomics: DPD, BI, BS, Mlibrary, Mass-Up; and omics data integration: PunDrugs) and (iv) public health informatics (CURMIS4th). Finally, the importance that interpretable AI (XAI, Explainable Artificial Intelligence) and human participation (HITL, Human-In-The-Loop) are expected to have in the near future is briefly discussed. The talk closes with a short reflection on the lessons the speaker has learned after more than 16 years developing intelligent systems in medical informatics.
Many emerging applications require methods tailored towards high-speed data acquisition and filtering of streaming data followed by offline event reconstruction and analysis. In this case, the main objective is to relieve the immense pressure on the storage and communication resources within the experimental infrastructure. In other applications, ultra low latency real time analysis is required for autonomous experimental systems and anomaly detection in acquired scientific data in the absence of any prior data model for unknown events. At these data rates, traditional computing approaches cannot carry out even cursory analyses in a time frame necessary to guide experimentation. In this talk, Prof. Ogrenci will present some examples of AI hardware architectures. She will discuss the concept of co-design, which makes the unique needs of an application domain transparent to the hardware design process and present examples from three applications: (1) An in-pixel AI chip built using the HLS methodology; (2) A radiation hardened ASIC chip for quantum systems; (3) An FPGA-based edge computing controller for real-time control of a High Energy Physics experiment.
This lecture will present a review of the concept of autonomy for mobile field robots, identify the challenges to achieving a truly autonomous system, and suggest possible research directions. Intelligent robotic systems usually acquire knowledge of their functions and of the working environment at design and development time. This approach is not always efficient, especially in semi-structured, complex environments such as crop fields. A truly autonomous robotic system should develop skills that let it succeed in such environments without requiring a priori ontological knowledge of the work area or a predefined set of tasks or behaviors. The lecture will therefore present possible Artificial Intelligence-based strategies for refining the navigation capabilities of mobile robots, so that they can offer a level of autonomy high enough to execute all the tasks of a home-to-home mission.
Quantum computing has become a noteworthy topic in academia and industry. Multinational companies around the world have achieved impressive advances in all areas of quantum technology during the last two decades. These companies are trying to construct real quantum computers in order to exploit their theoretical advantages over today's classical computers in practical applications. However, building a full-scale quantum computer remains challenging because of its increased susceptibility to errors due to decoherence and other quantum noise. Therefore, quantum error correction (QEC) and fault-tolerance protocols will be essential for running quantum algorithms on large-scale quantum computers.
The overall effect of noise is modeled in terms of a set of Pauli operators and the identity acting on the physical qubits (bit flip, phase flip, and a combination of bit and phase flips). In addition to Pauli errors, there is another class of errors, leakage errors, which occur when a qubit leaves the defined computational subspace. As the locations of leakage errors are unknown, they can damage quantum computations even further. This talk will therefore briefly present quantum error models.
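A minimal numeric sketch of the single-qubit Pauli error model (illustrative only, not tied to any particular QEC code): the bit flip is the X operator, the phase flip is Z, and the combined bit-and-phase flip is Y.

```python
import numpy as np

# Single-qubit Pauli errors acting on computational basis states.
X = np.array([[0, 1], [1, 0]], dtype=complex)   # bit flip
Z = np.array([[1, 0], [0, -1]], dtype=complex)  # phase flip
Y = 1j * X @ Z                                  # combined bit + phase flip

ket0 = np.array([1, 0], dtype=complex)          # |0>
ket1 = np.array([0, 1], dtype=complex)          # |1>

print(X @ ket0)   # bit flip sends |0> to |1>
print(Z @ ket1)   # phase flip sends |1> to -|1>
```

Leakage errors, by contrast, take the state outside this 2-dimensional subspace entirely, which is why they cannot be written as a combination of I, X, Y, Z.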
Chatbots are a key element of the digital transformation of our society. They are everywhere: e-commerce, digital health, customer support, tourism... But if you have used one, it has probably disappointed you. I'll admit it: most existing chatbots are very bad. Building a chatbot that is genuinely useful and intelligent is far from easy. A chatbot combines all the complexity of software engineering with that of natural language processing. Bear in mind that many chatbots must be deployed on several channels (web, Telegram, Slack, ...) and often have to use external APIs and services, access internal databases, or integrate pretrained language models (e.g., toxicity detectors). And the problem is not only building the bot, but also testing it and evolving it. In this talk we will look at the biggest challenges we face when a development project includes a chatbot, and the techniques and strategies we can apply, depending on the needs of the project, to finally get a chatbot that knows what it is talking about.
Many HPC applications are massively parallel and can benefit from the spatial parallelism offered by reconfigurable logic. While modern memory technologies can offer high bandwidth, designers must craft advanced communication and memory architectures for efficient data movement and on-chip storage. Addressing these challenges requires combining compiler optimizations, high-level synthesis, and hardware design.
In this talk, I will present challenges, solutions, and trends for generating massively parallel accelerators on FPGA for high-performance computing. These architectures can provide performance comparable to software implementations on high-end processors, and much higher energy efficiency thanks to logic customization.
The main challenge of concurrent software verification has always been in achieving modularity, i.e., the ability to divide and conquer the correctness proofs with the goal of scaling the verification effort. Types are a formal method well-known for its ability to modularize programs, and in the case of dependent types, the ability to modularize and scale complex mathematical proofs.
In this talk I will present our recent work towards reconciling dependent types with shared-memory concurrency, with the goal of achieving modular proofs for the latter. Applying the type-theoretic paradigm to concurrency has led us to view separation logic as a type theory of state, and has motivated novel abstractions for expressing concurrency proofs based on the algebraic structure of a resource and on structure-preserving functions (i.e., morphisms) between resources.
Microarchitectural attacks, such as Spectre and Meltdown, are a class of security threats that affect almost all modern processors. These attacks exploit the side-effects resulting from processor optimizations to leak sensitive information and compromise a system's security.
Over the years, a large number of hardware and software mechanisms for preventing microarchitectural leaks have been proposed. Intuitively, more defensive mechanisms are less efficient, while more permissive mechanisms may offer more performance but require more defensive programming. Unfortunately, there are no hardware-software contracts that would turn this intuition into a basis for principled co-design.
In this talk, we present a framework for specifying hardware/software security contracts, an abstraction that captures a processor's security guarantees in a simple, mechanism-independent manner by specifying which program executions a microarchitectural attacker can distinguish.
The appearance of vulnerabilities due to missing security controls is one of the reasons behind the demand for new frameworks that produce secure software by default. This talk will address how to transform the software development process by giving security the importance it deserves from the start of the life cycle. To this end, a new development model is proposed, the Viewnext-UEx model, which incorporates security practices preventively and systematically in all phases of the software life-cycle process. The purpose of this new model is to detect vulnerabilities earlier by applying security from the earliest phases, while also optimizing the software construction processes. Results are presented for a preventive scenario, after applying the Viewnext-UEx model, versus the traditional reactive scenario of applying security only from the testing phase onwards.
This lecture will address issues of trust in computer systems and artificial intelligence, and attacks on these types of systems, with practical examples. Artificial intelligence has gained ground in several areas with different application scenarios, but from the perspective of this lecture the fundamental points of the discussion are: what should an artificial intelligence system do from a security perspective, and how does an intelligent system produce results on a given subject? Few people are really concerned about the behavior of these types of systems from a security point of view. If you like machine learning and security, I believe this lecture will show you interesting security problems in artificial intelligence systems.
The use of renewable energy is key to meeting the sustainable development goals of the 2030 Agenda. Among these energies, wind is the second most used thanks to its high efficiency. Some studies suggest that wind power will be the main source of generation by 2050. It is therefore worth continuing to research the application of advanced control techniques to these systems.
Among these advanced techniques, neural networks and reinforcement learning combined with classical control strategies stand out. These techniques have already been used successfully for modeling and controlling complex systems.
This lecture will present the application of neural networks and reinforcement learning to wind turbine control, focusing especially on pitch control. Different configurations of neural networks and other techniques applied to pitch control will be described. Finally, some hybrid techniques combining fuzzy logic, lookup tables, and neural networks will be proposed, with results showing their usefulness in improving the efficiency of wind turbines.
As the world's energy demand rises, so does the share of renewable energy, particularly wind energy, in the supply. The life cycle of wind farms, from manufacturing the components to the decommissioning stage, involves significant cost, and applications of AI and data analytics to reducing these costs are still limited. This talk will introduce the audience to some interesting applications of AI and data analytics in offshore wind, and highlight future challenges and opportunities. It should be useful for students, academics, and researchers who want to make their next career move into offshore wind but do not yet know where to start.
CW RADAR, FMCW RADAR, FMCW ALTIMETER, AND THEIR PARAMETERS
This material covers the CW radar, the FMCW radar, range measurement, the IF amplifier, and the FMCW altimeter. The CW radar operates using continuous-wave transmission, while the FMCW radar employs frequency-modulated continuous-wave technology. Range measurement is a crucial aspect of radar systems, providing information about the distance to a target. The IF amplifier plays a key role in signal processing, amplifying intermediate-frequency signals for further analysis. The FMCW altimeter utilizes frequency-modulated continuous-wave technology to accurately measure altitude above a reference point.
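To make FMCW range measurement concrete: the target range follows from the beat frequency and the chirp slope via R = c·f_b / (2·S), with S = B / T_sweep. The radar parameters below are illustrative assumptions, not values from this material.

```python
# Estimating target range from the FMCW beat frequency (illustrative sketch).
C = 3.0e8          # speed of light, m/s (approximate)
B = 200e6          # sweep bandwidth, Hz (assumed)
T_SWEEP = 1e-3     # sweep (chirp) duration, s (assumed)

def range_from_beat(f_beat_hz: float) -> float:
    """R = c * f_b / (2 * S), where S = B / T_sweep is the chirp slope."""
    slope = B / T_SWEEP              # Hz per second
    return C * f_beat_hz / (2 * slope)

# With these parameters, a 100 kHz beat corresponds to 75 m:
print(range_from_beat(100e3))
```

The same slope also sets the range resolution trade-off: a wider sweep bandwidth B separates closer targets.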
We have compiled the most important slides from each speaker's presentation. This year’s compilation, available for free, captures the key insights and contributions shared during the DfMAy 2024 conference.
The Internet of Things (IoT) is a revolutionary concept that connects everyday objects and devices to the internet, enabling them to communicate, collect, and exchange data. Imagine a world where your refrigerator notifies you when you’re running low on groceries, or streetlights adjust their brightness based on traffic patterns – that’s the power of IoT. In essence, IoT transforms ordinary objects into smart, interconnected devices, creating a network of endless possibilities.
Here is a blog on the role of electrical and electronics engineers in IoT. Let's dig in!
Welcome to WIPAC Monthly, the magazine brought to you by the LinkedIn group Water Industry Process Automation & Control.
In this month's edition, along with this month's industry news, and to celebrate the 13 years since the group was created, we have articles including:
A case study of the use of Advanced Process Control at the wastewater treatment works at Lleida in Spain
A look back at an article on smart wastewater networks, to see how the industry has measured up in the interim in its adoption of Digital Transformation in the Water Industry.
6th International Conference on Machine Learning & Applications (CMLA 2024)
The 6th International Conference on Machine Learning & Applications (CMLA 2024) will provide an excellent international forum for sharing knowledge and results on the theory, methodology, and applications of Machine Learning.
NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...
Power plants release a large amount of water vapor into the atmosphere through the stack. The flue gas can be a potential source for obtaining much-needed cooling water for a power plant. If a power plant could recover and reuse a portion of this moisture, it could reduce its total cooling-water intake requirement. One of the most practical ways to recover water from flue gas is to use a condensing heat exchanger. The power plant could also recover latent heat due to condensation, as well as sensible heat due to lowering the flue gas exit temperature. Additionally, harmful acids released from the stack can be reduced in a condensing heat exchanger by acid condensation.
Condensation of vapors in flue gas is a complicated phenomenon, since heat and mass transfer of water vapor and various acids occur simultaneously in the presence of non-condensable gases such as nitrogen and oxygen. The design of a condenser depends on knowledge and understanding of the heat and mass transfer processes. A computer program for numerical simulations of water (H2O) and sulfuric acid (H2SO4) condensation in a flue gas condensing heat exchanger was developed using MATLAB. Governing equations based on mass and energy balances for the system were derived to predict variables such as flue gas exit temperature, cooling water outlet temperature, and the mole fractions and condensation rates of water and sulfuric acid vapors. The equations were solved using an iterative solution technique with calculations of heat and mass transfer coefficients and physical properties.
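To make the iterative energy-balance idea concrete, here is a much-simplified Python sketch (not the authors' MATLAB program): sensible heat only, no condensation, counterflow geometry, with all parameter values invented for illustration. It solves the duty Q satisfying Q = U·A·LMTD by bisection.

```python
import math

# Illustrative counterflow heat-exchanger balance (assumed values throughout).
M_G, CP_G = 10.0, 1100.0        # flue gas: kg/s, J/(kg K)
M_W, CP_W = 20.0, 4186.0        # cooling water
T_G_IN, T_W_IN = 150.0, 20.0    # inlet temperatures, deg C
UA = 50_000.0                   # overall conductance U*A, W/K

def residual(Q):
    """Mismatch between an assumed duty Q and U*A*LMTD at that duty."""
    T_g_out = T_G_IN - Q / (M_G * CP_G)     # gas-side energy balance
    T_w_out = T_W_IN + Q / (M_W * CP_W)     # water-side energy balance
    dT1, dT2 = T_G_IN - T_w_out, T_g_out - T_W_IN
    lmtd = dT1 if abs(dT1 - dT2) < 1e-9 else (dT1 - dT2) / math.log(dT1 / dT2)
    return Q - UA * lmtd

# Bisection between zero duty and (almost) the thermodynamic maximum.
lo = 0.0
hi = min(M_G * CP_G, M_W * CP_W) * (T_G_IN - T_W_IN) * 0.999
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if residual(mid) > 0:
        hi = mid
    else:
        lo = mid

Q = 0.5 * (lo + hi)
print(f"duty = {Q/1e6:.2f} MW, gas exit = {T_G_IN - Q/(M_G*CP_G):.1f} C")
```

The real simulation couples such balances with mass transfer of H2O and H2SO4 and with property calculations inside the same iteration loop, which is what makes the problem hard.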
Water billing management system project report.pdf
Our project, entitled "Water Billing Management System", aims to generate water bills with all charges and penalties. The manual system currently employed is extremely laborious and quite inadequate; it only makes the process more difficult.
The aim of our project is to develop a system that partially computerizes the work performed in the Water Board, such as generating the monthly water bill, keeping a record of the units of water consumed, and storing customer records and previous unpaid records.
We used HTML/PHP as the front end and MySQL as the back end. The application is built as a set of web forms that make up the user interface, with application code behind the forms and the objects on them, such as buttons and text boxes, plus any required support code in additional modules.
MySQL is a free, open-source database that facilitates effective database management by connecting databases to the software. It is a stable, reliable, and powerful solution with advanced features and advantages such as data security.
Hybrid optimization of pumped hydro system and solar - Engr. Abdul-Azeez.pdf
Advancements in technology unveil a myriad of electrical and electronic breakthroughs geared towards efficiently harnessing limited resources to meet human energy demands. The optimization of hybrid solar PV panels and pumped hydro energy supply systems plays a pivotal role in utilizing natural resources effectively. This initiative not only benefits humanity but also fosters environmental sustainability. The study investigated the design optimization of these hybrid systems, focusing on understanding solar radiation patterns, identifying geographical influences on solar radiation, formulating a mathematical model for system optimization, and determining the optimal configuration of PV panels and pumped hydro storage. Through a comparative analysis approach and eight weeks of data collection, the study addressed key research questions related to solar radiation patterns and optimal system design. The findings highlighted regions with heightened solar radiation levels, showcasing substantial potential for power generation and emphasizing the system's efficiency. Optimizing system design significantly boosted power generation, promoted renewable energy utilization, and enhanced energy storage capacity. The study underscored the benefits of optimizing hybrid solar PV panels and pumped hydro energy supply systems for sustainable energy usage. Optimizing the design of solar PV panels and pumped hydro energy supply systems as examined across diverse climatic conditions in a developing country, not only enhances power generation but also improves the integration of renewable energy sources and boosts energy storage capacities, particularly beneficial for less economically prosperous regions. Additionally, the study provides valuable insights for advancing energy research in economically viable areas. 
Recommendations included conducting site-specific assessments, utilizing advanced modeling tools, implementing regular maintenance protocols, and enhancing communication among system components.
Proteomics, Computational Immunology and Machine Learning - Bioinformatics Research at FH Upper Austria, Hagenberg
1. 2018
Research Group HEAL –
Heuristics and Evolutionary Algorithms Laboratory
Contact:
Heuristic and Evolutionary
Algorithms Laboratory (HEAL)
FH OÖ University of Applied
Sciences Upper Austria
Softwarepark 11
A-4232 Hagenberg
WWW:
https://heal.heuristiclab.com
https://dev.heuristiclab.com
2. University of Applied Sciences
Upper Austria
Largest University of Applied Sciences in Austria
• 4 schools
• 62 study programs
• 5,888 students
• 17.34 m€ R&D turnover (2016)
Research Group HEAL - Heuristics and Evolutionary Algorithms Laboratory
3. Research Group
Heuristic and Evolutionary Algorithms
Laboratory (HEAL)
Research Group
• since 2005 at FH OÖ
• 5 professors
• 12 research associates
• various interns, bachelor & master students
Research
• > 200 peer-reviewed publications
• 8 dissertations
• > 60 master and bachelor theses
Industry partners (excerpt)
Scientific partners
4. Metaheuristics
Metaheuristics
• intelligent search strategies
• can be applied to different problems
• explore interesting regions of the search space (parameter)
• tradeoff: computation vs. quality
good solutions for very complex problems
• must be tuned to applications
Challenges
• choice of appropriate metaheuristics
• hybridization
Finding needles in haystacks
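As a toy illustration of the metaheuristic idea above (intelligent search with a computation-vs-quality tradeoff), here is a minimal simulated annealing loop on a one-dimensional problem. Everything in it is illustrative and far simpler than HeuristicLab's implementations.

```python
import math
import random

# Toy simulated annealing run: minimize f(x) = x^2 over the integers.
random.seed(42)

def f(x):
    return x * x

x = 50                          # arbitrary starting solution
temperature = 100.0
while temperature > 1e-3:
    candidate = x + random.choice([-1, 1])        # neighborhood move
    delta = f(candidate) - f(x)
    # Always accept improvements; accept worsening moves with
    # Boltzmann probability exp(-delta / temperature).
    if delta < 0 or random.random() < math.exp(-delta / temperature):
        x = candidate
    temperature *= 0.99                           # geometric cooling

print(x)  # ends at or very near the optimum x = 0
```

The cooling schedule is the computation/quality knob: slower cooling explores more of the search space but costs more evaluations.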
5. Research Focus
Research topics (tag cloud): Production Planning and Logistics Optimization; metaheuristics (ES, ACO, SA, PSO, SEGA, GATS, SASEGASA, GP); Machine Learning (Neural Networks, Statistics); Operations Research; Modeling and Simulation; Structure Identification; Data Mining (Regression, Time-Series, Classification)
7. HeuristicLab
Open Source Optimization Environment HeuristicLab
• developed since 2002
• basis of many research projects and publications
• 2nd place at Microsoft Innovation Award 2009
• HeuristicLab 3.3.x since May 2010 under GNU GPL
Motivation and Goals
• graphical user interface for interactive development, analysis and
application of optimization methods
• numerous optimization algorithms and optimization problems
• support for extensive experiments and analysis
• distribution through parallel execution of algorithms
• extensibility and flexibility (plug-in architecture)
Distributed Computing with HeuristicLab Hive
• framework for distribution and parallel execution of HeuristicLab
algorithms
• compute resources at Campus Hagenberg
2006 – 2011: research cluster 1 (14 cores)
since 2009: research cluster 2 (112 cores, 448GB RAM)
since 2011: lab computers (100 PCs, on demand in the night)
since 2017: research cluster 3 (448 cores, 4TB RAM)
8. Available Algorithms
Population-based
• CMA-ES
• Evolution Strategy
• Genetic Algorithm
• Offspring Selection Genetic Algorithm (OSGA)
• Island Genetic Algorithm
• Island Offspring Selection Genetic Algorithm
• Parameter-less Population Pyramid (P3)
• SASEGASA
• Relevant Alleles Preserving GA (RAPGA)
• Age-Layered Population Structure (ALPS)
• Genetic Programming
• NSGA-II
• Scatter Search
• Particle Swarm Optimization
Trajectory-based
• Local Search
• Tabu Search
• Robust Taboo Search
• Variable Neighborhood Search
• Simulated Annealing
Data Analysis
• Linear Discriminant Analysis
• Linear Regression
• Multinomial Logit Classification
• k-Nearest Neighbor
• k-Means
• Neighborhood Component Analysis
• Artificial Neural Networks
• Random Forests
• Support Vector Machines
• Gaussian Processes
• Gradient Boosted Trees
• Gradient Boosted Regression
Additional Algorithms
• User-defined Algorithm
• Performance Benchmarks
• Hungarian Algorithm
• Cross Validation
• LM-BFGS
9. Available Problems
Combinatorial Problems
• Traveling Salesman
• Probabilistic Traveling Salesman
• Vehicle Routing
• Knapsack
• Bin Packing
• NK[P,Q]
• Job Shop Scheduling
• Linear Assignment
• Quadratic Assignment
• OneMax
• Orienteering
• Deceptive Trap
• Deceptive Trap Step
• HIFF
Genetic Programming Problems
• Test Problems (Even Parity, MUX)
• Symbolic Classification
• Symbolic Regression
• Symbolic Time-Series Prognosis
• Artificial Ant
• Lawn Mower
• Robocode
• Grammatical Evolution
Additional Problems
• Single-/Multi-Objective Test Function
• User-defined Problem
• Programmable Problem
• External Evaluation Problem
(Anylogic, Scilab, MATLAB)
• Regression, Classification, Clustering
• Trading
11. How to get HeuristicLab?
Download binaries
• deployed as ZIP archives
• latest stable version 3.3.14 "Denver"
released on July 24th, 2016
• daily trunk builds
• https://dev.heuristiclab.com/download
Check out sources
• SVN repository
• HeuristicLab 3.3.14 tag
https://src.heuristiclab.com/svn/core/tags/3.3.14
• Stable development version
https://src.heuristiclab.com/svn/core/stable
License
• GNU General Public License (Version 3)
System requirements
• Microsoft .NET Framework 4.5
• enough RAM and CPU power ;-)
13. Some Additional Features
HeuristicLab Hive
• parallel and distributed execution of algorithms
and experiments on many computers in a network
Optimization Knowledge Base (OKB)
• database to store algorithms, problems, parameters and results
• open to the public
• open for other frameworks
• analyze and store characteristics of problem instances and problem classes
External solution evaluation and simulation-based optimization
• interface to couple HeuristicLab with other applications
(MATLAB, Simulink, SciLab, AnyLogic, …)
• supports different protocols (command line parameters, TCP, …)
Parameter grid tests and meta-optimization
• automatically create experiments to test large ranges of parameters
• apply heuristic optimization algorithms to find optimal parameter settings for heuristic optimization
algorithms
14. Research Focus
15. Data Based Modeling: Starting Point
Goal: Mathematical models that describe system behavior
System = engine, human body, financial data, etc.
Examples: analysis of steel production processes; medical data analysis
17. Data Based Modeling: Process
18. Data Based Modeling: Results
Results:
Improved process understanding
for steel production
Virtual tumor markers for cancer
diagnosis
MeltingRate(t) = f(x3(t-2), x1(t-3), …)
C125(p780641) = f(chol, GGT, x58, …)
19. Data Based Modeling: Methods
Modeling Methods
• linear regression, random forests
• support vector machines, neural networks
• nearest neighbor, k-means
Genetic Programming
• implicit feature selection
• optimizes model structure and parameters
• generates interpretable formulas
• results directly applicable
• assessment of variable relevance
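As an illustration of the "interpretable formulas" point: symbolic regression searches the space of expression trees rather than fitting a fixed model structure. Real GP evolves trees with selection, crossover and mutation; this toy sketch uses plain random search over trees, and the data and all names are invented:

```python
import random

# Target data generated from a hidden formula; the search should recover its structure.
data = [(x, 2.0 * x * x + 3.0) for x in [i / 10 for i in range(-20, 21)]]

OPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b, "-": lambda a, b: a - b}

def random_tree(rng, depth=3):
    """Grow a random expression tree over the input x and constants."""
    if depth == 0 or rng.random() < 0.3:
        return ("x",) if rng.random() < 0.5 else ("const", rng.uniform(-5, 5))
    op = rng.choice(list(OPS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, x):
    if tree[0] == "x":
        return x
    if tree[0] == "const":
        return tree[1]
    return OPS[tree[0]](evaluate(tree[1], x), evaluate(tree[2], x))

def error(tree):
    return sum((evaluate(tree, x) - y) ** 2 for x, y in data)

def to_formula(tree):
    """The advantage over black-box models: the result is a readable formula."""
    if tree[0] == "x":
        return "x"
    if tree[0] == "const":
        return f"{tree[1]:.2f}"
    return f"({to_formula(tree[1])} {tree[0]} {to_formula(tree[2])})"

rng = random.Random(0)
best = min((random_tree(rng) for _ in range(5000)), key=error)
print(to_formula(best), "error:", round(error(best), 2))
```

Because the model is a tree of named inputs, feature selection is implicit: variables that do not appear in the best formula are simply irrelevant.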
20. Black-Box vs. White-Box Modeling
Instead of black-box models (ANN, SVM, etc.), the goal is the identification of the model structure, i.e. white-box models (symbolic regression/classification with Genetic Programming).
[Diagram: black-box model vs. white-box model]
24. Visual Model Exploration
[Plot: candidate models compared by training quality vs. validation quality and by model size vs. model height; one model is selected]
25. Example: Virtual Sensors for Modeling Exhaust Gases
Motivation
• high quality modeling of emissions (NOx and soot) of a diesel engine
• virtual sensors: (mathematical) models that mimic the behavior of physical sensors
• advantages: low cost and non-intrusive
• identify variable impacts: injected fuel, engine frequency, manifold air pressure, concentration of O2 in the exhaust, etc.
NOx(t) = f(x1(t-7), x2(t-2), …)
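A virtual sensor of the form NOx(t) = f(x1(t-7), x2(t-2), …) is a model over time-lagged inputs. A minimal sketch with synthetic data (the lags follow the formula above, but the data and the simple linear model are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
T = 500
x1 = rng.normal(size=T)  # e.g. injected fuel (invented data)
x2 = rng.normal(size=T)  # e.g. manifold air pressure (invented data)
# Synthetic sensor signal depending on lagged inputs (lags 7 and 2 as in the formula).
nox = 1.5 * np.roll(x1, 7) + 0.8 * np.roll(x2, 2) + 0.05 * rng.normal(size=T)

# Build a design matrix of lagged features and fit a linear virtual sensor by least squares.
lag_max = 7
X = np.column_stack([x1[:T - 7],        # x1(t-7)
                     x2[5:T - 2],       # x2(t-2)
                     np.ones(T - lag_max)])
y = nox[lag_max:]
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print("estimated coefficients:", coef.round(2))
```

The fit recovers the lag coefficients from data alone, which is the essence of a virtual sensor: once the lagged-input model is identified, the physical sensor can be replaced by the formula.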
26. Example: Blast Furnace Modeling
Innovations
• results as formulas: domain experts can analyze, simplify and refine the models
• integration of prior physical knowledge into the modeling process
• powerful data analysis tools: model simplification and variable impact analysis
[Diagram: model f(x) producing a prognosis]
27. Example: Foam Quality of Firefighting Vehicles
Goals
• detect relevant impact factors and potential relationships between foam parameters with respect to throw range and foam quality
• model throw range and foam quality
• configure extinguishing systems for optimal throw range and foam quality
28. Example: Plasma Nitriding Modeling
Motivation
• hardening of materials (e.g. transmission parts)
• process parameter settings based on expert knowledge
Modeling Scenarios
a) prediction of quality values based on process parameters and material composition
b) propose process parameter settings to reach the desired material characteristics
29. Example: Modeling Tribological Systems
Goals
• virtual design and prototyping
• support design & dimensioning process
Methods
• model friction coefficient, wear, noise & vibration
• integrate domain knowledge
• reuse and formalize existing test data
[Diagram: tribological system; virtual design → testing → final product]
30. Example: Medical Diagnosis
Motivation
• research goal: identification of mathematical models for cancer diagnosis
• tumor markers: substances found in humans (especially blood and/or body tissues) that can be used as indicators for certain types of cancer
Data
• medical database compiled at the central laboratory of the General Hospital Linz, Austria, in the years 2005–2008
• total: blood values and cancer diagnoses for 20,819 patients
Modeling Scenarios
• model virtual tumor markers using normal blood data
• develop cancer diagnosis models using normal blood data
• develop cancer diagnosis models using normal blood data and (virtual) tumor markers
[Diagram: effects seen in data (blood examinations, tumor markers) feed into cancer diagnosis estimation]
31. Integration of Expert Knowledge
Model Analysis
Knowledge Integration
• specification of known correlations
• model extension through algorithm
[Diagram: expression tree over inputs X1–X5 (operators *, +, constant k) extended by the algorithm; unknown factors marked with "?"]
32. Holistic Knowledge Discovery
Variable interaction networks
• reveal non-linear correlations
Variable frequencies
• analyzed during the algorithm run
33. Model Networks for Structuring Systems
34. Detection of Regime Shifts
Real-time Analytics (Streaming Data)
Analysis of Variable Frequencies
• clearly shows that the algorithm is able to detect which variables are relevant
35. Smart Factory Lab
„Smart Factory Lab EFRE/IWB 2014–2020“ – Technology Lab for Intelligent Production along the Product Lifecycle
Project Locations: Hagenberg, Steyr, Wels
Time Frame: 01.01.2016 – 31.12.2021
Research Focus in the
Industry 4.0 Field
Infrastructure Investments
• high performance compute cluster
• VR/AR technology
• laser cladding and milling system
36. Preemptive Maintenance
Real-Time Identification of Maintenance Needs with Machine Learning
• prevention of production downtimes, early detection of quality problems, scrap prognosis
Time Series Analysis / Data Stream Analysis
• consecutive data from sensors for monitoring production facilities
Sliding Window Regression
• online and real-time capabilities
• ensembles for state detection
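A minimal sketch of sliding window regression on a data stream (the signal is invented): a least-squares line is refit on the most recent window as each sample arrives, and the windowed slope reveals a regime shift in real time.

```python
from collections import deque

def fit_line(points):
    """Ordinary least squares for y = a*x + b on a small window of (x, y) pairs."""
    n = len(points)
    sx = sum(x for x, _ in points); sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points); sxy = sum(x * y for x, y in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return a, (sy - a * sx) / n

def sliding_window_slopes(stream, window=20):
    """Yield the locally fitted slope as each sample arrives (online, constant memory)."""
    buf = deque(maxlen=window)
    for t, y in enumerate(stream):
        buf.append((t, y))
        if len(buf) == window:
            yield t, fit_line(list(buf))[0]

# A signal whose trend changes at t = 100: the windowed slope tracks the regime shift.
signal = [0.5 * t for t in range(100)] + [50 - 1.0 * (t - 100) for t in range(100, 200)]
slopes = dict(sliding_window_slopes(signal))
print("slope before shift:", slopes[99], " slope after shift:", slopes[199])
```

An ensemble for state detection, as in the last bullet, would run several such windowed models in parallel and vote on whether the monitored process has left its normal regime.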
37. Research Focus
[Diagram: research focus areas, as on slide 14]
38. Interrelated Production and Logistics Processes
39. Steel Production Processes
[Diagram: casting → stacking → transport → storage → transport → rolling]
40. Integrated Modeling, Simulation & Optimization
[Diagram of the Digital Factory: modeling / simulation / optimization driven by scenario parameters (customer orders, production orders, resource availabilities, layout, transport distances, etc.) and operational data (databases, information systems, cloud, etc.); outputs are key performance indicators (utilization, costs, lead times) and suggested operational actions (stacking, batching, warehouse admission, transport, sequences) for the worker and the production planner]
42. Modeling of Optimization Networks
Academic Example: Knapsack + TSP
[Diagram: optimization network coupling a GA for the knapsack problem (KSPNode) with a GA for the TSP (TSPNode) via a KSPTSPConnector; exchanged parameters include Cities, Coordinates, KnapsackCapacity, Values, Weights and TransportCostFactor, and exchanged results include KnapsackSolution, Route, TransportCosts and Quality]
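The coupling in this academic example can be illustrated by a joint evaluation function: the knapsack selection determines which cities a tour must visit, and the transport cost of that tour is subtracted from the knapsack profit. A toy sketch with invented data, using brute force in place of the two GAs for brevity:

```python
import itertools

# Toy coupled instance: item i is picked up in city i (all data invented).
values = [10, 7, 5, 9]
weights = [3, 2, 4, 3]
capacity = 6
coords = [(0, 0), (1, 5), (4, 1), (3, 3)]
depot = (0, 0)  # hypothetical depot at the origin
transport_cost_factor = 1.0

def tour_length(cities):
    """Shortest round trip through the selected cities (brute force on the toy instance)."""
    if not cities:
        return 0.0
    best = float("inf")
    for perm in itertools.permutations(cities):
        pts = [depot] + list(perm) + [depot]
        d = sum(((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
                for a, b in zip(pts, pts[1:]))
        best = min(best, d)
    return best

def joint_quality(selection):
    """Knapsack profit minus the transport cost of visiting the selected cities."""
    if sum(w for w, s in zip(weights, selection) if s) > capacity:
        return float("-inf")  # infeasible selection
    profit = sum(v for v, s in zip(values, selection) if s)
    cities = [coords[i] for i, s in enumerate(selection) if s]
    return profit - transport_cost_factor * tour_length(cities)

best = max(itertools.product([0, 1], repeat=4), key=joint_quality)
print("best selection:", best, "quality:", round(joint_quality(best), 2))
```

In the network above, this coupling is what the connector realizes: each algorithm optimizes its own subproblem while the shared quantities (selection, route, transport cost) flow between the nodes.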
43. Optimization Networks in HeuristicLab
44. Interaction with Simulation Software
Configurable and extensible data exchange using protocol buffers
45. Integration into Simulation Process
• complex decisions may need to be made inside a simulation, e.g. which customers should be served by which truck in what tour?
• HeuristicLab can be used to optimize these decisions: existing problem models (e.g. VRP) can be parameterized and solved, and new problem models can be implemented and added
[Diagram: the simulation hands a complex problem to HeuristicLab and receives an optimized decision]
46. Interaction with Simulation Software
Google Protocol Buffers
• originally developed at Google but has been released to the public
• reference implementations still maintained by Google
• used to describe messages that are exchanged between HeuristicLab and simulation software
• very small and fast serialization format
• protocol buffers can be extended and customized
Source: http://www.servicestack.net/benchmarks/NorthwindDatabaseRowsSerialization.100000-times.2010-08-17.html
Protobuf is among the fastest and smallest serializers for many programming languages
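A message exchanged between optimizer and simulation might be described like this (a hypothetical schema for illustration; the field names are invented and this is not the actual HeuristicLab external-evaluation protocol):

```proto
// Hypothetical message pair for exchanging candidate solutions and qualities.
// Field names are invented for illustration, not the actual HeuristicLab schema.
syntax = "proto2";

message SolutionMessage {
  repeated double RealVars = 1;  // candidate parameters proposed by the optimizer
  repeated int32  IntVars  = 2;
}

message QualityMessage {
  required double Quality = 1;   // objective value computed by the simulation
}
```

Because protobuf schemas are language-neutral, the simulation side can be implemented in any ecosystem with a protobuf compiler while HeuristicLab stays unchanged.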
47. HeuristicLab PPOVCockpit
Production Planning Optimization & Visualization Cockpit
• software frontend for applying optimization and simulation methods on company data
[Diagram: data import from the enterprise DB → scenario definition → optimization (+ CPLEX) and simulation (Sim#) → visualization]
49. Example: Material Flow and Layout Optimization
Goal
• the layout of an automotive production partner should be optimized
• material flows are to be simulated and used to rearrange workcenters
Simulation Model
• the simulation runs with the historical data of the production plant
• production flows are identified as either sequential, parallel or material demands that are satisfied from different sources
Optimization
• a rearrangement of workcenters shows significant potential in reducing transportation requirements
• the company is currently building a new facility and adapting the layout of the existing plant
50. Example: Steel Slab Logistics
Goal
• the transportation of steel slabs should be improved and a better utilization of the capacities achieved
Simulation Model
• straddle carriers transport steel slabs between the continuous caster, the designated storage areas, processing facilities and the rolling mill
Optimization
• determine which straddle carrier picks up which slabs in which order
• a 4% improvement in lead time in dispatching of the straddle carriers was identified
• major bottlenecks in the process (e.g. stacking of slabs) have been identified and led to subsequent projects
51. Example: Fork Lift Routing
Goal
• dispatching of fork lifts to serve the assembly of trucks should be improved
• interaction with warehouse operations should be investigated to determine interrelated effects
Simulation Model
• fork lifts transport materials to the assembly stations
• finished assembly parts have to be transported back
Optimization
• results indicate that a combined view of picking and in-house transport is able to reduce makespan more than optimizing storage and transport independently
52. Example: Warehouse Operations
Goal
• automate the decision of where to place items in a high rack storage
• simulate the effects of the automation
Simulation Model
• the high rack contains the assigned items
• what is the effect on the picking process?
Optimization
• optimize the assignment of items to storage locations
• several constraints are considered, such as storage box capacity, different location capacities, aspects of certain parts (FIFO) and more
• a 10% reduction of the travel distances could be achieved if the whole high rack were reorganized
• reduction of 10 km of travel distance in the first month of the automation
53. Example: Setup Cost Minimization
Motivation
• shift from mass production to customized products in small lot sizes
• more frequent setups to prepare machines or tooling
• 40%–70% spent on setup in some manufacturing environments
• in many cases: constantly changing product portfolio (especially for toll manufacturers)
Setup Costs
• setup times are a crucial factor in many branches of industry
• setup costs are frequently sequence-dependent
Goal
• no measurements, but approximation of setup costs
• optimization of total setup costs for production planning
• evaluation of scheduling vs. dispatching rules
54. Example: Product Mixture Simulation and Optimization
Goal
• for creating a product with a stable quality, multiple raw materials stacked in several containers must be mixed together
• the mixing decision is complex and has to take into account product restrictions
• an existing optimization only considers the bottom raw material in the container
Simulation Model
• given a certain mixing plan, the simulation model calculates the properties of future products
• trucks bring in new raw material
Optimization
• the optimization finds mixtures that result in stable product qualities over time and reduces fluctuations
55. Example: Scheduling vs. Dispatching in Production Planning
Motivation
• challenges in the optimization of a real-world production plant: data quality, problem size, constraints
• scheduling: long-term planning, „big picture“ (global optimization)
• dispatching: rule-based, local strategies, real-time capable, for volatile environments
Variants
• simple dispatching rules: FIFO, EarliestDueDate, NrOfOperations, etc.
• rule per group/machine:
• complex dispatching rule:
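The listed simple dispatching rules can be compared on a toy single-machine instance (jobs, durations and due dates invented for illustration):

```python
# Each job: (id, processing_time, due_date, num_operations) -- invented toy data.
jobs = [("A", 4, 10, 3), ("B", 2, 5, 1), ("C", 6, 8, 2), ("D", 3, 12, 4)]

RULES = {
    "FIFO": lambda j: jobs.index(j),    # arrival order
    "EarliestDueDate": lambda j: j[2],  # most urgent first
    "NrOfOperations": lambda j: -j[3],  # most remaining operations first
}

def dispatch(rule):
    """Sequence jobs on one machine by the rule; report the order and total tardiness."""
    order = sorted(jobs, key=RULES[rule])
    t, tardiness = 0, 0
    for _, p, due, _ in order:
        t += p
        tardiness += max(0, t - due)
    return [j[0] for j in order], tardiness

for name in RULES:
    print(name, dispatch(name))
```

Even on this tiny instance the rules produce different sequences and tardiness values, which is why the slide evaluates them via simulation rather than picking one a priori.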
Evaluation via Simulation
• robustness, stochastic variability, additional key figures
56. Example: Logistics Network Design
BioBoost – EU FP7 Project
• agricultural waste should be converted to BioFuel
• however, waste has little energy density and is inefficient to transport
• the project develops plants to compress waste into intermediate and final products
• question: what would an efficient logistics network to support this process look like, and what would it cost to build and run?
Simulation Model
• regions supply certain types of waste to different degrees
• regions have transportation infrastructure that influences cost and speed
• a high demand influences the price of waste
Optimization
• determine which regions get a converter
• determine which region contributes which type of waste to which degree to which converter
57. Example: Dial-a-Ride Transportation Model
Goal
• what would a more efficient public transport system look like that is based on dynamic routes and small buses?
• a control strategy should be identified which performs bus dispatching
Simulation Model
• customers appear at bus stops and demand a ride to their destination
• buses frequent the stops and pick up as well as drop off the customers
• the control strategy decides where the bus heads next and which customers to pick up
Optimization
• the control strategy consists of several parameters that need to be optimized
• a reduced lead time is the main goal, such that customers do not wait too long
58. Example: Vendor Managed Inventory
Goal
• build a tool to evaluate and compare the cases of order-based and vendor-managed inventory
• determine the expected improvements and potential return on investment
Simulation Model
• supermarkets have several thousand product groups in stock
• customers have demands and consume the stock
• the central warehouse needs to continuously restock the supermarkets
Optimization
• determine which supermarkets will be restocked with which truck in what tour on which day with which products
• compare this case to an order-based case where only tours need to be determined
• the demands have been smoothed over the week and peak demands could be avoided
• consideration of mixed scenarios where only some markets adopted VMI
59. Contact
FH-Prof. Priv.-Doz. DI Dr.
Michael Affenzeller
Phone: +43 5 0804 22031
E-Mail: michael.affenzeller@fh-hagenberg.at
FH-Prof. DI Dr.
Stephan M. Winkler
Phone: +43 5 0804 22030
E-Mail: stephan.winkler@fh-hagenberg.at
Heuristic and Evolutionary Algorithms Laboratory (HEAL)
School of Informatics, Communication and Media
FH OÖ University of Applied Sciences Upper Austria
Softwarepark 11, 4232 Hagenberg, Austria
HEAL: https://heal.heuristiclab.com
HeuristicLab: https://dev.heuristiclab.com
60. Specific Strengths and USPs
Own Framework, Own Algorithms and Methods
• long-term experience in algorithm development, analysis and application
• tailored solutions
Well-Known and Reputable in the Scientific Community
• first invitation of a European group to GPTP
• organization of the annual GECCO workshop since 2012
• other workshop and conference organizations (APCase, I3M, EuroCAST, LINDI)
• excerpts from reviews of a CRC Press book proposal about GA & GP:
Reviewer 1: “I know most of the authors and have very high opinion about their professionalism. They are from one of the LEADING groups in the field.”
Reviewer 2: “Affenzeller’s group is probably the best academic organization with respect to symbolic regression and industrial applications.”
• numerous publications in journals, books and conference papers
http://dev.heuristiclab.com
61. Specific Strengths and USPs
Own Framework, Own Algorithms and Methods
• long-term experience in algorithm development, analysis and application
• tailored solutions
Long-Term Cooperation with Leading Local Industrial Partners such as voestalpine, Rosenbauer, MIBA, AVL, Rübig
• first Josef Ressel Centre of Excellence in Austria
• first COMET project at the Hagenberg campus
• close personal and individual cooperation
http://dev.heuristiclab.com