https://arxiv.org/abs/1606.06543
Finding optimal configurations for Stream Processing Systems (SPS) is a challenging problem due to the large number of parameters that can influence their performance and the lack of analytical models to anticipate the effect of a change. To tackle this issue, we consider tuning methods where an experimenter is given a limited budget of experiments and needs to carefully allocate this budget to find optimal configurations. In this setting, we propose Bayesian Optimization for Configuration Optimization (BO4CO), an auto-tuning algorithm that leverages Gaussian Processes (GPs) to iteratively capture posterior distributions of the configuration spaces and sequentially drive the experimentation. Validation based on Apache Storm demonstrates that our approach locates optimal configurations within a limited experimental budget, typically improving SPS performance by at least an order of magnitude over existing configuration algorithms.
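To make the loop concrete, here is a toy, self-contained sketch of GP-based sequential configuration tuning in the spirit of BO4CO. This is not the authors' implementation: the one-dimensional latency function, the kernel length-scale, the lower-confidence-bound acquisition rule, and the budget are all invented for illustration.

```python
import math

def rbf(a, b, ls=1.0):
    """Squared-exponential kernel on a 1-D configuration parameter."""
    return math.exp(-((a - b) ** 2) / (2 * ls * ls))

def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def gp_posterior(X, y, x, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at x."""
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    k = [rbf(x, a) for a in X]
    mu = sum(ki * ai for ki, ai in zip(k, solve(K, y)))
    var = max(1e-12, 1.0 - sum(ki * bi for ki, bi in zip(k, solve(K, k))))
    return mu, math.sqrt(var)

def latency(x):
    """Stand-in for one costly experiment on the real stream-processing system."""
    return (x - 3.0) ** 2

candidates = [i * 0.5 for i in range(21)]          # configuration space 0.0 .. 10.0
X, y = [0.0, 10.0], [latency(0.0), latency(10.0)]  # small initial design
for _ in range(6):                                 # remaining experimental budget
    def lcb(x):                                    # lower confidence bound mu - 2*sigma
        mu, sd = gp_posterior(X, y, x)
        return mu - 2.0 * sd
    nxt = min(candidates, key=lcb)                 # most promising next experiment
    X.append(nxt)
    y.append(latency(nxt))

best_x, best_y = min(zip(X, y), key=lambda p: p[1])
```

With this acquisition rule the loop tends to spend its remaining budget near the low-latency region around x = 3 instead of sampling the space uniformly, which is the point of the sequential design.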
The document discusses using machine learning techniques like Gaussian processes (GPs) to optimize the configuration of software systems. It notes that software performance landscapes are often complex, with non-linear interactions between parameters and non-convex response surfaces. Measurements are also subject to noise. The document introduces an approach called TL4CO that uses multi-task Gaussian processes to model software performance across different versions/deployments, allowing it to leverage data from other versions to improve optimization. This helps address challenges in DevOps where new versions are continuously delivered.
Transfer Learning for Improving Model Predictions in Highly Configurable Soft... – Pooyan Jamshidi
Modern software systems are increasingly built to operate in dynamic environments, using configuration capabilities to adapt to changes and external uncertainties. In a self-adaptation context, we are often interested in reasoning about the performance of the systems under different configurations. Usually, we learn a black-box model based on real measurements to predict the performance of the system given a specific configuration. However, as modern systems become more complex, there are many configuration parameters that may interact and, therefore, we end up learning an exponentially large configuration space. Naturally, this does not scale when relying on real measurements in the actual changing environment. We propose a different solution: instead of taking the measurements from the real system, we learn the model using samples from other sources, such as simulators that approximate the performance of the real system at low cost.
Transfer Learning for Improving Model Predictions in Robotic Systems – Pooyan Jamshidi
Modern software systems are increasingly built to operate in dynamic environments, using configuration capabilities to adapt to changes and external uncertainties. In a self-adaptation context, we are often interested in reasoning about the performance of the systems under different configurations. Usually, we learn a black-box model based on real measurements to predict the performance of the system given a specific configuration. However, as modern systems become more complex, there are many configuration parameters that may interact and, therefore, we end up learning an exponentially large configuration space. Naturally, this does not scale when relying on real measurements in the actual changing environment. We propose a different solution: instead of taking the measurements from the real system, we learn the model using samples from other sources, such as simulators that approximate the performance of the real system at low cost.
Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Ar... – Pooyan Jamshidi
(1) The document discusses challenges in managing elasticity in cloud architectures due to unpredictable demand and uncertainty in measurements. (2) It proposes a fuzzy self-learning controller called RobusT2Scale that uses type-2 fuzzy logic to qualitatively specify thresholds and make robust scaling decisions despite uncertainty. (3) Experimental results show that RobusT2Scale is able to guarantee service level agreements while avoiding over- and under-provisioning of resources compared to other approaches.
Transfer Learning for Software Performance Analysis: An Exploratory Analysis – Pooyan Jamshidi
The document discusses transfer learning for building performance models of configurable software systems. Building accurate performance models through direct measurement is challenging due to the large configuration space and environmental factors. Transfer learning aims to address this by leveraging knowledge from performance models built for related systems or environments to improve the learning process for new systems and environments. The goal is to develop techniques that allow predicting and optimizing performance for configurable systems across changing environments.
Continuous Architecting of Stream-Based Systems – CHOOSE
Pooyan Jamshidi CHOOSE Talk 2016-11-01
Big data architectures have been gaining momentum in recent years. For instance, Twitter uses stream processing frameworks like Storm to analyse billions of tweets per minute and learn the trending topics. However, architectures that process big data involve many different components interconnected via semantically different connectors, making it a difficult task for software architects to refactor the initial designs. As an aid to designers and developers, we developed OSTIA (On-the-fly Static Topology Inference Analysis), which allows: (a) visualizing big data architectures for the purpose of design-time refactoring while maintaining constraints that would only be evaluated at later stages such as deployment and run-time; (b) detecting the occurrence of common anti-patterns across big data architectures; (c) exploiting software verification techniques on the elicited architectural models. In the lecture, OSTIA will be shown on three industrial-scale case studies.
See: http://www.choose.s-i.ch/events/jamshidi-2016/
Learning Software Performance Models for Dynamic and Uncertain Environments – Pooyan Jamshidi
This document provides background on Pooyan Jamshidi's research related to learning software performance models for dynamic and uncertain environments. It summarizes his past work developing techniques for modeling and optimizing performance across different systems and environments, including using transfer learning to reuse performance data from related sources to build more accurate models with fewer measurements. It also outlines opportunities for using transfer learning to adapt performance models to new environments and systems.
Configuration Optimization for Big Data Software – Pooyan Jamshidi
The document discusses configuration optimization for big data software using an approach developed in the DICE project funded by the European Union's Horizon 2020 program. It describes optimizing configurations for Apache Storm and Cassandra to significantly reduce configuration time. Experiments showed large performance variations between configurations and that default settings often performed poorly compared to optimized settings. Tuning on one version did not guarantee good performance on other versions, but transferring more observations from other versions improved performance, though with diminishing returns due to increased optimization costs.
This document discusses the history and implementation of regression tree models. It begins by covering early tree models from the 1960s-1980s like CART and GUIDE. It then discusses more modern unified frameworks using modular packages in R like partykit and mob models. The document provides an example using a Bradley-Terry tree to model preferences from paired comparisons. It concludes by discussing potential extensions to deep learning methods.
This document summarizes an adaptive checkpointing and replication strategy to tolerate faults in computational grids. It proposes maintaining a balance between the overheads of replication and checkpointing. Tasks are replicated on up to three resources based on each resource's probability of permanent failure. Checkpoints are taken adaptively based on the probability of recoverable failure. If a resource fails permanently, the task resumes from the last checkpoint. If a failure is recoverable, the task resumes on the same resource. This strategy aims to minimize resource wastage from replication while utilizing different resource speeds.
The Buffer Allocation Problem is an important research issue in manufacturing system design. The objective of this paper is to find the optimum buffer allocation for a closed queuing network with multiple servers at each node. The sum of buffers in a closed queuing network is constant. An attempt is made to find the optimum number of pallets required to maximize the throughput of a manufacturing system that has pre-specified space for allocating pallets. Expanded Mean Value Analysis is used to evaluate the performance of the closed queuing network, and Particle Swarm Optimization is used as a generative technique to optimize the buffer allocation. Numerical experiments are shown to demonstrate the effectiveness of the procedure.
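The Expanded MVA used in the paper extends the classic algorithm to multi-server nodes; as a simpler illustration of the underlying recursion, here is exact Mean Value Analysis for a closed network of single-server queues. The two-node service demands and the pallet count below are invented for the example.

```python
def mva(service, visits, n_jobs):
    """Exact MVA for a closed product-form queueing network with
    single-server FCFS nodes: returns throughput and mean queue lengths."""
    K = len(service)
    q = [0.0] * K                                   # queue lengths with 0 jobs
    X = 0.0
    for n in range(1, n_jobs + 1):
        # arrival theorem: an arriving job sees the network with n - 1 jobs
        r = [service[i] * (1.0 + q[i]) for i in range(K)]
        X = n / sum(visits[i] * r[i] for i in range(K))   # Little's law on the cycle
        q = [X * visits[i] * r[i] for i in range(K)]      # Little's law per node
    return X, q

# two balanced nodes, one visit each, three pallets circulating
X3, q3 = mva([1.0, 1.0], [1.0, 1.0], 3)
```

For this balanced example the recursion gives a throughput of n/(n+1), i.e. 0.75 with three circulating jobs, and the queue lengths always sum to the population, which is a handy sanity check.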
AI optimizing HPC simulations (presentation from 6th EULAG Workshop) – byteLAKE
See our presentation from the 6th International EULAG Users Workshop. We talked about taking HPC to "Industry 4.0" by implementing smart techniques that optimize codes for performance and energy consumption. The presentation explains how machine learning can dynamically optimize HPC simulations and introduces byteLAKE's software auto-tuning solution.
Find out more about byteLAKE at: www.byteLAKE.com
Incremental collaborative filtering via evolutionary co-clustering – Allen Wu
This document summarizes an incremental collaborative filtering approach via evolutionary co-clustering. It introduces incremental collaborative filtering and discusses existing approaches. It then proposes an incremental evolutionary co-clustering method that assigns new users and items to clusters during the online phase to make more accurate predictions. The method uses an ensemble of co-clustering solutions and an evolutionary algorithm to improve performance. Experimental results on a movie rating dataset show the proposed approach achieves better accuracy than other incremental collaborative filtering methods.
The title alone makes this paper look interesting. The paper introduced at today's deep-learning paper-reading group is DEAR: Deep Reinforcement Learning for Online Advertising Impression in Recommender Systems, an online recommender system that uses reinforcement learning. A few pieces of information are not publicly disclosed, but the ideas are more than enough to make it an enjoyable talk. Kim Chang-yeon of the Fundamentals team kindly prepared a detailed, in-depth review, starting from the basic concepts of reinforcement learning. Thank you in advance for your interest!
One more note: the deep-learning paper-reading group runs an open KakaoTalk listeners' room. Due to a recent rise in spam-bot accounts, the room is now password-protected. Please also check out the listeners' room!
Room link: https://open.kakao.com/o/gp6GHMMc
Room password: 0501
Focal Loss for Dense Object Detection proposes a novel focal loss function to address the extreme foreground-background class imbalance encountered in training dense object detectors. The focal loss focuses training on hard examples and prevents easy negatives from overwhelming the detector. RetinaNet, a simple dense detector designed with a ResNet-FPN backbone and focal loss, achieves state-of-the-art accuracy while running faster than existing two-stage detectors. Extensive experiments demonstrate the focal loss enables training highly accurate dense detectors on datasets with vast numbers of background examples like COCO.
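The loss itself is compact enough to write down directly. Here is a minimal binary-classification sketch of FL(pt) = -alpha_t (1 - pt)^gamma log(pt); the probabilities below are illustrative, not taken from the paper's experiments.

```python
import math

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """p: predicted foreground probability; y: 1 = foreground, 0 = background."""
    pt = p if y == 1 else 1.0 - p            # probability assigned to the true class
    at = alpha if y == 1 else 1.0 - alpha    # class-balancing weight
    return -at * (1.0 - pt) ** gamma * math.log(pt)

# an easy, well-classified background example contributes almost nothing,
# while a confidently wrong one keeps a large loss
easy = focal_loss(0.01, 0)   # detector is (correctly) sure this is background
hard = focal_loss(0.99, 0)   # detector is confidently wrong
```

Setting gamma = 0 (and alpha_t = 1) recovers plain cross-entropy, which is how the paper frames the modulating factor: it only changes how much each example counts, not what the loss measures.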
HyperLogLog in Practice: Algorithmic Engineering of a State of The Art Cardin... – Sunny Kr
Cardinality estimation has a wide range of applications and is of particular importance in database systems. Various algorithms have been proposed in the past, and the HyperLogLog algorithm is one of them.
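For reference, the core of the raw HyperLogLog estimator fits in a few lines. This toy sketch uses an MD5-based hash for determinism and omits the small- and large-range corrections that the paper's engineering work is actually about.

```python
import hashlib

def hll_estimate(items, b=10):
    """Raw HyperLogLog estimate with m = 2**b registers (no bias corrections)."""
    m = 1 << b
    regs = [0] * m
    for it in items:
        # 64-bit deterministic hash of the item
        h = int.from_bytes(hashlib.md5(str(it).encode()).digest()[:8], "big")
        idx = h >> (64 - b)                   # first b bits choose a register
        rest = h & ((1 << (64 - b)) - 1)      # remaining 64 - b bits
        # rank = position of the leftmost 1-bit in the remaining bits
        rank = (64 - b) - rest.bit_length() + 1
        regs[idx] = max(regs[idx], rank)
    alpha = 0.7213 / (1 + 1.079 / m)          # bias constant for m >= 128
    return alpha * m * m / sum(2.0 ** -r for r in regs)

est = hll_estimate(range(10000), b=10)        # ~10000, within a few percent
```

With 1024 registers the standard error is about 1.04/sqrt(1024), roughly 3%, which is why the estimate lands close to the true cardinality despite using only a kilobyte of register state.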
A scalable collaborative filtering framework based on co-clustering – AllenWu
This document proposes a scalable collaborative filtering framework based on co-clustering. It introduces collaborative filtering and discusses limitations of existing methods. The framework uses co-clustering to simultaneously obtain user and item neighborhoods and generate predictions based on average ratings. Experimental results show the approach provides high quality predictions with lower computational cost than other methods.
This document presents a comparative study of two genetic-algorithm-based task allocation models in distributed computing systems. The proposed model aims to minimize turnaround time, whereas the previous model aimed to maximize reliability. Both models are implemented on two example cases: the minimum-turnaround-time model finds an allocation with a turnaround of 14 units, versus 20 units for the maximum-reliability model's allocation, at the cost of slightly lower reliability. In conclusion, minimizing turnaround time leads to slightly reduced reliability compared to maximizing reliability.
This document summarizes a research paper that proposes a new heuristic called PAUSE for investigating the producer-consumer problem in distributed systems. The paper motivates the need to study this problem, describes PAUSE's approach of using compact configurations and decentralized components, outlines its implementation in Lisp and Java, and presents experimental results showing PAUSE outperforms previous methods. Related work investigating similar challenges is also discussed.
Towards a Unified Data Analytics Optimizer with Yanlei Diao – Databricks
Today’s big data analytics systems are best effort only: despite the wide adoption, they still lack the ability to take user monetary constraints and performance goals, and automatically configure an analytic job to achieve those goals. Our work aims to take a step further towards building a new data analytics optimizer that works for arbitrary dataflow programs and determines the job configuration in an automated manner based on user objectives regarding latency, throughput, monetary cost, etc.
At the core of the optimizer are a principled multi-objective optimization framework that enables one to explore the tradeoffs between different objectives, and a deep learning-based modeling approach that can learn a model for each user objective as complex as necessary for the user computing environment. Using both SQL-like and machine learning jobs in Spark, we show that our techniques can learn a model of each objective with high accuracy, and the multi-objective optimizer can automatically recommend new configurations that significantly improve performance from the configurations manually set by engineers.
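The multi-objective side can be illustrated by extracting the Pareto-optimal configurations from a set of measurements, which is the kind of tradeoff set such an optimizer explores. The (latency, cost) pairs below are invented for the example.

```python
def pareto_front(points):
    """Keep the points not dominated on (latency, cost), both to be minimised."""
    def dominated(p):
        # p is dominated if some distinct point is at least as good in both
        # objectives (distinct + weakly better implies strictly better somewhere)
        return any(q != p and q[0] <= p[0] and q[1] <= p[1] for q in points)
    return [p for p in points if not dominated(p)]

measured = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 5)]   # (latency, cost) per configuration
front = pareto_front(measured)
```

Here (3, 4) and (5, 5) drop out because cheaper-and-faster alternatives exist; the surviving points are exactly the tradeoffs a user would choose among given a latency or monetary constraint.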
Build your own Convolutional Neural Network (CNN) – Hichem Felouat
This document provides an overview of building and training a convolutional neural network (CNN) from scratch in Keras and TensorFlow. It discusses CNN architecture including convolutional layers, pooling layers, and fully connected layers. It also covers techniques for avoiding overfitting such as regularization, dropout, data augmentation, early stopping, and callbacks. The document concludes with instructions on how to save and load a trained CNN model.
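The layer types the slides walk through can also be written out with no framework at all. Below is a dependency-free sketch of a valid cross-correlation ("convolution" in the deep-learning sense) and non-overlapping 2x2 max-pooling on a toy 4x4 input; in a real model these would be Keras `Conv2D` and `MaxPooling2D` layers with learned kernels.

```python
def conv2d(img, kernel):
    """Valid cross-correlation of a 2-D input with a 2-D kernel (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(kernel[i][j] * img[r + i][c + j]
                 for i in range(kh) for j in range(kw))
             for c in range(len(img[0]) - kw + 1)]
            for r in range(len(img) - kh + 1)]

def maxpool2(fm):
    """Non-overlapping 2x2 max-pooling of a feature map."""
    return [[max(fm[r][c], fm[r][c + 1], fm[r + 1][c], fm[r + 1][c + 1])
             for c in range(0, len(fm[0]) - 1, 2)]
            for r in range(0, len(fm) - 1, 2)]

img = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
spike = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]   # identity-like kernel: picks the centre pixel
feat = conv2d(img, spike)                    # -> [[6, 7], [10, 11]]
pooled = maxpool2(feat)                      # -> [[11]]
```

The shapes follow the usual rules: a 3x3 valid convolution shrinks 4x4 to 2x2, and 2x2 pooling halves each spatial dimension again.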
Adversarial Reinforced Learning for Unsupervised Domain Adaptation – taeseon ryu
Hello, this is the deep-learning paper-reading group. Today's uploaded review video covers the paper Adversarial Reinforced Learning for Unsupervised Domain Adaptation, presented at WACV 2021.
Automating data classification requires large amounts of training data. Domain adaptation, which reuses a model trained on labeled data and applies it to a new domain, has therefore attracted a great deal of attention.
The paper makes three main contributions. First, it proposes a framework that performs domain adaptation in an unsupervised manner using a GAN; here, a reinforcement-learning model is used to select the optimal feature pairs between the source and target domains. Second, to find the most suitable features in the unlabeled target domain, the authors develop a policy that uses the correlation between source and target as the reward. Finally, the proposed adversarial reinforcement-learning model improves performance over the state of the art by searching for feature pairs that minimize the distance between the source and target domains and by learning the alignment of each domain's distance distributions.
Lee Geun-bae of the Fundamentals team kindly provided a detailed review of the paper!
This document provides an overview of VAE-type deep generative models, especially RNNs combined with VAEs. It begins with notations and abbreviations used. The agenda then covers the mathematical formulation of generative models, the Variational Autoencoder (VAE), variants of VAE that combine it with RNNs (VRAE, VRNN, DRAW), a Chainer implementation of Convolutional DRAW, other related models (Inverse DRAW, VAE+GAN), and concludes with challenges of VAE-like generative models.
NETWORK-AWARE DATA PREFETCHING OPTIMIZATION OF COMPUTATIONS IN A HETEROGENEOU... – IJCNCJournal
The rapid development of diverse computer architectures and hardware accelerators means that the design of parallel systems faces new problems resulting from their heterogeneity. Our implementation of a parallel system called KernelHive allows applications to run efficiently in a heterogeneous environment consisting of multiple collections of nodes with different types of computing devices. The execution engine of the system is open to optimizer implementations focusing on various criteria. In this paper, we propose a new optimizer for KernelHive that utilizes distributed databases and performs data prefetching to optimize the execution time of applications that process large input data. Employing a versatile data management scheme that allows combining various distributed data providers, we propose using NoSQL databases for our purposes. We support our solution with results of experiments with real executions of our OpenCL implementation of a regular-expression matching application in various hardware configurations. Additionally, we propose a network-aware scheduling scheme for selecting hardware for the proposed optimizer and present simulations that demonstrate its advantages.
Co-clustering of multi-view datasets: a parallelizable approach – Allen Wu
This document summarizes a research paper on co-clustering multi-view datasets using a parallelizable approach called MVSIM. MVSIM computes co-similarity matrices for related objects across multiple views or relation matrices. It creates a learning network matching the relational structure and aggregates the similarity matrices using a damping factor. Experiments show MVSIM outperforms single-view and other multi-view clustering methods on document and newsgroup datasets, and its performance decreases slightly but computation time reduces significantly when the data is split across more views.
Deep learning for molecules, introduction to Chainer Chemistry – Kenta Oono
1) The document introduces machine learning and deep learning techniques for predicting chemical properties, including rule-based approaches versus learning-based approaches using neural message passing algorithms.
2) It discusses several graph neural network models like NFP, GGNN, WeaveNet and SchNet that can be applied to molecular graphs to predict characteristics. These models update atom representations through message passing and graph convolution operations.
3) Chainer Chemistry is introduced as a deep learning framework that can be used with these graph neural network models for chemical property prediction tasks. Examples of tasks include drug discovery and molecular generation.
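The message-passing update that NFP/GGNN-style models share can be sketched without any framework. The features, the weight matrix, and the toy molecule below are all invented (in the real models the weights are learned); the sketch only shows the shared aggregate-then-transform pattern. A useful sanity check is that symmetry-equivalent atoms end up with identical representations.

```python
def message_pass(h, adj, W, rounds=2):
    """h: per-atom feature vectors; adj: adjacency list; W: shared weight matrix."""
    d = len(h[0])
    for _ in range(rounds):
        nxt = []
        for v in range(len(h)):
            # message step: combine the atom's own features with its neighbours'
            agg = [h[v][k] + sum(h[u][k] for u in adj[v]) for k in range(d)]
            # update step: shared linear transform followed by ReLU
            nxt.append([max(0.0, sum(W[i][k] * agg[k] for k in range(d)))
                        for i in range(d)])
        h = nxt
    return h

# a 3-atom path graph O-C-O: atoms 0 and 2 are symmetry-equivalent
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
adj = [[1], [0, 2], [1]]
W = [[0.5, -0.2], [0.1, 0.3]]
out = message_pass(feats, adj, W)
```

After any number of rounds the two terminal atoms carry the same representation while the centre atom differs, mirroring how graph convolutions respect molecular symmetry.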
- The document describes a reinforcement learning method using deep neural networks called DQN that was able to learn successful policies to play 49 Atari 2600 games directly from raw pixel inputs, outperforming prior methods on 43 games.
- DQN trained large neural networks using a reinforcement learning signal and stochastic gradient descent in a stable manner. Its performance was comparable to human-level performance on over half the games.
- The method took high-dimensional video game inputs and used a convolutional neural network architecture to learn policies without additional domain knowledge beyond the inputs, actions, and rewards.
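The two stabilising ingredients the summary mentions, experience replay and bootstrapped Q-targets, can be shown in tabular miniature. The 5-state chain environment below is invented and stands in for the Atari emulator; a table replaces the convolutional network, but the replay-buffer update is the same idea.

```python
import random

random.seed(0)
GOAL, ACTIONS = 4, (1, -1)          # move right / move left on a 5-state chain
Q = {(s, a): 0.0 for s in range(GOAL + 1) for a in ACTIONS}
replay = []                         # experience replay buffer

def step(s, a):
    s2 = min(max(s + a, 0), GOAL)
    return s2, (1.0 if s2 == GOAL else 0.0), s2 == GOAL

for _ in range(300):                # episodes
    s, done = 0, False
    while not done:
        if random.random() < 0.3:                        # epsilon-greedy: explore
            a = random.choice(ACTIONS)
        else:                                            # ... or exploit
            a = max(ACTIONS, key=lambda a_: Q[(s, a_)])
        s2, r, done = step(s, a)
        replay.append((s, a, r, s2, done))
        # learn from a random minibatch of past transitions, not just the latest one
        for ps, pa, pr, ps2, pdone in random.sample(replay, min(8, len(replay))):
            target = pr if pdone else pr + 0.9 * max(Q[(ps2, b)] for b in ACTIONS)
            Q[(ps, pa)] += 0.5 * (target - Q[(ps, pa)])
        s = s2

# greedy rollout with the learned values
s, steps = 0, 0
while s != GOAL and steps < 10:
    s, _, _ = step(s, max(ACTIONS, key=lambda a_: Q[(s, a_)]))
    steps += 1
```

Sampling updates from the whole buffer breaks the correlation between consecutive transitions, which is one of the tricks that let DQN train large networks stably.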
Microservices Architecture Enables DevOps: Migration to a Cloud-Native Archit... – Pooyan Jamshidi
A look at the searches related to the term “microservices” on Google Trends revealed that the top searches are now technology driven. This implies that the time of general search terms such as “What is microservices?” has now long passed. Not only are software vendors (for example, IBM and Microsoft) using microservices and DevOps practices, but also content providers (for example, Netflix and the BBC) have adopted and are using them.
I report on experiences and lessons learned during incremental migration and architectural refactoring of a commercial mobile back end as a service to microservices architecture. I explain how we adopted DevOps and how this facilitated a smooth migration towards Microservices architecture.
Cloud Migration Patterns: A Multi-Cloud Architectural Perspective – Pooyan Jamshidi
Cloud migration requires an engineered, verifiable, measurable, transparent, and repeatable approach rather than an ad-hoc approach based on trial and error.
We describe a comprehensive set of (multi-)cloud migration patterns from an architectural perspective. In this work, we focus on application components and their migration to multi-cloud environments. We define and characterize the patterns with concrete usage scenarios. We also describe the process for migration pattern selection, composition, and extension.
논문 제목부터 재미있어 보이는 주제 입니다. 오늘 딥러닝 논문읽기 모임에서 소개드릴 논문은 DEAR: Deep Reinforcement Learning for Online Advertising Impression in Recommender Systems, 강화학습을 이용한 온라인 추천 시스템 입니다. 비공개 된 정보들이 몇가지가 있지만, 아이디어면에서 여러분들이 충분히 재밌게 들으실수 있습니다. 강화학습의 기본적인 개념부터,
논문에 대한 디테일하고 깊이 있는 리뷰를
펀디멘탈팀 김창연 님이 도와주셨습니다!
오늘도 많은 관심 미리 감사드립니다!
추가로 .. 딥러닝 논문읽기 모임은 청강방 오픈채팅 방을 운영하고 있습니다. 최근 악성 홍보 봇 계정이 늘어나 방을 비밀번호를 걸어두게 되었습니다
딥러닝 청강방도 많은 관심 부탁드립니다!
청강방 링크 : https://open.kakao.com/o/gp6GHMMc
청강방 비밀번호 : 0501
Focal Loss for Dense Object Detection proposes a novel focal loss function to address the extreme foreground-background class imbalance encountered in training dense object detectors. The focal loss focuses training on hard examples and prevents easy negatives from overwhelming the detector. RetinaNet, a simple dense detector designed with a ResNet-FPN backbone and focal loss, achieves state-of-the-art accuracy while running faster than existing two-stage detectors. Extensive experiments demonstrate the focal loss enables training highly accurate dense detectors on datasets with vast numbers of background examples like COCO.
HyperLogLog in Practice: Algorithmic Engineering of a State of The Art Cardin...Sunny Kr
Cardinality estimation has a wide range of applications and
is of particular importance in database systems. Various
algorithms have been proposed in the past, and the HyperLogLog algorithm is one of them
A scalable collaborative filtering framework based on co clusteringAllenWu
This document proposes a scalable collaborative filtering framework based on co-clustering. It introduces collaborative filtering and discusses limitations of existing methods. The framework uses co-clustering to simultaneously obtain user and item neighborhoods and generate predictions based on average ratings. Experimental results show the approach provides high quality predictions with lower computational cost than other methods.
This document presents a comparative study of two genetic algorithm-based task allocation models in distributed computing systems. One model aims to minimize turnaround time, where the earlier model aimed to maximize reliability. Both models are implemented on two example cases: the minimum-turnaround-time model finds an allocation with a turnaround of 14 time units, versus 20 units for the maximum-reliability model's allocation, at the cost of slightly lower reliability. In conclusion, minimizing turnaround time leads to slightly reduced reliability compared to maximizing reliability directly.
This document summarizes a research paper that proposes a new heuristic called PAUSE for investigating the producer-consumer problem in distributed systems. The paper motivates the need to study this problem, describes PAUSE's approach of using compact configurations and decentralized components, outlines its implementation in Lisp and Java, and presents experimental results showing PAUSE outperforms previous methods. Related work investigating similar challenges is also discussed.
Towards a Unified Data Analytics Optimizer with Yanlei Diao (Databricks)
Today’s big data analytics systems are best effort only: despite their wide adoption, they still lack the ability to take user monetary constraints and performance goals into account and automatically configure an analytic job to achieve those goals. Our work aims to take a step further towards building a new data analytics optimizer that works for arbitrary dataflow programs and determines the job configuration in an automated manner based on user objectives regarding latency, throughput, monetary cost, etc.
At the core of the optimizer are a principled multi-objective optimization framework that enables one to explore the tradeoffs between different objectives, and a deep learning-based modeling approach that can learn a model for each user objective as complex as necessary for the user computing environment. Using both SQL-like and machine learning jobs in Spark, we show that our techniques can learn a model of each objective with high accuracy, and the multi-objective optimizer can automatically recommend new configurations that significantly improve performance from the configurations manually set by engineers.
Build your own Convolutional Neural Network CNN (Hichem Felouat)
This document provides an overview of building and training a convolutional neural network (CNN) from scratch in Keras and TensorFlow. It discusses CNN architecture including convolutional layers, pooling layers, and fully connected layers. It also covers techniques for avoiding overfitting such as regularization, dropout, data augmentation, early stopping, and callbacks. The document concludes with instructions on how to save and load a trained CNN model.
Adversarial Reinforced Learning for Unsupervised Domain Adaptation (taeseon ryu)
Hello, this is the Deep Learning Paper Reading Group. Today's uploaded review video covers the paper Adversarial Reinforced Learning for Unsupervised Domain Adaptation, presented at WACV 2021.
Automating data classification requires large amounts of training data. For that reason, domain adaptation, which reuses a model trained on labeled data in a new domain, has attracted a great deal of attention.
The paper makes three main contributions. First, it proposes a framework for unsupervised domain adaptation using a GAN, in which a reinforcement learning model selects the optimal feature pairs between the source and target domains. Second, to identify the most suitable features in the unlabeled target domain, it develops a policy that uses the source-target correlation as the reward. Finally, the proposed adversarial reinforcement learning model improves on the state of the art by jointly searching for feature pairs that minimize the source-target gap and learning to align the distance distributions of the two domains.
Lee Geun-bae of the Fundamentals team contributed a detailed review of the paper!
This document provides an overview of VAE-type deep generative models, especially RNNs combined with VAEs. It begins with notations and abbreviations used. The agenda then covers the mathematical formulation of generative models, the Variational Autoencoder (VAE), variants of VAE that combine it with RNNs (VRAE, VRNN, DRAW), a Chainer implementation of Convolutional DRAW, other related models (Inverse DRAW, VAE+GAN), and concludes with challenges of VAE-like generative models.
NETWORK-AWARE DATA PREFETCHING OPTIMIZATION OF COMPUTATIONS IN A HETEROGENEOU... (IJCNCJournal)
Rapid development of diverse computer architectures and hardware accelerators has meant that the design of parallel systems faces new problems resulting from their heterogeneity. Our implementation of a parallel system called KernelHive allows applications to run efficiently in a heterogeneous environment consisting of multiple collections of nodes with different types of computing devices. The execution engine of the system is open to optimizer implementations focusing on various criteria. In this paper, we propose a new optimizer for KernelHive that utilizes distributed databases and performs data prefetching to optimize the execution time of applications that process large input data. Employing a versatile data management scheme, which allows combining various distributed data providers, we propose using NoSQL databases for our purposes. We support our solution with results of experiments with real executions of our OpenCL implementation of a regular expression matching application in various hardware configurations. Additionally, we propose a network-aware scheduling scheme for selecting hardware for the proposed optimizer and present simulations that demonstrate its advantages.
Co-clustering of multi-view datasets: a parallelizable approach (Allen Wu)
This document summarizes a research paper on co-clustering multi-view datasets using a parallelizable approach called MVSIM. MVSIM computes co-similarity matrices for related objects across multiple views or relation matrices. It creates a learning network matching the relational structure and aggregates the similarity matrices using a damping factor. Experiments show MVSIM outperforms single-view and other multi-view clustering methods on document and newsgroup datasets, and its performance decreases slightly but computation time reduces significantly when the data is split across more views.
Deep learning for molecules, introduction to Chainer Chemistry (Kenta Oono)
1) The document introduces machine learning and deep learning techniques for predicting chemical properties, including rule-based approaches versus learning-based approaches using neural message passing algorithms.
2) It discusses several graph neural network models like NFP, GGNN, WeaveNet and SchNet that can be applied to molecular graphs to predict characteristics. These models update atom representations through message passing and graph convolution operations.
3) Chainer Chemistry is introduced as a deep learning framework that can be used with these graph neural network models for chemical property prediction tasks. Examples of tasks include drug discovery and molecular generation.
- The document describes a reinforcement learning method using deep neural networks called DQN that was able to learn successful policies to play 49 Atari 2600 games directly from raw pixel inputs, outperforming prior methods on 43 games.
- DQN trained large neural networks using a reinforcement learning signal and stochastic gradient descent in a stable manner. Its performance was comparable to human-level performance on over half the games.
- The method took high-dimensional video game inputs and used a convolutional neural network architecture to learn policies without additional domain knowledge beyond the inputs, actions, and rewards.
Microservices Architecture Enables DevOps: Migration to a Cloud-Native Archit... (Pooyan Jamshidi)
A look at the searches related to the term “microservices” on Google Trends revealed that the top searches are now technology driven. This implies that the time of general search terms such as “What is microservices?” has now long passed. Not only are software vendors (for example, IBM and Microsoft) using microservices and DevOps practices, but also content providers (for example, Netflix and the BBC) have adopted and are using them.
I report on experiences and lessons learned during incremental migration and architectural refactoring of a commercial mobile back end as a service to microservices architecture. I explain how we adopted DevOps and how this facilitated a smooth migration towards Microservices architecture.
Cloud Migration Patterns: A Multi-Cloud Architectural Perspective (Pooyan Jamshidi)
Cloud migration requires an engineering, verifiable, measurable, transparent and repeatable approach rather than an ad-hoc approach based on trial and error.
We describe a comprehensive set of (multi-)cloud migration patterns from an architectural perspective. In this work, we focus on application components and their migration to the multi-cloud environments. We define and characterize the patterns with concrete usage scenario. We also describe the process for migration pattern selection, composition and extension.
Autonomic Resource Provisioning for Cloud-Based Software (Pooyan Jamshidi)
The Third National Conference on Cloud Computing and Commerce (NC4), for more information please refer to: http://computing.dcu.ie/~pjamshidi/PDF/SEAMS2014.pdf
Towards Quality-Aware Development of Big Data Applications with DICE (Pooyan Jamshidi)
The document summarizes the DICE Horizon 2020 project, which aims to improve quality-aware development of big data applications. The 3-year project involves 9 partners across 7 EU countries. It seeks to shorten development times and reduce costs and quality incidents for big data projects through model-driven engineering and DevOps approaches. The project will demonstrate its techniques on three big data case studies and has milestones to define requirements, provide tools, and define its integrated architecture.
The document describes a configuration optimization tool that aims to automatically optimize the configuration of big data technologies. It does this by running experiments on data intensive applications, measuring performance under different configurations, and using this data to recommend optimal configurations. The tool implements two approaches for optimization - Bayesian optimization and transfer learning. It consists of several components, including an experimental suite to run tests, an optimization module, interfaces to various big data technologies, and a performance repository to store results. The goal is to help users like SMEs reduce the time and cost of testing and configuring big data applications between releases.
Sensitivity Analysis for Building Adaptive Robotic Software (Pooyan Jamshidi)
P. Jamshidi, M. Velez, C. Kästner, N. Siegmund, and P. Kawthekar. Transfer learning for improving model predictions in highly configurable software. Int’l Symp. Software Engineering for Adaptive and Self-Managing Systems (SEAMS), 2017.
This document discusses self-learning cloud controllers that can dynamically scale cloud resources. It notes that current auto-scaling approaches require deep application knowledge and expertise to determine scaling parameters and policies. The paper proposes a type-2 fuzzy logic approach called RobusT2Scale that uses fuzzy rules and monitoring data to determine scaling actions. It aims to handle uncertainty in elastic systems and accommodate different user preferences through fuzzy reasoning over workload and response time data. The approach pre-computes scaling decisions to enable efficient runtime elasticity control. It is evaluated based on its ability to meet an SLA target response time compared to over- and under-provisioning approaches.
An Algorithm For Vector Quantizer Design (Angie Miller)
The document presents an algorithm for designing vector quantizers. The algorithm is efficient, intuitive, and can be used for quantizers with general distortion measures and large block lengths. It is based on Lloyd's approach but does not require differentiation, making it applicable even when the data distribution has discrete components. The algorithm finds quantizers that meet necessary optimality conditions. Examples show it converges well and finds near-optimal quantizers for memoryless Gaussian sources. It is also used successfully to quantize LPC speech parameters with a complicated distortion measure.
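The Lloyd-style iteration at the heart of such designs alternates two steps: partition the samples by nearest codeword, then move each codeword to the centroid of its cell. A one-dimensional sketch under squared-error distortion (the generalized algorithm in the paper handles vectors and arbitrary distortion measures):

```python
def lloyd_quantizer(samples, codebook, iters=20):
    """One-dimensional Lloyd iteration under squared-error distortion.

    Alternates nearest-neighbor partitioning with centroid updates;
    empty cells keep their previous codeword.
    """
    for _ in range(iters):
        cells = {i: [] for i in range(len(codebook))}
        for x in samples:
            i = min(range(len(codebook)), key=lambda j: (x - codebook[j]) ** 2)
            cells[i].append(x)
        codebook = [sum(c) / len(c) if c else codebook[i] for i, c in cells.items()]
    return sorted(codebook)
```

Each step can only decrease the average distortion, which is why the iteration converges to a codebook satisfying the necessary optimality conditions (nearest-neighbor and centroid).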
A simple framework for contrastive learning of visual representations (Devansh16)
Link: https://machine-learning-made-simple.medium.com/learnings-from-simclr-a-framework-contrastive-learning-for-visual-representations-6c145a5d8e99
This paper presents SimCLR: a simple framework for contrastive learning of visual representations. We simplify recently proposed contrastive self-supervised learning algorithms without requiring specialized architectures or a memory bank. In order to understand what enables the contrastive prediction tasks to learn useful representations, we systematically study the major components of our framework. We show that (1) composition of data augmentations plays a critical role in defining effective predictive tasks, (2) introducing a learnable nonlinear transformation between the representation and the contrastive loss substantially improves the quality of the learned representations, and (3) contrastive learning benefits from larger batch sizes and more training steps compared to supervised learning. By combining these findings, we are able to considerably outperform previous methods for self-supervised and semi-supervised learning on ImageNet. A linear classifier trained on self-supervised representations learned by SimCLR achieves 76.5% top-1 accuracy, which is a 7% relative improvement over previous state-of-the-art, matching the performance of a supervised ResNet-50. When fine-tuned on only 1% of the labels, we achieve 85.8% top-5 accuracy, outperforming AlexNet with 100X fewer labels.
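The contrastive objective SimCLR optimizes, NT-Xent (normalized temperature-scaled cross entropy), can be sketched in a few lines for a small batch of 2N augmented views; the function below is an illustrative pure-Python version, whereas the real implementation runs batched on accelerators:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def nt_xent(embeddings, temperature=0.5):
    """NT-Xent loss over 2N embeddings where indices (2k, 2k+1) are positive pairs.

    For each anchor, the positive is its augmented partner; all other
    embeddings in the batch act as negatives.
    """
    n = len(embeddings)
    total = 0.0
    for i in range(n):
        j = i + 1 if i % 2 == 0 else i - 1  # index of i's positive partner
        denom = sum(math.exp(cosine(embeddings[i], embeddings[k]) / temperature)
                    for k in range(n) if k != i)
        pos = math.exp(cosine(embeddings[i], embeddings[j]) / temperature)
        total += -math.log(pos / denom)
    return total / n
```

The loss drops as positive pairs become more similar than negatives, which is the behavior the paper's batch-size and augmentation studies probe.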
Comments: ICML'2020. Code and pretrained models at this https URL
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
Cite as: arXiv:2002.05709 [cs.LG]
(or arXiv:2002.05709v3 [cs.LG] for this version)
Submission history
From: Ting Chen [view email]
[v1] Thu, 13 Feb 2020 18:50:45 UTC (5,093 KB)
[v2] Mon, 30 Mar 2020 15:32:51 UTC (5,047 KB)
[v3] Wed, 1 Jul 2020 00:09:08 UTC (5,829 KB)
This document discusses using unsupervised support vector analysis to increase the efficiency of simulation-based functional verification. It describes applying an unsupervised machine learning technique called support vector analysis to filter redundant tests from a set of verification tests. By clustering similar tests into regions of a similarity metric space, it aims to select the most important tests to verify a design while removing redundant tests, improving verification efficiency. The approach trains an unsupervised support vector model on an initial set of simulated tests and uses it to filter future tests by comparing them to support vectors that define regions in the similarity space.
A Combinatorial View Of The Service Rates Of Codes Problem, Its Equivalence T...Whitney Anderson
This document discusses a new technique for analyzing the service rates of coded storage systems by representing codes as graphs. It shows that the service rate problem is equivalent to the fractional matching problem in graph theory. This allows bounds on service capacity to be derived from graph properties like matching number and vertex cover number. If the graph is bipartite, these bounds are equal and the exact service capacity is obtained. The technique is applied to determine the service capacity of binary simplex codes, whose graph representation is shown to be bipartite. It is also shown that batch codes can be viewed as a special case of the service rate problem.
This document proposes a novel framework called smooth sparse coding for learning sparse representations of data. It incorporates feature similarity or temporal information present in data sets via non-parametric kernel smoothing. The approach constructs codes that represent neighborhoods of samples rather than individual samples, leading to lower reconstruction error. It also proposes using marginal regression rather than lasso for obtaining sparse codes, providing a dramatic speedup of up to two orders of magnitude without sacrificing accuracy. The document contributes a framework for incorporating domain information into sparse coding, sample complexity results for dictionary learning using smooth sparse coding, an efficient marginal regression training procedure, and successful application to classification tasks with improved accuracy and speed.
Approaches to online quantile estimation (Data Con LA)
Data Con LA 2020
Description
This talk will explore and compare several compact data structures for estimation of quantiles on streams, including a discussion of how they balance accuracy against computational resource efficiency. A new approach providing more flexibility in specifying how computational resources should be expended across the distribution will also be explained. Quantiles (e.g., median, 99th percentile) are fundamental summary statistics of one-dimensional distributions. They are particularly important for SLA-type calculations and characterizing latency distributions, but unlike their simpler counterparts such as the mean and standard deviation, their computation is somewhat more expensive. The increasing importance of stream processing (in observability and other domains) and the impossibility of exact online quantile calculation together motivate the construction of compact data structures for estimation of quantiles on streams. In this talk we will explore and compare several such data structures (e.g., moment-based, KLL sketch, t-digest) with an eye towards how they balance accuracy against resource efficiency, theoretical guarantees, and desirable properties such as mergeability. We will also discuss a recent variation of the t-digest which provides more flexibility in specifying how computational resources should be expended across the distribution. No prior knowledge of the subject is assumed. Some familiarity with the general problem area would be helpful but is not required.
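The sketches the talk compares (moment-based, KLL, t-digest) are far more accurate per byte than naive approaches, but the baseline they improve on is easy to state: keep a fixed-size uniform sample of the stream and read quantiles off it. An illustrative reservoir-sampling version (not one of the sketches from the talk):

```python
import random

class ReservoirQuantile:
    """Naive streaming quantile estimator via reservoir sampling (Algorithm R).

    Memory is bounded by `capacity` regardless of stream length; accuracy
    is limited by sampling error, which is what real sketches improve on.
    """
    def __init__(self, capacity=1000, seed=0):
        self.capacity = capacity
        self.sample = []
        self.n = 0
        self.rng = random.Random(seed)

    def update(self, x):
        self.n += 1
        if len(self.sample) < self.capacity:
            self.sample.append(x)
        else:
            j = self.rng.randrange(self.n)  # keep x with probability capacity/n
            if j < self.capacity:
                self.sample[j] = x

    def query(self, q):
        s = sorted(self.sample)
        return s[min(int(q * len(s)), len(s) - 1)]
```

Unlike t-digest or KLL, this gives uniform accuracy across the distribution and no mergeability guarantees, which motivates the structures discussed in the talk.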
Speaker
Joe Ross, Splunk, Principal Data Scientist
This document discusses the use of block coordinate descent (BCD) for training convolutional neural networks on computer vision tasks. The authors trained a CNN on MNIST and FashionMNIST datasets using both BCD and SGD. They found that BCD achieved slightly better accuracy than SGD (0.1% on average) when optimizing 2500 parameters per batch with a batch size of 50. BCD requires smaller batch sizes for stochasticity to improve convergence. While BCD did not significantly outperform SGD in terms of speed or accuracy on CPU, the authors believe improvements to the BCD implementation could make it faster than SGD, especially on GPUs.
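Block coordinate descent with block size one can be illustrated on a least-squares objective, where each coordinate update has a closed form; the study above applies the same idea to CNN parameters, 2500 at a time, so this is only a minimal sketch of the optimization pattern:

```python
def bcd_least_squares(A, b, iters=100):
    """Cyclic coordinate descent (block size 1) on f(x) = ||Ax - b||^2.

    Each pass holds all coordinates but one fixed and solves the
    resulting one-dimensional quadratic exactly.
    """
    rows, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        for j in range(n):
            # residual with coordinate j's contribution removed
            r = [b[i] - sum(A[i][k] * x[k] for k in range(n) if k != j)
                 for i in range(rows)]
            num = sum(A[i][j] * r[i] for i in range(rows))
            den = sum(A[i][j] ** 2 for i in range(rows))
            x[j] = num / den if den else 0.0
    return x
```

Because each one-dimensional subproblem is solved exactly, the objective never increases, mirroring the monotone-descent property that makes BCD attractive for the large per-batch blocks described above.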
This document describes using a Markov logic network (MLN) to model structural constraints for multipartite entity resolution across multiple collections. The MLN combines first-order logic rules with weights learned from data. For bipartite resolution between two collections, the MLN expresses constraints for similarity, cardinality, preference, and global matching. For multipartite resolution of more than two collections, it adds rules for cross-collection transitivity grounded on observed features rather than predicted matches. Experiments on real datasets of cameras and phones spanning four collections validate the MLN approach and contributions of its components.
The document discusses using clustering models like subtractive fuzzy clustering (SFC) and fuzzy c-means clustering (FCM) to generate an adaptive neuro-fuzzy inference system (ANFIS) for medical diagnoses. Experimental results on medical diagnosis datasets show that ANFIS models using SFC and FCM clustering (ANFIS-SFC and ANFIS-FCM) had better average training and checking errors compared to ANFIS without clustering. Specifically, ANFIS-SFC performed best using backpropagation learning, while ANFIS-FCM performed best using a hybrid learning model. Clustering the datasets without ANFIS was also able to identify different disease clusters.
This document summarizes an academic paper that proposes modifying well-known local linear models for system identification by replacing their original recursive learning rules with outlier-robust variants based on M-estimation. It describes three existing local linear models - local linear map (LLM), radial basis function network (RBFN), and local model network (LMN) - and then introduces the concept of M-estimation as a way to make the learning rules of these models more robust to outliers. The performance of the proposed outlier-robust variants is evaluated on three benchmark datasets and is found to provide considerable improvement in the presence of outliers compared to the original models.
Adaptive check-pointing and replication strategy to tolerate faults in comput... (IOSR Journals)
This document summarizes an adaptive checkpointing and replication strategy to tolerate faults in computational grids. It proposes maintaining a balance between the overheads of replication and checkpointing. Tasks are replicated on up to three resources based on each resource's probability of permanent failure. Checkpoints are taken adaptively based on the probability of recoverable failure. If a resource fails permanently, the task resumes from the last checkpoint. If a failure is recoverable, the task resumes on the same resource. This strategy aims to minimize resource wastage from replication while utilizing different resource speeds.
Scikit-learn is a popular machine learning library for Python that provides simple and efficient tools for data mining and data analysis. It includes algorithms for classification, regression, clustering and dimensionality reduction. The scikit-learn API is designed for consistency, with common estimator, predictor and transformer interfaces that allow algorithms to be used interchangeably. This standardized interface helps users easily try different algorithms and preprocessing techniques for their machine learning tasks.
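The consistency referred to above boils down to a handful of conventions: `fit` returns `self`, learned attributes carry a trailing underscore, and `predict`/`transform` only read what `fit` stored. A toy estimator and transformer following those conventions, written without depending on scikit-learn itself:

```python
class MeanRegressor:
    """Minimal estimator in the scikit-learn style: always predicts the training mean."""
    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)   # trailing underscore marks a learned attribute
        return self                    # fit returns self, enabling chaining

    def predict(self, X):
        return [self.mean_ for _ in X]

class CenterScaler:
    """Minimal transformer: fit() learns a statistic, transform() applies it."""
    def fit(self, X):
        self.mu_ = sum(X) / len(X)
        return self

    def transform(self, X):
        return [x - self.mu_ for x in X]
```

Because every estimator exposes the same surface, swapping one algorithm for another is a one-line change, which is exactly the interchangeability the summary describes.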
Parallel Batch-Dynamic Graphs: Algorithms and Lower Bounds (Subhajit Sahu)
Highlighted notes on Parallel Batch-Dynamic Graphs: Algorithms and Lower Bounds.
While doing research work under Prof. Kishore Kothapalli.
Laxman Dhulipala, David Durfee, Janardhan Kulkarni, Richard Peng, Saurabh Sawlani, Xiaorui Sun:
Parallel Batch-Dynamic Graphs: Algorithms and Lower Bounds. SODA 2020: 1300-1319
In this paper we study the problem of dynamically maintaining graph properties under batches of edge insertions and deletions in the massively parallel model of computation. In this setting, the graph is stored on a number of machines, each having space strongly sublinear with respect to the number of vertices, that is, n^ε for some constant 0 < ε < 1. Our goal is to handle batches of updates and queries where the data for each batch fits onto one machine in constant rounds of parallel computation, as well as to reduce the total communication between the machines. This objective corresponds to the gradual buildup of databases over time, while the goal of obtaining constant rounds of communication for problems in the static setting has been elusive for problems as simple as undirected graph connectivity. We give an algorithm for dynamic graph connectivity in this setting with constant communication rounds and communication cost almost linear in terms of the batch size. Our techniques combine a new graph contraction technique, an independent random sample extractor from correlated samples, as well as distributed data structures supporting parallel updates and queries in batches. We also illustrate the power of dynamic algorithms in the MPC model by showing that the batched version of the adaptive connectivity problem is P-complete in the centralized setting, but sub-linear sized batches can be handled in a constant number of rounds. Due to the wide applicability of our approaches, we believe it represents a practically-motivated workaround to the current difficulties in designing more efficient massively parallel static graph algorithms.
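A centralized toy version of batch connectivity (insertions only) can be written with a union-find structure; this is merely a stand-in for the paper's distributed, deletion-capable data structures, which are considerably more involved:

```python
class DSU:
    """Union-find with path halving; a centralized stand-in for the
    distributed structures in the paper (no deletions supported)."""
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[ra] = rb

def apply_batch(dsu, insertions, queries):
    """Apply one batch of edge insertions, then answer connectivity queries."""
    for a, b in insertions:
        dsu.union(a, b)
    return [dsu.find(a) == dsu.find(b) for a, b in queries]
```

In the MPC setting the interesting part is doing this per batch in a constant number of communication rounds with sublinear space per machine, which is what the paper's contraction and sampling techniques achieve.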
An Optimized Parallel Algorithm for Longest Common Subsequence Using Openmp –... (IRJET Journal)
This document summarizes research on developing parallel algorithms to optimize solving the longest common subsequence (LCS) problem. LCS is commonly used for sequence comparison in bioinformatics. Traditional sequential dynamic programming algorithms have complexity of O(mn) for sequences of lengths m and n. The document reviews parallel algorithms developed using tools like OpenMP and GPU frameworks like CUDA to reduce computation time. It proposes the authors' own optimized parallel algorithm for multi-core CPUs using OpenMP.
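The O(mn) dynamic program that these parallel algorithms accelerate fills a table in which each anti-diagonal depends only on the previous two, so cells along an anti-diagonal can be computed concurrently; that independence is what OpenMP and CUDA versions exploit. The sequential baseline:

```python
def lcs_length(a, b):
    """Classic O(mn) LCS dynamic program.

    dp[i][j] is the LCS length of prefixes a[:i] and b[:j]; cells on the
    same anti-diagonal (constant i + j) are mutually independent, which
    parallel implementations exploit.
    """
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]
```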
This document proposes a method for obtaining a sparse polynomial model from time series data. It uses an optimal minimal nonuniform time embedding to construct a time delay kernel from which a polynomial basis is built. A sparse model is then obtained by solving a regularized least squares problem that minimizes error while penalizing model complexity. The method is applied to generate a model of the Mackey-Glass chaotic system from time series data.
This document presents an overview of the Tolerance Rough Set Model (TRSM) and its applications in web intelligence and document clustering. The TRSM uses a tolerance relation instead of an indiscernibility relation to define rough set approximations. This model is used for clustering web search results by enriching document representations with terms from tolerance classes and then applying k-means clustering. An extended TRSM incorporates a thesaurus to further expand document representations. The document concludes by describing SONCA, a search platform that applies these rough set methods along with semantic indexing to perform advanced querying over document collections.
Similar to An Uncertainty-Aware Approach to Optimal Configuration of Stream Processing Systems (20)
Learning LWF Chain Graphs: A Markov Blanket Discovery Approach (Pooyan Jamshidi)
LWF Chain graphs were introduced by Lauritzen, Wermuth, and Frydenberg as a generalization of graphical models based on undirected graphs and DAGs. From the causality point of view, in an LWF CG: Directed edges represent direct causal effects. Undirected edges represent causal effects due to interference, which occurs when an individual’s outcome is influenced by their social interaction with other population members, e.g., in situations that involve contagious agents, educational programs, or social networks. The construction of chain graph models is a challenging task that would be greatly facilitated by automation.
Markov blanket discovery has an important role in structure learning of Bayesian network. It is surprising, however, how little attention it has attracted in the context of learning LWF chain graphs. In this work, we provide a graphical characterization of Markov blankets in chain graphs. The characterization is different from the well-known one for Bayesian networks and generalizes it. We provide a novel scalable and sound algorithm for Markov blanket discovery in LWF chain graphs. We also provide a sound and scalable constraint-based framework for learning the structure of LWF CGs from faithful causally sufficient data. With the use of our algorithm, the problem of structure learning is reduced to finding an efficient algorithm for Markov blanket discovery in LWF chain graphs. This greatly simplifies the structure-learning task and makes a wide range of inference/learning problems computationally tractable because our approach exploits locality.
A Framework for Robust Control of Uncertainty in Self-Adaptive Software Conn... (Pooyan Jamshidi)
We enable reliable and dependable self‐adaptations of component connectors in unreliable environments with imperfect monitoring facilities and conflicting user opinions about adaptation policies by developing a framework which comprises: (a) mechanisms for robust model evolution, (b) a method for adaptation reasoning, and (c) tool support that allows an end‐to‐end application of the developed techniques in real‐world domains.
Machine Learning Meets Quantitative Planning: Enabling Self-Adaptation in Aut... (Pooyan Jamshidi)
Modern cyber-physical systems (e.g., robotics systems) are typically composed of physical and software components, the characteristics of which are likely to change over time. Assumptions about parts of the system made at design time may not hold at run time, especially when a system is deployed for long periods (e.g., over decades). Self-adaptation is designed to find reconfigurations of systems to handle such run-time inconsistencies. Planners can be used to find and enact optimal reconfigurations in such an evolving context. However, for systems that are highly configurable, such planning becomes intractable due to the size of the adaptation space. To overcome this challenge, in this paper we explore an approach that (a) uses machine learning to find Pareto-optimal configurations without needing to explore every configuration and (b) restricts the search space to such configurations to make planning tractable. We explore this in the context of robot missions that need to consider task timeliness and energy consumption. An independent evaluation shows that our approach results in high-quality adaptation plans in uncertain and adversarial environments.
Paper: https://arxiv.org/abs/1903.03920
Ensembles of Many Diverse Weak Defenses can be Strong: Defending Deep Neural ... (Pooyan Jamshidi)
Despite achieving state-of-the-art performance across many domains, machine learning systems are highly vulnerable to subtle adversarial perturbations. Although defense approaches have been proposed in recent years, many have been bypassed by even weak adversarial attacks. Previous studies showed that ensembles created by combining multiple weak defenses (i.e., input data transformations) are still weak. In this talk, I will show that it is indeed possible to construct effective ensembles using weak defenses to block adversarial attacks. However, to do so requires a diverse set of such weak defenses. Based on this motivation, I will present Athena, an extensible framework for building effective defenses to adversarial attacks against machine learning systems. I will talk about the effectiveness of ensemble strategies with a diverse set of many weak defenses that comprise transforming the inputs (e.g., rotation, shifting, noising, denoising, and many more) before feeding them to target deep neural network classifiers. I will also discuss the effectiveness of the ensembles with adversarial examples generated by various adversaries in different threat models. In the second half of the talk, I will explain why building defenses based on the idea of many diverse weak defenses works, when it is most effective, and what its inherent limitations and overhead are.
Transfer Learning for Performance Analysis of Machine Learning SystemsPooyan Jamshidi
This document discusses transfer learning approaches for analyzing the performance of machine learning systems. It begins with the presenter's background and credentials. It then notes that today's most popular systems are highly configurable, but understanding how configurations impact performance is challenging. The document uses a case study of a social media analytics system called SocialSensor to illustrate the opportunity of exploring different configurations to improve performance without extra resources. Testing various configurations of SocialSensor's data processing pipelines revealed that the default was suboptimal, and an optimal configuration found through experimentation significantly outperformed the default and an expert's recommendation. The document concludes that default configurations are often bad, but transfer learning approaches can help identify configurations that noticeably improve performance.
Transfer Learning for Performance Analysis of Configurable Systems:A Causal ...Pooyan Jamshidi
Modern systems (e.g., deep neural networks, big data analytics, and compilers) are highly configurable, which means they expose different performance behavior under different configurations. The fundamental challenge is that one cannot simply measure all configurations due to the sheer size of the configuration space. Transfer learning has been used to reduce the measurement efforts by transferring knowledge about performance behavior of systems across environments. Previously, research has shown that statistical models are indeed transferable across environments. In this work, we investigate identifiability and transportability of causal effects and statistical relations in highly-configurable systems. Our causal analysis agrees with previous exploratory analysis [Jamshidi17] and confirms that the causal effects of configuration options can be carried over across environments with high confidence. We expect that the ability to carry over causal relations will enable effective performance analysis of highly-configurable systems.
1) Machine learning systems are increasingly configurable, making their performance behavior complex and difficult to understand. 2) The document discusses a social media monitoring system called SocialSensor, which uses a configurable data processing pipeline. 3) By exploring different system configurations, the performance of SocialSensor could potentially be improved without requiring more resources. This demonstrates an opportunity to optimize performance through configuration tuning.
Integrated Model Discovery and Self-Adaptation of RobotsPooyan Jamshidi
Machines learn models efficiently under budget constraints to adapt to perturbations such as environmental changes or changes in internal resources.
Modern software-intensive systems are composed of components that are likely to change their behaviour over time (e.g., adding/removing components).
For software to continue to operate under such changes, the assumptions about parts of the system made at design time may not hold at runtime due to uncertainty.
Mechanisms must be put in place that can dynamically learn new models of these assumptions and use them to make decisions about missions, configurations, etc.
Transfer Learning for Performance Analysis of Highly-Configurable SoftwarePooyan Jamshidi
This document discusses using transfer learning to analyze the performance of configurable software systems. It begins by noting that today's most popular software systems are highly configurable and that their increasing configurability makes understanding performance behavior difficult. The author then describes using transfer learning to enable learning performance models more efficiently by reusing data from related source domains. This allows developers and users to better understand performance tradeoffs and find optimal configurations.
Architectural Tradeoff in Learning-Based SoftwarePooyan Jamshidi
In classical software development, developers write explicit instructions in a programming language to hardcode the explicit behavior of software systems. By writing each line of code, the programmer instructs the software to have the desirable behavior by exploring a specific point in program space.
Recently, however, software systems are adding learning components that, instead of hardcoding an explicit behavior, learn a behavior through data. The learning-intensive software systems are written in terms of models and their parameters that need to be adjusted based on data. In learning-enabled systems, we specify some constraints on the behavior of a desirable program (e.g., a data set of input–output pairs of examples) and use the computational resources to search through the program space to find a program that satisfies the constraints. In neural networks, we restrict the search to a continuous subset of the program space.
This talk provides experimental evidence of making tradeoffs for deep neural network models, using a deep neural network architecture system as a case study. Concrete experimental results are presented; also featured are additional case studies in big data (Storm, Cassandra), data analytics (configurable boosting algorithms), and robotics applications.
Production-Ready Machine Learning for the Software ArchitectPooyan Jamshidi
This document summarizes a guest lecture on building production-ready machine learning systems. The lecturer discusses how a startup called Sniffable, which created an Instagram-like app for dogs, tried to build a machine learning system called Pooch Predictor to predict how popular photos would be. However, Pooch Predictor failed repeatedly in production due to issues like not retraining models, tight coupling between components, and treating ML like a transactional rather than dynamic system. The lecturer emphasizes that ML systems must be reactive, responsive, resilient, elastic, and message-driven to succeed at scale.
Workload Patterns for Quality-driven Dynamic Cloud Service Configuration and...Pooyan Jamshidi
The document proposes using workload patterns and collaborative filtering to predict quality of service for dynamic cloud service configuration and auto-scaling. It describes collecting monitoring data on invocations of different services, normalizing the data, identifying patterns in the data, and using those patterns and collaborative filtering to predict performance of future invocations.
In this Dagstuhl talk, I presented my current research on cloud auto-scaling and component connector self-adaptation and how I employed type-2 fuzzy control to tame the uncertainty regarding knowledge specification.
Autonomic Resource Provisioning for Cloud-Based SoftwarePooyan Jamshidi
This document proposes using fuzzy logic and type-2 fuzzy sets to develop an autonomous resource provisioning system for cloud-based software. Current auto-scaling solutions have limitations including requiring deep application knowledge and performance modeling expertise from users. The proposed system would use fuzzy inference to map monitored performance data to scaling actions, eliminating the need for users to specify scaling parameters or policies. It would incorporate uncertainty into the modeling and use expert knowledge from multiple users to develop robust and adaptive provisioning behavior.
3. Motivation
[Figure: histograms of observations over average read latency (µs); (a) cass-20, (b) cass-10; best and worst configurations marked.]
Experiments on Apache Cassandra:
- 6 parameters, 1024 configurations
- Average read latency
- 10 million records (cass-10)
- 20 million records (cass-20)
4. Motivation (Apache Storm)
[Figure: latency (ms) as a function of the number of counters and the number of splitters; cubic interpolation over a finer grid.]
In our experiments we observed improvement of up to 100%.
5. Goal

(Excerpt shown on the slide:)

We assume that each configuration x ∈ X in the configuration space X = Dom(X1) × ... × Dom(Xd) is valid, i.e., the system accepts this configuration and the corresponding test results in a stable performance behavior. A parameter Xi may either indicate (i) an integer variable such as the level of parallelism, or (ii) a categorical variable such as the messaging framework, or a Boolean variable such as a timeout. We use the terms parameter and factor interchangeably; also, with the term option we refer to the possible values that can be assigned to a parameter. The response with configuration x is denoted by f(x). Throughout, we assume f(·) is latency; however, other metrics for response may be used. We here consider the problem of finding an optimal configuration x* that globally minimizes f(·) over X:

    x* = arg min_{x ∈ X} f(x)    (1)

In fact, the response function f(·) is usually unknown or only partially known, i.e., y_i = f(x_i), x_i ⊂ X. In practice, such measurements may contain noise, i.e., y_i = f(x_i) + ε. The determination of the optimal configuration is thus a black-box optimization program subject to noise [27, 33], which is harder than deterministic optimization. A solution is based on sampling that starts with a number of sampled configurations. The performance of the system associated with these initial samples can deliver a better understanding of f(·) and guide the generation of the next round of samples. If properly guided, the process of sample generation-evaluation-feedback-regeneration will converge and the optimal configuration will be found.

Rather than starting the search from scratch for every release, the knowledge learned from tuning earlier versions of the software can accelerate tuning of the current version. This idea is inspired by observations from real software engineering practice: (i) in DevOps, different versions of a system are delivered continuously; (ii) Big Data systems are developed using similar frameworks (e.g., Apache Hadoop, Spark, Kafka) and run on similar platforms (e.g., cloud clusters); and (iii) different versions of a system often share a similar business logic. To the best of our knowledge, only one study [9] explored the possibility of transfer learning in system configuration. The authors learn a Bayesian network in the tuning process of a system and reuse this model for tuning other similar systems. However, the learning is limited to the structure of the Bayesian network. In this paper, we introduce a method that reuses not only a model that has been learned previously but also the valuable raw data. Therefore, we are not limited to the accuracy of the learned model. Moreover, we do not consider Bayesian networks and instead focus on MTGPs.

2.4 Motivation
A motivating example. We now illustrate the previous points on an example. WordCount (cf. Figure 1) is a popular benchmark [12]. WordCount features a three-layer architecture that counts the number of words in the incoming stream; a Processing Element (PE) of type Spout reads the input.

Slide annotations:
- Partially known
- Measurements subject to noise
- Configuration space
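The problem statement above (Eq. 1, with noisy observations y_i = f(x_i) + ε_i) can be sketched as a tiny black-box tuning harness. The latency model and the random-sampling baseline below are hypothetical, for illustration only:

```python
import random

def measure(x, f, noise=0.0):
    """One experiment: a noisy observation y = f(x) + eps."""
    return f(x) + random.gauss(0.0, noise)

def best_of_budget(configs, f, budget, noise=0.0, seed=1):
    """Naive baseline for Eq. (1): spend the whole experimental budget on
    randomly sampled configurations and return the incumbent arg-min."""
    random.seed(seed)
    sampled = random.sample(configs, min(budget, len(configs)))
    observed = {x: measure(x, f, noise) for x in sampled}
    return min(observed, key=observed.get)
```

BO4CO keeps this experiment-then-pick-the-incumbent shape but replaces the blind sampling with a model-guided choice of each next configuration.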
6. Non-linear interactions
[Figure: WordCount latency (ms) vs. number of counters for splitters=2 and splitters=3, and the full latency surface over counters × splitters; cubic interpolation over a finer grid.]
The response surface is:
- Non-linear
- Non-convex
- Multi-modal
7. The measurements are subject to variability
[Figure: latency (ms, log scale) distributions for the wc, wc+rs, wc+sol, 2wc, and 2wc+rs+sol deployments.]
The scale of measurement variability is different in different deployments (heteroscedastic noise).

(Excerpt shown on the slide:)

Given noisy observations y at points x, we here consider the problem of finding x* that minimizes f over X with as few experiments as possible:

    x* = arg min_{x ∈ X} f(x)    (1)

f(·) is usually unknown or only partially known, i.e., y_i = f(x_i), x_i ⊂ X. In practice, such measurements may contain noise, i.e., y_i = f(x_i) + ε_i. Note that since f(·) is only partially known, finding the optimum is a blackbox optimization problem subject to noise. In fact, the problem is non-convex and multi-modal, and is NP-hard [36]. Therefore, one cannot hope to locate a global optimum, only the best possible local optimum within the experimental budget.

It shows the non-convexity, multi-modality and the substantial performance difference between different configurations.

[Fig. 3: WordCount latency, cut through Figure 2; latency (ms) vs. number of counters for splitters=2 and splitters=3.] It demonstrates that if one tries to minimize latency by acting on just one of these parameters at a time, the resulting configuration is not optimal.
9. GP for modeling blackbox response function
[Figure: 1-D GP fit showing the true function, GP mean, GP variance, observations, the selected point, and the true minimum.]

In our previous work [21], we proposed BO4CO, which exploits single-task GPs (no transfer learning) for prediction of the posterior distribution of response functions. A GP model is composed by its prior mean (µ(·) : X → R) and a covariance function (k(·,·) : X × X → R) [41]:

    y = f(x) ~ GP(µ(x), k(x, x'))    (2)

where the covariance k(x, x') defines the distance between x and x'. Let us assume S_{1:t} = {(x_{1:t}, y_{1:t}) | y_i := f(x_i)} be the collection of t experimental data (observations). In this framework, we treat f(x) as a random variable, conditioned on observations S_{1:t}, which is normally distributed with the following posterior mean and variance functions [41]:

    µ_t(x) = µ(x) + k(x)^T (K + σ²I)^(-1) (y − µ)    (3)
    σ_t²(x) = k(x, x) + σ²I − k(x)^T (K + σ²I)^(-1) k(x)    (4)

where y := y_{1:t}, k(x)^T = [k(x, x_1) k(x, x_2) ... k(x, x_t)], µ := µ(x_{1:t}), K := k(x_i, x_j) and I is the identity matrix. The shortcoming of BO4CO is that it cannot exploit the observations regarding other versions of the system and therefore cannot be applied in DevOps.

3.2 TL4CO: an extension to multi-tasks
TL4CO uses MTGPs that exploit observations from other, previous versions of the system under test. Algorithm 1 defines the internal details of TL4CO. As Figure 4 shows, TL4CO is an iterative algorithm that uses the learning from other system versions. In a high-level overview, TL4CO: (i) selects the most informative past observations (details in Section 3.3); (ii) fits a model to existing data based on kernel learning (details in Section 3.4), and (iii) selects the next configuration to measure.

Motivations:
1- mean estimates + variance
2- all computations are linear algebra
3- good estimations when few data
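Eqs. (3)-(4) are plain linear algebra. A minimal NumPy sketch for a zero-mean GP with a squared-exponential kernel (both illustrative choices; the slides do not fix a kernel) is:

```python
import numpy as np

def rbf(a, b, length=1.0):
    """Squared-exponential covariance k(x, x') for 1-D inputs."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length ** 2)

def gp_posterior(x_train, y_train, x_test, noise=0.1, length=1.0):
    """Posterior mean and variance, Eqs. (3)-(4), with prior mean mu = 0."""
    K = rbf(x_train, x_train, length) + noise ** 2 * np.eye(len(x_train))
    K_inv = np.linalg.inv(K)
    k_star = rbf(x_test, x_train, length)              # rows are k(x)^T
    mu = k_star @ K_inv @ y_train                      # Eq. (3)
    var = (np.diag(rbf(x_test, x_test, length)) + noise ** 2
           - np.einsum('ij,jk,ik->i', k_star, K_inv, k_star))  # Eq. (4)
    return mu, var
```

Near an observation the posterior mean tracks the data and the variance collapses toward the noise level; far from all observations the prior (mean 0, variance k(x, x) + σ²) takes over, which is exactly the uncertainty signal the sequential selection exploits.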
10. Sparsity of Effects
- Correlation-based feature selector
- Merit is used to select subsets that are highly correlated with the response variable
- At most 2-3 parameters were strongly interacting with each other

TABLE I: Sparsity of effects on 5 experiments where we have varied different subsets of parameters and used different testbeds. Note that these are the datasets we experimentally measured on the benchmark systems and we use them for the evaluation; more details, including the results for 6 more experiments, are in the appendix.

Topol. | Parameters | Main factors | Merit | Size | Testbed
1 wc(6D) | 1-spouts, 2-max spout, 3-spout wait, 4-splitters, 5-counters, 6-netty min wait | {1, 2, 5} | 0.787 | 2880 | C1
2 sol(6D) | 1-spouts, 2-max spout, 3-top level, 4-netty min wait, 5-message size, 6-bolts | {1, 2, 3} | 0.447 | 2866 | C2
3 rs(6D) | 1-spouts, 2-max spout, 3-sorters, 4-emit freq, 5-chunk size, 6-message size | {3} | 0.385 | 3840 | C3
4 wc(3D) | 1-max spout, 2-splitters, 3-counters | {1, 2} | 0.480 | 756 | C4
5 wc(5D) | 1-spouts, 2-splitters, 3-counters, 4-buffer-size, 5-heap | {1} | 0.851 | 1080 | C5

Experiments on:
1. C1: OpenNebula (X)
2. C2: Amazon EC2 (Y)
3. C3: OpenNebula (3X)
4. C4: Amazon EC2 (2Y)
5. C5: Microsoft Azure (X)
11.
[Fig. 5: An example of a 1-D GP model: GPs provide mean estimates as well as the uncertainty in estimations, i.e., variance. Legend: true function, GP surrogate mean estimate, observations; points x1-x4 marked.]
[Architecture diagram: the Configuration Optimisation Tool reads configuration parameters and their values, maintains the GP model, and stores results in a performance repository. An Experimental Suite (Deployment Service, Data Broker over Kafka, Tester, Workload Generator, Monitoring, Data Preparation) runs each experiment on the Testbed against the System Under Test through a Technology Interface (Storm, Cassandra, Spark); inputs include experiment time and polling interval.]
Algorithm 1: BO4CO
Input: Configuration space X, maximum budget Nmax, response function f, kernel function Kθ, hyper-parameters θ, design sample size n, learning cycle Nl
Output: Optimal configuration x* and learned model M
1:  choose an initial sparse design (lhd) to find an initial design sample D = {x1, . . . , xn}
2:  obtain performance measurements of the initial design: yi ← f(xi) + εi, ∀xi ∈ D
3:  S1:n ← {(xi, yi)}, i = 1..n; t ← n + 1
4:  M(x|S1:n, θ) ← fit a GP model to the design          ▷ Eq. (3)
5:  while t ≤ Nmax do
6:      if (t mod Nl = 0): θ ← learn the kernel hyper-parameters by maximizing the likelihood
7:      find the next configuration xt by optimizing the selection criterion over the estimated response surface given the data: xt ← argmax_x u(x|M, S1:t−1)          ▷ Eq. (9)
8:      obtain performance for the new configuration xt: yt ← f(xt) + εt
9:      augment the configuration data: S1:t = {S1:t−1, (xt, yt)}
10:     M(x|S1:t, θ) ← re-fit a new GP model          ▷ Eq. (7)
11:     t ← t + 1
12: end while
13: (x*, y*) = min S1:Nmax
14: M(x)
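As a concrete illustration, the loop of Algorithm 1 can be sketched in Python. This is a minimal sketch, not the paper's implementation: it assumes a 1-D configuration space, a zero-mean GP with a fixed RBF kernel (so step 6, hyper-parameter re-learning, is omitted), random initial sampling in place of the Latin hypercube design (lhd), and an LCB-style selection criterion; all function names and the toy response surface are illustrative.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=0.3):
    """Squared-exponential covariance k(x, x') on a 1-D domain."""
    d = A[:, None] - B[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(X, y, Xs, noise=1e-3):
    """Posterior mean/variance via Cholesky, zero prior mean assumed."""
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = rbf_kernel(X, Xs)                       # k(x) for every candidate
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = rbf_kernel(Xs, Xs).diagonal() + noise - np.sum(v ** 2, axis=0)
    return mu, var

def bo4co(f, candidates, n_init=5, n_max=25, kappa=2.0, seed=0):
    """Sketch of Algorithm 1: initial design, then sequential selection."""
    rng = np.random.default_rng(seed)
    X = rng.choice(candidates, size=n_init, replace=False)  # stand-in for lhd
    y = np.array([f(x) for x in X])
    for _ in range(n_init, n_max):
        mu, var = gp_posterior(X, y, candidates)
        # LCB: prefer a low predicted mean, but reward uncertainty
        lcb = mu - kappa * np.sqrt(np.maximum(var, 0))
        x_next = candidates[np.argmin(lcb)]
        X = np.append(X, x_next)
        y = np.append(y, f(x_next))
    best = np.argmin(y)
    return X[best], y[best]

# Toy response surface with a global minimum near x = 0.4
f = lambda x: (x - 0.4) ** 2 + 0.01 * np.sin(30 * x)
grid = np.linspace(-1.5, 1.5, 301)
x_star, y_star = bo4co(f, grid)
```

On this toy surface the loop spends part of the budget exploring uncertain regions and then concentrates measurements around the minimum near x = 0.4.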
[Figure: (a) Design of Experiment: measurements taken over the configuration space (including an exhaustive experiment) yield an empirical model. (b) Sequential Design: a selection criterion evaluated over the current model picks each next experiment.]
12. [Figure: one BO4CO iteration on a 1-D configuration domain (response value vs. configuration domain): the true response function and the current GP fit; the selection criterion is evaluated and a new point is selected; measuring it yields a new GP fit.]
Acquisition function: After obtaining the measurements, BO4CO then fits a GP model to update its belief about the underlying response function (Algorithm 1). The while loop iteratively updates this belief until the budget runs out: the observations S1:t = {(xi, yi)}, i = 1..t, where yi = f(xi) + εi, together with a prior distribution Pr(f) and the likelihood Pr(S1:t|f), form the posterior Pr(f|S1:t) ∝ Pr(S1:t|f) Pr(f). GPs are distributions over functions [37], specified by their mean and covariance (see Section III-E1):

    y = f(x) ∼ GP(μ(x), k(x, x′)),    (3)

where the posterior mean and variance are

    μt(x) = μ(x) + k(x)ᵀ(K + σ²I)⁻¹(y − μ),    (7)
    σ²t(x) = k(x, x) + σ² − k(x)ᵀ(K + σ²I)⁻¹k(x).    (8)
These posterior functions are used to select the next point xt+1
as detailed in Section III-C.
C. Configuration selection criteria
The selection criterion is defined as a function u : X → R that selects the xt+1 ∈ X at which f(·) should be evaluated next (step 7):

    xt+1 = argmax_{x ∈ X} u(x|M, S1:t)    (9)
13. Storm Architecture
[Diagram: logical view: Spout A → Bolt A → Bolt B; physical view: Workers A, B and C exchange tuples through in/out queues, pipes and sockets.]

Word Count Architecture (CPU intensive)
[Diagram: a file is streamed to a Kafka topic; the Kafka Spout emits sentences to a Splitter Bolt, which emits words to a Counter Bolt, producing counts such as [paintings, 3], [poems, 60], [letter, 75].]

Rolling Sort Architecture (memory intensive)
[Diagram: a Twitter stream is pushed to a Kafka topic; the Kafka Spout emits tweets to a RollingCount Bolt, which emits (hashtag, count) pairs to an Intermediate Ranking Bolt and a Ranking Bolt that outputs the trending topics.]

Applications:
• Fraud detection
• Trending topics
14. Experimental results
[Figure: absolute error vs. iteration (log scale, 100 iterations) for BO4CO, SA, GA, HILL, PS and Drift on (a) WordCount(3D) and (b) WordCount(5D).]
- 30 runs, report average performance.
- Yes, we did full factorial measurements, so we know where the global minimum is.
18. Prediction accuracy over time
[Figure: prediction error vs. iteration (log scale, 80 iterations) for BO4CO, polyfit1, M5Tree, RegressionTree, M5Rules, LWP(GAU) and PRIM.]
19. Exploitation vs exploration
[Figure: left: absolute error vs. iteration (log scale, 100 iterations) for BO4CO(adaptive), BO4CO(μ:=0), BO4CO(κ:=0.1), BO4CO(κ:=1), BO4CO(κ:=6) and BO4CO(κ:=8); right: the adaptive κ schedule over 10,000 iterations for ϵ=1, ϵ=0.1 and ϵ=0.01.]
The selection criterion determines the next configuration to measure. Intuitively, we want to select the configuration with the minimum response. This is done using a utility function u : X → R that determines which xt+1 ∈ X should be evaluated next:

    xt+1 = argmax_{x ∈ X} u(x|M, S¹1:t)    (11)

The selection criterion depends on the MTGP model M through its predictive mean μt(xt) and variance σ²t(xt) conditioned on observations S¹1:t. TL4CO uses the Lower Confidence Bound (LCB) [24]:

    uLCB(x|M, S¹1:t) = argmin_{x ∈ X} μt(x) − κσt(x),    (12)

where κ is an exploitation-exploration parameter. For instance, if we require a near-optimal configuration we set a low value of κ to take the most out of the predictive mean. However, if we are looking for a globally optimal one, we can set a high value in order to skip local minima. Furthermore, κ can be adapted over time [22] to perform more exploration. Figure 6 shows that in TL4CO, κ can start with a relatively lower value at the early iterations compared to BO4CO, since the former provides a better estimate of the mean and therefore contains more information at the early stages.

TL4CO output. Once the Nmax different configurations of the system under test are measured, the TL4CO algorithm terminates. Finally, TL4CO produces the outputs, including the optimal configuration (step 14 in Algorithm 1) as well as the learned model.
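The κ trade-off in the LCB criterion can be seen on a tiny numeric example; the posterior values below are made up for illustration, and only the LCB formula itself comes from the text.

```python
import numpy as np

def lcb(mu, sigma, kappa):
    """Lower Confidence Bound of Eq. (12): mu(x) - kappa * sigma(x)."""
    return mu - kappa * sigma

# Made-up posterior over five candidate configurations
mu    = np.array([1.00, 0.80, 0.90, 1.20, 1.10])   # predictive mean
sigma = np.array([0.05, 0.05, 0.10, 0.60, 0.20])   # predictive std. dev.

exploit = int(np.argmin(lcb(mu, sigma, kappa=0.1)))  # trusts the mean
explore = int(np.argmin(lcb(mu, sigma, kappa=6.0)))  # chases uncertainty
```

With κ = 0.1 the criterion picks the candidate with the best predicted mean (index 1); with κ = 6 it jumps to the most uncertain candidate (index 3), which is how a large κ skips local minima.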
20. Runtime overhead
[Figure: elapsed time (s) vs. iteration (100 iterations) for WordCount (3D), WordCount (5D), WordCount (6D), SOL (6D) and RollingSort (6D).]
- The computation time for larger datasets is higher than for those with less data.
- The computation time increases over time, since the matrix size for the Cholesky inversion gets larger.
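The Cholesky remark can be checked empirically; the sketch below (sizes arbitrary, timings machine-dependent) factorizes symmetric positive-definite matrices of growing size, mimicking the covariance matrix growing with the number of measurements t.

```python
import time
import numpy as np

def chol_seconds(t, rng):
    """Seconds for one Cholesky factorization of a t x t SPD matrix."""
    A = rng.standard_normal((t, t))
    K = A @ A.T + t * np.eye(t)          # symmetric positive definite
    start = time.perf_counter()
    np.linalg.cholesky(K)
    return time.perf_counter() - start

rng = np.random.default_rng(0)
chol_seconds(100, rng)                   # warm-up (BLAS initialization)
t_small = chol_seconds(200, rng)         # few observations: cheap
t_large = chol_seconds(1600, rng)        # 8x the observations: ~512x the flops
```

Since the factorization costs O(t³), each BO4CO iteration gets slightly slower than the previous one, consistent with the upward trend in the elapsed-time plot.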
Figure 5 illustrates this approach using a 1-dimensional response. The unknown true response is drawn in blue, the mean is shown in yellow, and the 95% confidence interval at each point is the shaded red area. The stars indicate experimental measurements (or observations, interchangeably). Some points x ∈ X have a large confidence interval due to lack of observations in their neighborhood, while others have a narrow confidence. The main motivation behind the choice of Bayesian Optimization here is that it offers a framework in which reasoning can be based not only on mean estimates but also on the variance, providing more informative decision making. The other reason is that all the computations in this framework are based on tractable linear algebra.

In our previous work [21], we proposed BO4CO, which exploits single-task GPs (no transfer learning) for prediction of the posterior distribution of response functions. A GP model is composed of its prior mean (μ(·) : X → R) and a covariance function (k(·, ·) : X × X → R) [41]:

    y = f(x) ∼ GP(μ(x), k(x, x′)),    (2)

where the covariance k(x, x′) defines the distance between x and x′. Let us assume S1:t = {(x1:t, y1:t) | yi := f(xi)} is the collection of t experimental data points (observations). In this framework, we treat f(x) as a random variable, conditioned on observations S1:t, which is normally distributed with the following posterior mean and variance functions [41]:

    μt(x) = μ(x) + k(x)ᵀ(K + σ²I)⁻¹(y − μ),    (3)
    σ²t(x) = k(x, x) + σ² − k(x)ᵀ(K + σ²I)⁻¹k(x),    (4)

where y := y1:t, k(x)ᵀ = [k(x, x1) k(x, x2) . . . k(x, xt)],
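Eqs. (3)-(4) are exactly the "tractable linear algebra" mentioned above. A short NumPy sketch (with an assumed zero prior mean, an RBF kernel, and made-up observations) computes the posterior mean and variance at a new configuration via a Cholesky factorization of K + σ²I rather than an explicit inverse:

```python
import numpy as np

def rbf(a, b, ell=0.5):
    """Squared-exponential covariance k(x, x') on a 1-D domain."""
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ell) ** 2)

# Hypothetical observations S_1:t on a 1-D configuration domain
x_obs = np.array([-1.0, -0.2, 0.6, 1.2])
y_obs = np.array([0.8, -0.3, -0.6, 0.4])
noise = 1e-2                              # sigma^2 in Eqs. (3)-(4)
x_new = np.array([0.0])                   # configuration to predict

K = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))   # K + sigma^2 I
k_x = rbf(x_obs, x_new)                              # k(x), a t x 1 vector
L = np.linalg.cholesky(K)                 # grows as more points are measured
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))

mu_post = (k_x.T @ alpha).item()          # Eq. (3), zero prior mean assumed
v = np.linalg.solve(L, k_x)
var_post = (rbf(x_new, x_new) + noise - v.T @ v).item()   # Eq. (4)
```

The two triangular solves are algebraically equivalent to applying (K + σ²I)⁻¹ in Eqs. (3)-(4), but are cheaper and numerically more stable.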
21. Key takeaways
- A principled way of:
  - locating optimal configurations
  - carefully choosing where to sample
  - sequentially reducing uncertainty in the response surface
  - in order to reduce the number of measurements
- GPs appeared to be more accurate than other machine learning regression models in our experiments.
- Applications to SPSs, Batch and NoSQL systems (see the BO4CO GitHub page for more details).
22. Acknowledgement: BO4CO is a part of the DevOps pipeline in the H2020 DICE project.
[Diagram: the DICE IDE, with a DICE Profile and plugins (Sim, Ver, Opt), supports a methodology spanning DPIM, DTSM and DDSM/TOSCA models for Data-Intensive Applications (DIA); tools for Deploy, Config, Test, Continuous Integration, Fault Injection, Monitoring, Anomaly detection, Trace analysis and Iterative Enhancement (WP1-WP5, with WP6 demonstrators) target Big Data technologies on private/public clouds.]
Code and data: https://github.com/dice-project/DICE-Configuration-BO4CO