The document summarizes computational aspects of vehicle routing problems. It discusses time complexity and space complexity, and how they are measured as functions of problem size. It provides examples of calculating complexity for different algorithms. It also discusses common data structures for representing routes, including array lists, doubly linked lists, and their pros and cons for different operations. The document outlines Java code examples for comparing route representation using these data structures.
This document discusses speaker diarization, which is the process of segmenting an audio stream into homogeneous segments according to speaker identity. It covers feature extraction methods like MFCCs, segmentation using Bayesian Information Criteria to compare Gaussian mixture models, and clustering algorithms like k-means and hierarchical agglomerative clustering. Dendrogram visualizations are used to identify natural speaker clusters. The overall goal is to partition audio recordings of discussions or debates into homogeneous segments to attribute speech segments to individual speakers.
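To make the clustering stage concrete, here is a minimal sketch of average-linkage agglomerative clustering over hypothetical one-dimensional "speaker embeddings". The real pipeline described above clusters MFCC-based GMM segments scored with BIC, so the feature values below are illustrative stand-ins only.

```python
# Naive average-linkage agglomerative clustering on toy 1-D features.
# Illustrative only -- the document's pipeline clusters GMMs of MFCC
# features compared with the Bayesian Information Criterion.

def agglomerative(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # average-linkage distance between clusters i and j
                d = sum(abs(a - b) for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Two obvious groups around 0.0 and 5.0 -> two "speakers"
segments = [0.1, 0.2, 0.0, 5.1, 5.0, 4.9]
print(sorted(len(c) for c in agglomerative(segments, 2)))  # [3, 3]
```

In a diarization system the merge loop would stop where the dendrogram shows a large distance gap rather than at a fixed cluster count.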
Hyperparameter optimization with approximate gradient (Fabian Pedregosa)
This document discusses hyperparameter optimization using approximate gradients. It introduces the problem of optimizing hyperparameters along with model parameters. While model parameters can be estimated from data, hyperparameters require methods like cross-validation. The document proposes using approximate gradients to optimize hyperparameters more efficiently than costly methods like grid search. It derives the gradient of the objective with respect to hyperparameters and presents an algorithm called HOAG that approximates this gradient using inexact solutions. The document analyzes HOAG's convergence and provides experimental results comparing it to other hyperparameter optimization methods.
Greedy abstract feature generation (GreedyAFG) is an approach for automatically generating feature abstractions to be used in Abstract Zobrist Hashing (AZHDA*) for parallel best-first search. However, GreedyAFG results in high communication overhead because any change in the value of an abstract feature from a parent node to a child node will change the hash value and require communication. To reduce this overhead, a new approach is needed that groups features so that changes within a group do not change the hash value.
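The mechanism can be sketched with a toy abstract Zobrist hash. The position-to-cell abstraction below is hypothetical (GreedyAFG would compute its own grouping); the point is that a move that stays inside one abstract cell leaves the hash unchanged, so the node stays on the same process and no communication is needed.

```python
import random

# Toy Abstract Zobrist Hashing: the hash XORs random bitstrings assigned to
# *abstract* features (groups of concrete positions), not to the positions
# themselves. The 2-cell abstraction below is a hypothetical example; a real
# system would derive it from the search problem.

random.seed(0)

abstract = {pos: pos // 4 for pos in range(8)}   # positions 0-3 -> cell 0, 4-7 -> cell 1
zobrist = {cell: random.getrandbits(64) for cell in set(abstract.values())}

def azh(state):
    """XOR the Zobrist bitstring of each tile's abstract cell (simplified:
    the hash depends only on the occupied abstract cells)."""
    h = 0
    for pos in state:
        h ^= zobrist[abstract[pos]]
    return h

s1 = (0, 5)        # tiles at positions 0 and 5
s2 = (1, 5)        # tile moved 0 -> 1: same abstract cell, hash unchanged
s3 = (4, 5)        # tile moved 0 -> 4: cell changes, hash changes
print(azh(s1) == azh(s2), azh(s1) == azh(s3))  # True False
```

The quality of the grouping decides how many parent-to-child transitions cross cell boundaries, which is exactly the overhead GreedyAFG fails to minimize.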
EuroPython 2017 - PyData - Deep Learning your Broadband Network @ HOME (Hongjoo Lee)
A 45-minute talk about collecting home network performance measures, analyzing and forecasting time series data, and building an anomaly detection system.
In this talk, we will go through the whole process of data mining and knowledge discovery. First we write a script to run a speed test periodically and log the metrics. Then we parse the log data, convert it into a time series, and visualize the data for a certain period.
Next we conduct some data analysis: finding trends, forecasting, and detecting anomalous data. Several statistical and deep learning techniques will be used for the analysis: ARIMA (Autoregressive Integrated Moving Average) and LSTM (Long Short-Term Memory).
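As a minimal stand-in for the anomaly-detection step (the talk itself uses ARIMA and LSTM models), a rolling z-score rule over the logged throughput already flags outage-like dips; the window and threshold values here are illustrative.

```python
import statistics

# Simple rolling-window anomaly detector for a speed-test time series.
# A baseline sketch only -- not the ARIMA/LSTM approach from the talk.

def anomalies(series, window=5, threshold=3.0):
    """Flag indices whose value deviates more than `threshold` standard
    deviations from the mean of the preceding `window` observations."""
    flagged = []
    for i in range(window, len(series)):
        past = series[i - window:i]
        mu = statistics.mean(past)
        sigma = statistics.stdev(past)
        if sigma > 0 and abs(series[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged

# Steady ~100 Mbps with one outage-like dip at index 8
mbps = [100, 101, 99, 100, 102, 101, 100, 99, 20, 100, 101]
print(anomalies(mbps))  # [8]
```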
Intermediate deep learning - AlexNet and VggNet (Basic of DCNN: AlexNet and VggNet) (Hansol Kang)
The document summarizes the basics of Deep Convolutional Neural Networks (DCNNs) including AlexNet and VGGNet. It discusses how AlexNet introduced improvements like ReLU activation and dropout to address overfitting issues. It then focuses on the VGGNet, noting that it achieved good performance through increasing depth using small 3x3 filters and adding convolutional layers. The document shares details of VGGNet configurations ranging from 11 to 19 weight layers and their performance on image classification tasks.
Paper Study: A learning based iterative method for solving vehicle routing (ChenYiHuang5)
The document proposes a learning-based iterative method called "Learn to Improve" (L2I) to solve vehicle routing problems. L2I starts with an initial solution and refines it iteratively using improvement and perturbation operators. Improvement operators try to improve the solution while perturbation operators destroy and reconstruct the solution to generate a new starting point. The method trains reinforcement learning policies to select the optimal operator at each step. Experiments show L2I outperforms classical operations research algorithms and achieves state-of-the-art results on capacitated vehicle routing problem instances.
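A typical improvement operator of this kind is the classic 2-opt move, which reverses a route segment whenever the reversal shortens the tour. The sketch below applies plain 2-opt to a single route; L2I's contribution, the learned policy that chooses among such operators, is not reproduced here.

```python
import math

# Classic 2-opt local improvement on one route -- the sort of move an
# "improvement operator" can apply. Plain heuristic sketch, not L2I itself.

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def route_length(route, pts):
    # closed tour: include the edge back from the last stop to the first
    return sum(dist(pts[route[i]], pts[route[(i + 1) % len(route)]])
               for i in range(len(route)))

def two_opt(route, pts):
    """Repeatedly reverse segments while any reversal shortens the tour."""
    improved = True
    while improved:
        improved = False
        for i in range(1, len(route) - 1):
            for j in range(i + 1, len(route)):
                cand = route[:i] + route[i:j][::-1] + route[j:]
                if route_length(cand, pts) < route_length(route, pts) - 1e-9:
                    route, improved = cand, True
    return route

pts = [(0, 0), (0, 1), (1, 1), (1, 0)]       # unit square
bad = [0, 2, 1, 3]                           # self-crossing tour
print(route_length(two_opt(bad, pts), pts))  # 4.0 (the square's perimeter)
```

A perturbation operator would do the opposite: deliberately destroy part of a locally optimal route to give moves like this a fresh starting point.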
This document presents a framework for verifying the safety of classification decisions made by deep neural networks. It defines safety as the network producing the same output classification for an input and any perturbations of that input within a bounded region. The framework uses satisfiability modulo theories (SMT) to formally verify safety by attempting to find an adversarial perturbation that causes misclassification. It has been tested on several image classification networks and datasets. The framework provides a method to automatically verify safety properties of deep neural networks.
This document presents methods for computing information flow and quantifying information leakage in non-probabilistic programs using symbolic model checking. It discusses using binary decision diagrams (BDDs) and algebraic decision diagrams (ADDs) to represent program states and calculate fixed points. Algorithms are provided for symbolically computing min-entropy and Shannon entropy leakage by constructing ADDs representing the program summary and sets of possible outputs. The methods were implemented in a tool called Moped-QLeak and evaluated on example programs. Future work includes supporting recursive programs and using other symbolic verification approaches.
Digit recognizer by convolutional neural network (Ding Li)
A convolutional neural network is used to recognize handwritten digits from images. The CNN uses convolutional and max pooling layers to extract local features from the images. These local features are then fed into fully connected layers to combine them into global features used to predict the digit (0-9) in each image with a softmax output layer. The model is trained on 60,000 images and achieves 99.67% accuracy on the test set after 30 training epochs. While powerful, it is unclear if humans can fully understand the "mind" and logic of artificial neural networks.
Paper Study: Melding the data decision pipeline (ChenYiHuang5)
Melding the data decision pipeline: Decision-Focused Learning for Combinatorial Optimization from AAAI2019.
I derive the math myself and, applying the same derivation procedure, arrive at the same result as the two CMU papers mentioned [Donti et al. 2017; Amos et al. 2017].
Sergei Vassilvitskii, Research Scientist, Google at MLconf NYC - 4/15/16 (MLconf)
The document discusses new techniques for improving the k-means clustering algorithm. It begins by describing the standard k-means algorithm and Lloyd's method. It then discusses issues with random initialization for k-means. It proposes using furthest point initialization (k-means++) as an improvement. The document also discusses parallelizing k-means initialization (k-means||) and using nearest neighbor data structures to speed up assigning points to clusters, which allows k-means to scale to many clusters. Experimental results show these techniques provide faster and higher quality clustering compared to standard k-means.
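The seeding idea can be sketched in a few lines. k-means++ proper samples each new center with probability proportional to its squared distance from the nearest chosen center; the deterministic furthest-point variant below takes the argmax instead, which keeps the demo reproducible.

```python
# Deterministic furthest-point seeding, a sketch of the initialization idea
# behind k-means++ (which randomizes the same rule by D^2-weighted sampling).

def furthest_point_init(points, k):
    centers = [points[0]]              # arbitrary first seed
    while len(centers) < k:
        # the point furthest (in squared distance) from its nearest center
        nxt = max(points, key=lambda p: min((p - c) ** 2 for c in centers))
        centers.append(nxt)
    return centers

# Two tight 1-D blobs: the second seed lands in the opposite blob,
# which random initialization frequently fails to do
pts = [0.0, 0.1, 0.2, 10.0, 10.1, 10.2]
print(furthest_point_init(pts, 2))  # [0.0, 10.2]
```

Spreading the seeds out like this is what gives k-means++ its approximation guarantee over Lloyd's method with random starts.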
Deep Convolutional GANs - meaning of latent space (Hansol Kang)
DCGAN does not simply apply conv nets to GANs; it also finds meaning in the latent space.
A review of the DCGAN paper and a PyTorch-based implementation.
A review of the issues raised in the VAE seminar.
my github : https://github.com/messy-snail/GAN_PyTorch
[References]
https://github.com/znxlwm/pytorch-MNIST-CelebA-GAN-DCGAN
https://github.com/taeoh-kim/Pytorch_DCGAN
Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." arXiv preprint arXiv:1511.06434 (2015).
Options and trade-offs for parallelism and concurrency in Modern C++ (Satalia)
While threads have been a first-class citizen in C++ since C++11, they are not always the best abstraction for expressing parallelism when the objective is to speed up computations. OpenMP is a parallelism API for C/C++ and Fortran that has been around for a long time. Intel's Threading Building Blocks (TBB) is only a little more than 10 years old, but it is very mature and built specifically for C++.
Mats will introduce OpenMP and TBB and their use in modern C++ and provide some best practices for them as well as try to predict what the C++ standard has in store for us when it comes to parallelism in the future.
This document provides an overview of VAE-type deep generative models, especially RNNs combined with VAEs. It begins with notations and abbreviations used. The agenda then covers the mathematical formulation of generative models, the Variational Autoencoder (VAE), variants of VAE that combine it with RNNs (VRAE, VRNN, DRAW), a Chainer implementation of Convolutional DRAW, other related models (Inverse DRAW, VAE+GAN), and concludes with challenges of VAE-like generative models.
Generative Adversarial Networks (GANs) are a class of machine learning frameworks where two neural networks contest with each other in a game. A generator network generates new data instances, while a discriminator network evaluates them for authenticity, classifying them as real or generated. This adversarial process allows the generator to improve over time and generate highly realistic samples that can pass for real data. The document provides an overview of GANs and their variants, including DCGAN, InfoGAN, EBGAN, and ACGAN models. It also discusses techniques for training more stable GANs and escaping issues like mode collapse.
(Chapter 8) A Concise and Practical Introduction to Programming Algorithms in Java (Frank Nielsen)
These are the slides accompanying the textbook:
A Concise and Practical Introduction to Programming Algorithms in Java
by Frank Nielsen
Published by Springer-Verlag (2009), Undergraduate textbook in computer science (UTiCS series)
ISBN: 978-1-84882-338-9
http://www.lix.polytechnique.fr/~nielsen/JavaProgramming/
http://link.springer.com/book/10.1007%2F978-1-84882-339-6
- The document presents a new formulation of the attention mechanism in Transformers using kernels. This allows defining attention over a larger space and integrating positional embeddings.
- Experiments on neural machine translation and sequence prediction tasks study different kernel forms like RBF and their combination with positional encodings.
- Results show the RBF kernel performs best, with performance competitive with state-of-the-art models at lower computational cost.
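As a sketch of the core idea only (RBF-kernel attention weights in place of the scaled dot product), independent of the paper's full formulation with positional embeddings, and using toy 2-D vectors:

```python
import math

# Attention weights from an RBF kernel: w_ij proportional to
# exp(-||q_i - k_j||^2 / (2*sigma^2)), normalized over the keys.
# Toy sketch; positional embeddings and the paper's exact form are omitted.

def rbf_attention(query, keys, values, sigma=1.0):
    scores = [math.exp(-sum((q - k) ** 2 for q, k in zip(query, key))
                       / (2 * sigma ** 2)) for key in keys]
    z = sum(scores)
    weights = [s / z for s in scores]
    # attended output: weighted average of the value vectors
    return [sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))]

keys = [(0.0, 0.0), (5.0, 5.0)]
values = [(1.0, 0.0), (0.0, 1.0)]
out = rbf_attention((0.1, 0.0), keys, values)
print([round(x, 3) for x in out])  # [1.0, 0.0]: the nearby key dominates
```

Note that the usual softmax over dot products is itself an (unnormalized) exponential kernel, which is what makes this substitution natural.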
Accelerating Random Forests in Scikit-Learn (Gilles Louppe)
Random Forests are without contest one of the most robust, accurate and versatile tools for solving machine learning tasks. Implementing this algorithm properly and efficiently remains however a challenging task involving issues that are easily overlooked if not considered with care. In this talk, we present the Random Forests implementation developed within the Scikit-Learn machine learning library. In particular, we describe the iterative team efforts that led us to gradually improve our codebase and eventually make Scikit-Learn's Random Forests one of the most efficient implementations in the scientific ecosystem, across all libraries and programming languages. Algorithmic and technical optimizations that have made this possible include:
- An efficient formulation of the decision tree algorithm, tailored for Random Forests;
- Cythonization of the tree induction algorithm;
- CPU cache optimizations, through low-level organization of data into contiguous memory blocks;
- Efficient multi-threading through GIL-free routines;
- A dedicated sorting procedure, taking into account the properties of data;
- Shared pre-computations whenever critical.
Overall, we believe that lessons learned from this case study extend to a broad range of scientific applications and may be of interest to anybody doing data analysis in Python.
This document discusses algorithm complexity and data structure efficiency. It covers topics like time and memory complexity, asymptotic notation, fundamental data structures such as arrays, lists, trees, and hash tables, and how to choose the proper data structure. Computational complexity is important for algorithm design and efficient programming. The document provides examples of analyzing the complexity of different algorithms.
Our techniques provide fast wavelet tree construction in practice based on recent theoretical work. Experiments on real datasets show our methods using the PEXT and PSHUFB CPU instructions outperform previous approaches. For wavelet trees, our methods are 1.9x faster than naive construction on average and competitive with state-of-the-art. For wavelet matrices, we achieve speedups of 1.1-1.9x over the state-of-the-art. This work provides the first practical implementation of the fastest known wavelet tree construction algorithms.
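For readers unfamiliar with the data structure itself, a naive recursive wavelet tree with rank queries looks like the sketch below; the constructions benchmarked above accelerate exactly this building step with bit-parallel instructions such as PEXT and PSHUFB.

```python
# A tiny recursive wavelet tree with rank queries. Naive construction only,
# to make the structure concrete -- not the bit-parallel algorithms the
# experiments evaluate.

class WaveletTree:
    def __init__(self, seq, alphabet=None):
        self.alphabet = sorted(set(seq)) if alphabet is None else alphabet
        if len(self.alphabet) <= 1:
            self.bits = None               # leaf: a single symbol
            return
        mid = len(self.alphabet) // 2
        self.left_set = set(self.alphabet[:mid])
        # one bit per symbol: 0 = lower half of the alphabet, 1 = upper half
        self.bits = [0 if c in self.left_set else 1 for c in seq]
        self.left = WaveletTree([c for c in seq if c in self.left_set],
                                self.alphabet[:mid])
        self.right = WaveletTree([c for c in seq if c not in self.left_set],
                                 self.alphabet[mid:])

    def rank(self, c, i):
        """Number of occurrences of symbol c in seq[:i]."""
        if self.bits is None:
            return i
        if c in self.left_set:
            return self.left.rank(c, self.bits[:i].count(0))
        return self.right.rank(c, self.bits[:i].count(1))

wt = WaveletTree("abracadabra")
print(wt.rank("a", 11), wt.rank("b", 5), wt.rank("r", 11))  # 5 1 2
```

A production implementation would store the bit vectors with o(n)-space rank support instead of counting prefixes on every query.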
https://telecombcn-dl.github.io/2017-dlcv/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or image captioning.
Linear search examines each element of a list sequentially, one by one, and checks if it is the target value. It has a time complexity of O(n) as it requires searching through each element in the worst case. While simple to implement, linear search is inefficient for large lists as other algorithms like binary search require fewer comparisons.
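The comparison can be made concrete by putting the two searches side by side: linear search scans every element in the worst case (O(n)), while binary search halves the remaining range each step (O(log n)) but requires the list to be sorted.

```python
# Linear search versus binary search on a sorted list.

def linear_search(items, target):
    for i, x in enumerate(items):      # O(n): may examine every element
        if x == target:
            return i
    return -1

def binary_search(items, target):      # items must be sorted
    lo, hi = 0, len(items) - 1
    while lo <= hi:                    # O(log n): halve the range each step
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

data = [2, 3, 5, 7, 11, 13, 17]
print(linear_search(data, 11), binary_search(data, 11))  # 4 4
print(linear_search(data, 4), binary_search(data, 4))    # -1 -1
```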
Crash course on data streaming (with examples using Apache Flink) (Vincenzo Gulisano)
These are the slides I used for a crash course (4 hours) on data streaming. They contain both theory/research aspects and examples based on Apache Flink (DataStream API).
On the Necessity and Inapplicability of Python (Takeshi Akutsu)
This document discusses the use of Python for numerical software development. It begins by introducing the author and their background in computational mechanics. It then discusses PyHUG, the Python user group in Taiwan, and PyCon Taiwan 2020.
The document notes that while Python is slow for number crunching, NumPy can provide reasonably fast performance. It explains that a hybrid architecture is commonly used, with the core computing kernel written in C++ for speed and Python used for the user-level API to describe complex problems more easily. An example of solving the Laplace equation is provided to demonstrate the speed differences between pure Python, NumPy, and C++ implementations.
The document advocates for training computer scientists in this hybrid approach through a numerical software development course.
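The Laplace example can be sketched in its pure-Python form; this is the slow baseline whose inner loop nest the hybrid approach moves into C++ (the grid size and boundary values here are illustrative, not the talk's actual benchmark).

```python
# Pure-Python Jacobi iteration for the 2-D Laplace equation -- the slow
# baseline in the pure Python vs. NumPy vs. C++ comparison. The doubly
# nested loop below is exactly what a computing kernel rewrites in C++.

def jacobi_step(grid):
    """One Jacobi update: each interior point becomes the mean of its
    four neighbours; boundary values stay fixed."""
    n = len(grid)
    new = [row[:] for row in grid]
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return new

# 4x4 grid, top edge held at 1.0, everything else 0.0
g = [[1.0] * 4] + [[0.0] * 4 for _ in range(3)]
for _ in range(200):
    g = jacobi_step(g)
print(round(g[1][1], 4), round(g[2][2], 4))  # 0.375 0.125
```

NumPy collapses the loop nest into one array expression (`0.25 * (u[:-2,1:-1] + u[2:,1:-1] + u[1:-1,:-2] + u[1:-1,2:])`), which is where most of its speedup over this version comes from.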
On the necessity and inapplicability of Python (Yung-Yu Chen)
Python is a popular scripting language adopted by numerical software vendors to help users solve challenging numerical problems. It provides an easy-to-use interface and offers decent speed through array operations, but it is not suitable for engineering the low-level constructs. To make good numerical software, developers need to be familiar with C++ and computer architecture. The gap in understanding between the high-level applications and the low-level implementation motivated me to organize a course to teach computer scientists what it takes to build the numerical software that users (application experts) want. This talk will give a bird's-eye view of the advantages and disadvantages of Python and of where and how C++ should be used in the context of numerical software. The information may be used to map out a plan for acquiring the necessary skill set for building such software.
Recording: https://www.youtube.com/watch?v=OwA-Xt_Ke3Y
Combinatorial optimization and deep reinforcement learning (민재 정)
The document discusses using deep learning approaches for solving combinatorial optimization problems like task allocation. It reviews different reinforcement learning methods that have been applied to problems like the vehicle routing problem using pointer networks, transformers, and graph neural networks. Future work opportunities are identified in applying these deep learning techniques to multi-vehicle routing problems and using them to solve specific task allocation scenarios.
This document discusses algorithm analysis using asymptotic notations like Big O. It explains why algorithm analysis matters for efficiency, performance prediction, comparisons, and optimization. It then covers evaluating time and space complexity; the Big O, Omega, and Theta notations; examples of complexity classes like O(N) and O(N^2); and analyzing the complexity of common operations on arrays and objects. It also discusses the pros and cons of Big O notation and provides resources for further learning.
This document outlines the syllabus for a course on data structures and algorithms using Java. It covers topics such as the role of algorithms and data structures, algorithm design techniques, types of data structures including primitive types, arrays, stacks, queues, linked lists, trees, graphs, and algorithm analysis. Specific algorithms and data structures discussed include sorting, searching, priority queues, stacks, queues, linked lists, trees, graphs, hashing, and complexity theory.
This document discusses algorithm analysis concepts such as time complexity, space complexity, and big-O notation. It explains how to analyze the complexity of algorithms using techniques like analyzing loops, nested loops, and sequences of statements. Common complexities like O(1), O(n), and O(n^2) are explained. Recurrence relations and solving them using iteration methods are also covered. The document aims to teach how to measure and classify the computational efficiency of algorithms.
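The loop-analysis rules these documents teach can be demonstrated by counting executed steps directly: a statement outside any loop runs once, a single loop's body runs n times, and a nested loop's body runs n*n times.

```python
# Counting basic operations for three loop patterns makes the
# O(1) / O(n) / O(n^2) classes concrete.

def constant(n):
    return 1                       # O(1): one step regardless of n

def single_loop(n):
    steps = 0
    for _ in range(n):             # O(n): the body runs n times
        steps += 1
    return steps

def nested_loop(n):
    steps = 0
    for _ in range(n):             # O(n^2): the inner body runs n*n times
        for _ in range(n):
            steps += 1
    return steps

for n in (10, 100):
    print(n, constant(n), single_loop(n), nested_loop(n))
# 10 1 10 100
# 100 1 100 10000
```

Multiplying n by 10 multiplies the nested loop's work by 100, which is why the quadratic pattern dominates running time long before the others matter.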
Functional Programming and Composing Actorslegendofklang
With the world being non-deterministic, with failure being abundant, and with communication latency being very real—how do we design systems that are capable of dealing with these conditions and how can we expose abstractions that are feasible to reason about?
Apache SystemML Optimizer and Runtime techniques by Arvind Surve and Matthias...Arvind Surve
This deck includes Apache SystemML Runtime techniques. Those include parfor optimization, bufferpool optimization, spark specific rewrites, partitioning preserving operations, update in place, and ongoing research (Compressed Linear Algebra)
Apache SystemML Optimizer and Runtime techniques by Arvind Surve and Matthias...Arvind Surve
This session includes Apache SystemML Runtime techniques. Those include parfor optimization, bufferpool optimization, spark specific rewrites, partitioning preserving operations, update in place, and ongoing research (Compressed Linear Algebra)
Provenance for Data Munging EnvironmentsPaul Groth
Data munging is a crucial task across domains ranging from drug discovery and policy studies to data science. Indeed, it has been reported that data munging accounts for 60% of the time spent in data analysis. Because data munging involves a wide variety of tasks using data from multiple sources, it often becomes difficult to understand how a cleaned dataset was actually produced (i.e. its provenance). In this talk, I discuss our recent work on tracking data provenance within desktop systems, which addresses problems of efficient and fine grained capture. I also describe our work on scalable provence tracking within a triple store/graph database that supports messy web data. Finally, I briefly touch on whether we will move from adhoc data munging approaches to more declarative knowledge representation languages such as Probabilistic Soft Logic.
Presented at Information Sciences Institute - August 13, 2015
Technical operations is plagued with an unhealthy infatuation of typically untested, imperative code with a high reliance on shared mutable state using dynamically typed languages such as Ruby, Python, Bash, and - ugh - remember Perl? :) In an age where building reliable infrastructure to elastically scale applications and services are paramount to business success, we need to start rethinking the infrastructure engineer’s toolkit and guiding principles. This talk will take a look at applying various functional techniques to building and automating infrastructure. From functional package management and congruent configuration to declarative cloud provisioning we’ll see just how practical these techniques typically used in functional programming for applications can be used to help build more robust and predictable infrastructures. While specific code examples will be given, the emphasis of the talk will be on guiding principles and functional design.
Online learning, Vowpal Wabbit and HadoopHéloïse Nonne
Online learning, Vowpal Wabbit and Hadoop
Online learning has recently caught a lot of attention, following some competitions, and especially after Criteo released 11GB for the training set of a Kaggle contest.
Online learning allows to process massive data as the learner processes data in a sequential way using up a low amount of memory and limited CPU ressources. It is also particularly suited for handling time-evolving date.
Vowpal Wabbit has become quite popular: it is a handy, light and efficient command line tool allowing to do online learning on GB of data, even on a standard laptop with standard memory. After a reminder of the online learning principles, we present how to run Vowpal Wabbit on Hadoop in a distributed fashion.
The document describes solving linear programming problems graphically. It provides an example maximization problem with an objective of maximizing Z=30x1 + 40x2 subject to three constraints. Graphically, the feasible region satisfying all constraints is determined by plotting the points where each constraint equation is equal to 0 for x1 and x2, and shading the correct side of the inequality sign. The optimal solution that maximizes Z can then be found within the feasible region on the graph.
This course introduces students to operations research and its applications. The course covers several optimization techniques including linear programming, network flows, and transportation problems. Students will learn how to formulate mathematical models of real-world systems and solve them to determine optimal resource allocation. The goal is for students to be able to apply operations research methods to decision-making problems in fields like manufacturing, transportation, and public services. Students will be assessed through assignments, tests, and a final exam.
Data structure and algorithm using javaNarayan Sau
This presentation created for people who like to go back to basics of data structure and its implementation. This presentation mostly helps B.Tech , Bsc Computer science students as well as all programmer who wants to develop software in core areas.
2. Agenda
2
• Introduction
• What is complexity? Why is it important?
• Data structures
• How to represent a solution efficiently?
• Algorithmic tricks
• What are the main bottlenecks and how to avoid them?
• Parallelization
• How does parallel computing work? Why, when, and how to parallelize?
• Software engineering
• How to design flexible and reusable code?
• Resources
• How to avoid reinventing the wheel?
4. About Me
• Finished my Ph.D. in 2012 at the Ecole des Mines de Nantes (France)
and Universidad de Los Andes (Colombia)
• Dynamic vehicle routing: solution methods and computational tools
• Since Oct. 2012, researcher at NICTA (Melbourne, Australia)
• Disaster management team
• NICTA in a few numbers:
• 700 staff, 260 PhDs
• 7 research groups, 4 business teams
• 550+ publications in 2012
4
5. Assumptions
• General knowledge on vehicle routing
• General knowledge of common heuristics
• Local search
• Variable Neighborhood Search (VNS)
• General knowledge of object-oriented programming
• Examples are in Java
5
6. Time Complexity
6
• Measure the worst case number of operations
• Expressed as a function of the size of the problem n

For i = 1 to n
  a = 1 + i
  b = 2 * a
  c = a * b + 2

Performs n*(1+1+2) = 4n operations
Complexity is O(n)

For S ⊆ {1..n}
  a = 1 + |S|

Performs 2^n operations
Complexity is O(2^n)
11. Space Complexity
7
• Measure the worst case memory usage
• Expressed as a function of the size of the problem n

For i = 1 to n
  a = 1 + i
  b = 2 * a
  c = a * b + 2

Stores at most 4 integers simultaneously
Complexity is O(1)

For S ⊆ {1..n}
  a = 1 + |S|

Stores at most n+1 integers simultaneously
Complexity is O(n)
16. Complexity In Practice
8

              10       100      1000     10,000   100,000   1,000,000
  n           0.1 ns   1 ns     10 ns    100 ns   1 µs      10 µs
  n·log(n)    0.1 ns   2 ns     30 ns    400 ns   5 µs      60 µs
  n²          1 ns     100 ns   10 µs    1 ms     100 ms    10 s
  n³          1 ns     10 µs    10 ms    1 s      2.7 h     115 d
  eⁿ          22 µs    8.5×10²⁴ years    -        -         -

Computational time for a single floating point operation
on a recent desktop processor
17. Complexity In Practice
9

              10       100      1000     10,000    100,000   1,000,000
  n           320 b    3.2 kb   32 kb    320 kb    3.2 Mb    32 Mb
  n·log(n)    320 b    64 kb    96 kb    1.28 Mb   16 Mb     190 Mb
  n²          3.2 kb   320 kb   32 Mb    3.2 Gb    320 Gb    32 Tb
  n³          32 kb    32 Mb    32 Gb    32 Tb     32 Pb     32 Eb*
  eⁿ          700 kb   8×10²⁶ Eb

Memory requirement to store a single floating point precision number
(*The world's storage capacity is estimated to be 300 Eb, or 300 billion Gb)
24. Representing Routes
12
• Routes are the basis of solving vehicle routing problems
• It is critical to have efficient data structures to store them
• There is no single best data structure
• Performance depends on how it is used
• Tradeoff between simplicity and performance
• Choice should be motivated by
• Purpose: prototype vs. state-of-the-art algorithm
• Usage: what are the most common operations?
25. Dynamic Array List
• Common operation complexity
• Access to the customer by position: O(1)
• Access to the position of a customer by id: O(n)
• Iteration: O(1)
• Insertion/deletion: O(n)
• See [ArrayListRoute.java]
13
[Diagram: two routes stored as dynamic arrays indexed by position,
e.g. [0, 2, 3, 4, 0] and [0, 1, 6, 5, 7, 0]]
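The array-backed representation above can be sketched as follows. This is an illustrative stand-in, not the deck's actual ArrayListRoute.java; the comments show where each stated complexity comes from.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative array-backed route (stand-in for the deck's ArrayListRoute.java).
public class SimpleArrayRoute {
    private final List<Integer> nodes = new ArrayList<>();

    public void append(int node)            { nodes.add(node); }            // amortized O(1)
    public int nodeAt(int pos)              { return nodes.get(pos); }      // O(1) random access
    public int positionOf(int node)         { return nodes.indexOf(node); } // O(n) linear scan
    public void insertAt(int pos, int node) { nodes.add(pos, node); }       // O(n): shifts the tail
    public int removeFirst()                { return nodes.remove(0); }     // O(n): shifts everything
    public int length()                     { return nodes.size(); }
}
```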
26. Doubly Linked List
• Common operation complexity
• Access to the customer by position: O(n)
• Access to the position of a customer by id: O(n)
• Iteration: O(1)
• Insertion/deletion: O(1)
• See [LinkedListRoute.java]
14
[Diagram: the same two routes stored as chains of nodes
linked by predecessor and successor pointers]
27. Doubly Linked List V2
• Common operation complexity
• Access to the customer by position: O(n)
• Access to the position of a customer by id: O(1)
• Iteration: O(1)
• Insertion/deletion: O(1)
• Implementation can be tricky, especially for repeated nodes (e.g., the depot)
• Warning: the implementation in VroomModeling is "incomplete" (read: full of bugs)
15
[Diagram: the two routes stored as Predecessor and Successor arrays indexed by node id,
plus First and Last pointers:
  node:        1 2 3 4 5 6 7
  Predecessor: 0 0 2 3 6 1 5
  Successor:   6 3 4 0 7 5 0]
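A minimal sketch of the predecessor/successor-array idea behind this V2 representation (an illustration, not the deck's actual code; the depot-repetition caveat above is sidestepped here by assuming distinct node ids):

```java
import java.util.Arrays;

// Route stored in predecessor/successor arrays indexed by node id:
// O(1) insertion, deletion, and neighbor lookup by id.
public class SuccessorArrayRoute {
    private static final int NONE = -1;
    private final int[] pred, succ;

    public SuccessorArrayRoute(int maxNodeId) {
        pred = new int[maxNodeId + 1];
        succ = new int[maxNodeId + 1];
        Arrays.fill(pred, NONE);
        Arrays.fill(succ, NONE);
    }

    // Insert 'node' right after 'prev' in O(1)
    public void insertAfter(int prev, int node) {
        int next = succ[prev];
        succ[prev] = node; pred[node] = prev;
        succ[node] = next;
        if (next != NONE) pred[next] = node;
    }

    // Remove 'node' in O(1)
    public void remove(int node) {
        int p = pred[node], s = succ[node];
        if (p != NONE) succ[p] = s;
        if (s != NONE) pred[s] = p;
        pred[node] = succ[node] = NONE;
    }

    public int successorOf(int node)   { return succ[node]; }
    public int predecessorOf(int node) { return pred[node]; }
}
```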
46. algorithms package
23
• IVRPOptimizationAlgorithm: common interface of the optimization algorithms
• CW: Clarke and Wright heuristic to generate routes
• VND: explores a number of neighborhoods
• VNS: variable neighborhood search
• GRASP: starts with a solution from CW and applies VND
• ParallelGRASP: parallel implementation of GRASP
• Heuristic Concentration: takes a set of routes and builds a solution
examples package
• Each class contains a main method that we will use to run the examples
util package
• Classes to make our life easier
49. [ExampleRoutesAtomic.java]
• Compares ArrayListRoute and LinkedListRoute
• Append a node
• Get a node at a random position
• Remove the first node
25

                ArrayList   LinkedList
  Append        123.1 ms    129.4 ms
  GetNodeAt     18.7 ms     66.6 ms
  RemoveFirst   134.2 ms    110.6 ms
50. [CW.java]
• Clarke and Wright constructive heuristic
26
• Initialization: create one route per node
• Each step: merge the two routes that generate the greatest saving
• Repeat until there are no more feasible mergings
• Implemented in VroomHeuristics in package vroom.common.heuristics.cw
[Diagram: single-node routes from the depot merged step by step into longer routes]
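The merge criterion above is the classical savings value. A minimal sketch (not the VroomHeuristics implementation) computing it from a cost matrix, where node 0 is the depot:

```java
// Clarke-Wright savings: merging the route ending at customer i with the
// route starting at customer j saves s(i,j) = c(0,i) + c(0,j) - c(i,j).
public class Savings {
    public static double saving(double[][] c, int i, int j) {
        return c[0][i] + c[0][j] - c[i][j];
    }

    // Pair (i, j) of distinct customers with the greatest saving
    public static int[] bestMerge(double[][] c) {
        int n = c.length;
        int[] best = null;
        double bestVal = Double.NEGATIVE_INFINITY;
        for (int i = 1; i < n; i++)
            for (int j = 1; j < n; j++)
                if (i != j && saving(c, i, j) > bestVal) {
                    bestVal = saving(c, i, j);
                    best = new int[]{i, j};
                }
        return best;
    }
}
```

A full implementation would also check merge feasibility (capacity, route ends) before applying the best saving.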
55. Variable Neighborhood Descent
• Explore different neighborhoods sequentially
• The final solution is a local optimum for all neighborhoods
28
[Flowchart: start → search N1 → improvement found? yes: return to N1;
no: move on to N2 → ... → Nn → end when no neighborhood improves]
[Diagram: a solution improved successively in neighborhoods N1 and N2]
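The flowchart above can be sketched generically as follows. This is an illustration of the VND loop, not the deck's VND.java: each "neighborhood" is modeled as a best-improvement step that returns an improved solution or the input unchanged.

```java
import java.util.List;
import java.util.function.UnaryOperator;

// Generic Variable Neighborhood Descent over an abstract solution type S.
public class Vnd<S> {
    public interface Objective<S> { double cost(S s); }

    public S descend(S start, List<UnaryOperator<S>> neighborhoods, Objective<S> f) {
        S current = start;
        int k = 0;
        while (k < neighborhoods.size()) {
            S candidate = neighborhoods.get(k).apply(current);
            if (f.cost(candidate) < f.cost(current)) {
                current = candidate; // improvement: restart from the first neighborhood
                k = 0;
            } else {
                k++;                 // no improvement: try the next neighborhood
            }
        }
        return current;              // local optimum for all neighborhoods
    }
}
```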
62. [VND.java]
29
• The constraints are defined separately from the neighborhoods
• Each constraint is responsible for checking if a move is feasible (ignore the rest for now)
• Instantiate the neighborhoods that will be used later
66. [VND.java]
30
• Performs a local search in the neighborhood of the current solution
• The parameters control how the search is performed, in this case deterministic & best improvement
67. [ExampleVND.java]
• Run the main method
• Is the ordering of neighborhoods in VND.java logical? How to improve it?
• Is the localSearch implementation in VND.java coherent with the definition of VND?
31
[Flowchart: the VND loop - start → N1 → ... → Nn → end, returning to N1 after each improvement]
72. Store Routes
33
• Store routes for future use
• Requirements
• Memory-efficient
• Avoid repeated routes
• Store a minimalistic route representation
• Low computation overhead
• Two approaches
• Exhaustive list
• Issue: repeated routes
• Hash-based set
73. Hash Functions
• Compress the information stored in a route
• Desired characteristics
• Determinism
• Uniformity
• Issues
• Two different routes can have the same hash (hash collision)
• Computational cost of hash evaluation
34
75. Sequence Dependent Hash
36
• See Groer et al. 2010 - [GroerSolutionHasher.java]
• Produces a 32-bit integer that depends on the set and sequence of nodes in the route
• Example: 010 (2) XOR 111 (7) = 101 (5)

Input:
- rnd: an array of n random integers
- route: a route
Output:
- a hash value for route

1. if route.first > route.last
     route ← reverse ordering of route
2. hash ← 0
3. for each edge (i,j) in route
     hash ← hash XOR rnd[(i+j) % n]
4. return hash
76. Sequence Independent Hash
37
• See Pillac et al. 2012 - [NodeSetSolutionHasher.java]
• Produces a 32-bit integer that depends on the set of nodes visited by the route
• Advantage: implicit filtering of duplicated routes

Input:
- rnd: an array of n random integers
- route: a route
Output:
- a hash value for route

1. hash ← 0
2. for each node i in route
     hash ← hash XOR rnd[i % n]
3. return hash
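Both hashing schemes are small enough to sketch directly. This is an illustration of the two pseudocodes, not the actual GroerSolutionHasher.java / NodeSetSolutionHasher.java classes:

```java
// Two route hashes: sequence-dependent (XOR over edges, orientation-normalized
// by reversing the route) and sequence-independent (XOR over nodes).
public class RouteHashers {
    // Groer et al. 2010: depends on the set and sequence of nodes
    public static int sequenceHash(int[] rnd, int[] route) {
        int n = rnd.length;
        if (route[0] > route[route.length - 1]) route = reversed(route);
        int hash = 0;
        for (int k = 0; k + 1 < route.length; k++)
            hash ^= rnd[(route[k] + route[k + 1]) % n];
        return hash;
    }

    // Pillac et al. 2012: depends only on the set of nodes visited
    public static int nodeSetHash(int[] rnd, int[] route) {
        int n = rnd.length;
        int hash = 0;
        for (int node : route) hash ^= rnd[node % n];
        return hash;
    }

    private static int[] reversed(int[] a) {
        int[] r = new int[a.length];
        for (int k = 0; k < a.length; k++) r[k] = a[a.length - 1 - k];
        return r;
    }
}
```

The tests below check the two invariants that motivate each scheme: the sequence hash is identical for a route and its reverse, and the node-set hash is identical for any permutation of the same nodes.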
77. Example
• Greedy Randomized Adaptive Search Procedure (GRASP)
38
[Flowchart: Start → Randomized Constructive Heuristic → Local Search → End]
78. [GRASP.java]
• Clarke and Wright construction heuristic
• Variable Neighborhood Descent optimization
39
91. [ExampleGRASPHC.java]
• Adapt the GRASP procedure to collect routes
• Add the following fragment where needed
• Hint: we want to collect as many routes as possible
• Experiment with different route pools
• What is the impact on the number of routes and on the HC (heuristic concentration) time?
43
94. Bottlenecks In Heuristics For VRP
46
• Size of the neighborhood
• Areas of the neighborhood are not interesting
• Only minor changes are made to the solution at each move
• How different is the new neighborhood?
• How to avoid restarting from scratch?
• Move evaluation
• Cost & Feasibility
• Performed millions of times
• Which is most costly? Which should be done first?
95. Granular Neighborhoods
47
• Reduce the size of the neighborhoods
• See Toth and Vigo (2003)
• Costly (long) arcs are less likely to be in good solutions
• Filter out moves that involve only costly arcs
• Costly arc threshold: ϑ = β · z₀ / (n + K₀)
• z₀: cost of a heuristic solution
• n: number of nodes; K₀: number of vehicles
• β: sparsification parameter (e.g., β = 2.5)
[Diagram: inserting 5 between 3 and 4 involves 2 costly arcs;
inserting 5 between 1 and 2 involves 1 costly arc]
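The filtering rule above reduces to a one-line threshold computation; a minimal sketch with illustrative names (not from the deck's code):

```java
// Granular neighborhood filter: an arc (i, j) is "costly" when its cost
// exceeds theta = beta * z0 / (n + k); moves using only costly arcs are skipped.
public class GranularFilter {
    public static double threshold(double beta, double z0, int n, int k) {
        return beta * z0 / (n + k);
    }

    public static boolean isCostly(double[][] c, int i, int j, double theta) {
        return c[i][j] > theta;
    }
}
```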
98. Static Move Descriptor (SMD)
• Store information between moves
• See Zachariadis and Kiranoudis (2010)
• Precompute and maintain all moves
• Example with relocate (relocation of a single node)
48
[Table: one SMD entry per pair (n1, n2), storing the cost of relocating n1 after n2]
• Cost of relocating 4 after 3: c3,4 + c0,5 - c5,4 - c3,0
• Cost of relocating 1 after 5: c0,5 + c1,4 - c0,1 - c5,4
101. SMD Update
49
• One static SMD table is created per neighborhood
• Static update rules are predefined to know which SMDs need to be updated after a move is executed
[Table: only the SMD entries affected by the executed move are recomputed; the rest are kept]
105. Selecting The Best Neighbor
• All SMDs are stored in a Fibonacci heap
• O(1) access to the lowest cost SMD
• O(1) insertion
• O(log n) amortized deletion
• How to find the best feasible neighbor?
• Pop the lowest cost SMD until a feasible move is found
50
Source: http://en.wikipedia.org/wiki/Fibonacci_heap
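The "pop until feasible" selection can be illustrated with the JDK's binary-heap PriorityQueue instead of a Fibonacci heap (same idea, slightly different complexity bounds); Move is a hypothetical descriptor type, not the paper's:

```java
import java.util.PriorityQueue;
import java.util.function.Predicate;

// Pop the lowest-cost move descriptor until one passes the feasibility check.
public class BestNeighborSelector {
    public record Move(String name, double cost) {}

    public static Move bestFeasible(PriorityQueue<Move> heap, Predicate<Move> feasible) {
        Move m;
        while ((m = heap.poll()) != null)
            if (feasible.test(m)) return m;
        return null; // no feasible move left
    }
}
```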
106. SMD In Practice
51
[Figure: CPU time for 50,000 iterations (sec) against problem size n (150-1050);
the SMD representation is substantially faster than the classic representation.
"Fig. 8. The acceleration role of the SMD representation."]
Comparison of computational times
Source: Zachariadis and Kiranoudis (2010)
107. Sequential Search
• Explore neighborhoods in a smart way
• See Irnich et al. (2006)
• Decompose moves into partial moves
• Example with swap
52
[Diagram: a swap move between two routes decomposed into successive partial moves]
111. Sequential Search In Practice
• Neighborhoods are explored by considering partial moves
• Exploration is pruned using bounds on the partial move cost
53
[Figure: acceleration factor of sequential search over lexicographic search
against problem size n (up to 2500), for the Or-Opt, String-Exchange,
Special 2-Opt*, Swap, Relocation, 2-Opt, and 3-Opt* neighborhoods,
with f = 25, 50, 75, 100; f: average number of customers in a route]
Speedup for swap and 3-Opt* neighborhoods
Source: Irnich et al. (2006)
112. Store Cumulative Information
• Reduce the complexity of move evaluation
• Store and maintain useful information
• For example: waiting time / forward slack time
• See Savelsbergh (1992)
• Constant-time time-window feasibility check
• More details in Module 2
54
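The idea of storing cumulative quantities can be illustrated with plain prefix sums, a much simpler cousin of Savelsbergh's forward slack time (this sketch is illustrative, not from the deck):

```java
// With a prefix sum of leg costs, the cost of any route segment is obtained
// in O(1) instead of re-summing it at every move evaluation.
public class CumulativeRoute {
    private final double[] prefix; // prefix[k] = cost from the start to position k

    public CumulativeRoute(double[] legCosts) {
        prefix = new double[legCosts.length + 1];
        for (int k = 0; k < legCosts.length; k++)
            prefix[k + 1] = prefix[k] + legCosts[k];
    }

    // Cost of the segment between positions i and j (i <= j) in O(1)
    public double segmentCost(int i, int j) {
        return prefix[j] - prefix[i];
    }
}
```

The tradeoff is that the prefix array must be maintained when the route changes, which is exactly the bookkeeping the SMD-style approaches formalize.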
118. Promises Of Parallelization
58
• Overcome the stalling of CPU performance increases
• Increased availability of parallel computing
• Personal computers with multiple CPUs/cores
• Most universities have access to large grids
• On-demand cloud services (e.g., Amazon)

The parallelization illusion:
Parallel CPU time = Sequential CPU time / Number of CPUs
126. CPU Concepts And Limitations
60
My Program:
  executeMySequentialCode()
  thread1 = new Thread(do A, do B)
  thread1.run()
  thread2 = new Thread(do C)
  thread2.run()

Operating System:
  thread1: create a new thread, assign it to CPU core #4, execute the instructions A, B
  thread2: create a new thread, assign it to CPU core #1, execute the instructions C

Limitations:
• Creating and scheduling threads takes time
• Limited control on the actual execution sequence
• Increased memory usage
• Concurrent access to shared resources
134. Sharing Is Caring
61
Two threads update a shared object concurrently (x = 1 initially):

  Thread 1:                 Thread 2:
  1. y = object.getX()      1. z = object.getX()
  2. y = y + 1              2. z = z + 2
  3. object.setX(y)         3. object.setX(z)

Without synchronization the steps interleave: both threads read x = 1,
Thread 1 writes x = 2, then Thread 2 overwrites it with x = 3.
The final value is x = 3, but 1 + 1 + 2 ≠ 3: one update is lost ✗
142. Sharing With Care
62
The same example with a lock on the shared object (x = 1 initially):

  Thread 1:                 Thread 2:
  1. lock(object)           1. lock(object) → (waiting)
  2. y = object.getX()
  3. y = y + 1
  4. object.setX(y)         → x = 2
  5. release(object)        2. (gets the lock)
                            3. z = object.getX()
                            4. z = z + 2
                            5. object.setX(z)   → x = 4
                            6. release(object)

Thread 2 waits until Thread 1 releases the lock:
x = 1 + 1 + 2 = 4 ✓
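A minimal runnable version of the locking example above: two threads add 1 and 2 to a shared counter, and a synchronized block plays the role of lock(object)/release(object), so no update is lost (class and method names are illustrative):

```java
// Shared counter protected by a lock: the read-modify-write is atomic.
public class SharedCounter {
    private int x;

    public SharedCounter(int initial) { x = initial; }

    public void add(int delta) {
        synchronized (this) {    // lock(object)
            int v = x;           // read
            x = v + delta;       // modify + write
        }                        // release(object)
    }

    public int get() { synchronized (this) { return x; } }

    public static int runExample() throws InterruptedException {
        SharedCounter c = new SharedCounter(1);
        Thread t1 = new Thread(() -> c.add(1));
        Thread t2 = new Thread(() -> c.add(2));
        t1.start(); t2.start();
        t1.join(); t2.join();    // wait for both threads to finish
        return c.get();          // always 1 + 1 + 2 = 4, whatever the interleaving
    }
}
```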
147. The Limits Of Sharing
63
• Lock/release mechanisms force threads to wait
• In the worst case the execution is sequential
• In general
• Lock an object while it may be modified
• Do not lock for read-only operations
• Check for inconsistencies at runtime
[Diagram: two threads alternately waiting for each other's lock on the same object]
148. Parallelization In Practice
64
The parallelization illusion:
Parallel CPU time = Sequential CPU time / Number of CPUs

In practice:
Parallel CPU time = Sequential CPU time / Number of CPUs
  × Random(1, +∞)                       (not so random)
  × (1 - e^(-time spent programming))   (converges to 1)
  × (1 - e^(-time spent debugging))     (converges to 1)
  × (1 - e^(-number of headaches))      (converges to 1)
157. Amdahl's Law
65
Given:
- A fraction α of the code can be parallelized
- P processors
The speedup S is bounded by:
S = 1 / ((1 - α) + α / P)
[Figure: speedup S versus number of processors P for several values of α; source: http://en.wikipedia.org/wiki/Parallel_computing]
"When a task cannot be partitioned because of sequential constraints, the application of more effort has no effect on the schedule. The bearing of a child takes nine months, no matter how many women are assigned."
Fred Brooks
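A small table makes the bound concrete. The sketch below (class and method names are illustrative, not from the course code) tabulates S for a 95%-parallel code:

```java
// Hedged sketch: compute the Amdahl's-law speedup bound S = 1 / ((1 - α) + α/P).
// Class and method names are illustrative, not part of the course material.
public class AmdahlDemo {

    /**
     * @param alpha fraction of the code that can be parallelized (0..1)
     * @param p     number of processors
     * @return the speedup bound S
     */
    public static double speedup(double alpha, int p) {
        return 1.0 / ((1.0 - alpha) + alpha / p);
    }

    public static void main(String[] args) {
        // Even with 1024 processors, a 95%-parallel code speeds up at most ~20x
        for (int p : new int[] {2, 8, 64, 1024}) {
            System.out.printf("P=%4d  S=%.2f%n", p, speedup(0.95, p));
        }
    }
}
```

Note how the bound saturates: as P grows, S approaches 1/(1 - α), which is 20 for α = 0.95, regardless of how many processors are added.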
160. Two Approaches
• Run a sequential algorithm in different threads
• E.g., different experiments, or runs of the same algorithm
• No synchronization issues
• Limited shared resources issues
• Design a parallel algorithm
• Potentially a real speedup of the algorithm
• Increased complexity, harder to debug
66
161. Learnt From Experience
• Limit number of shared resources
• Avoid risk of concurrent modifications
• Use bulletproof synchronization / locks / error checks
• Limit complex debugging
• Limit communication between threads
• Reduce waiting for other threads to exchange information
• Execute a significant number of operations in each thread
• Execution time ≫ thread creation overhead
67
167. [ParallelGRASP.java]
70
Ask the system how many
processors are available
Create one GRASP instance per
iteration
The executor will be responsible for
the creation of threads
170. [ParallelGRASP.java]
71
The executor will create threads as needed,
execute the GRASP subprocesses, and return
the results
Loop through pairs
<GRASP subprocess, Best solution>
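The pattern in these slides (query the processor count, wrap each GRASP iteration in its own task, let an executor run them, then scan the results for the best solution) can be sketched as below. This is a simplified stand-in, not the actual ParallelGRASP.java; the GraspIteration cost computation is a made-up placeholder.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelGraspSketch {

    /** Stand-in for one GRASP iteration: builds and improves one solution, returns its cost. */
    static class GraspIteration implements Callable<Double> {
        private final long seed;
        GraspIteration(long seed) { this.seed = seed; }
        @Override public Double call() {
            // Placeholder for "randomized construction + local search"
            return 100.0 + new java.util.Random(seed).nextDouble() * 50.0;
        }
    }

    public static double run(int iterations) throws Exception {
        // Ask the system how many processors are available
        int cpus = Runtime.getRuntime().availableProcessors();
        // The executor is responsible for creating the threads
        ExecutorService executor = Executors.newFixedThreadPool(cpus);
        // One GRASP instance per iteration
        List<GraspIteration> tasks = new ArrayList<>();
        for (int i = 0; i < iterations; i++) tasks.add(new GraspIteration(i));
        // Execute the subprocesses and loop through the results, keeping the best
        double best = Double.POSITIVE_INFINITY;
        for (Future<Double> f : executor.invokeAll(tasks)) best = Math.min(best, f.get());
        executor.shutdown();
        return best;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Best cost over 32 parallel iterations: " + run(32));
    }
}
```

invokeAll blocks until all tasks complete, so collecting the best solution needs no extra synchronization: each iteration is independent and only the returned futures are shared.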
171. [ExampleParallelGRASP.java]
• Run the example for different instances and compare with the
sequential version
• What is the speedup?
• Are the solutions identical?
• Going further ....
• Why do we create GRASP instances with a single iteration?
• What are the synchronization issues?
72
172. Variable Neighborhood Search
73
• Similar to the VND
• Random exploration of each neighborhood
• Local search
[Flowchart: start → N1 → LS → improvement found? (yes: restart from N1; no: N2) → LS → ... → Nn → LS → improvement found? (no: end)]
See [VNS.java] (NeighborhoodExplorer)
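The control flow in the flowchart (shake with a random neighbor of N_k, run local search, restart at N_1 on improvement, otherwise move to the next neighborhood) can be sketched on a toy problem. This is a generic skeleton, not the course's VNS.java; the objective is deliberately convex, so plain local search already solves it, and the point is only the VNS loop structure.

```java
import java.util.Random;

public class VnsSketch {

    static int f(int x) { return (x - 7) * (x - 7); } // toy objective to minimize

    /** Local search: move to the better of x-1 / x+1 until no improvement. */
    static int localSearch(int x) {
        while (true) {
            int next = f(x - 1) < f(x) ? x - 1 : (f(x + 1) < f(x) ? x + 1 : x);
            if (next == x) return x;
            x = next;
        }
    }

    /** VNS: shake in neighborhood N_k, local search, restart at k = 1 on improvement. */
    public static int vns(int start, int kMax, int maxIter, long seed) {
        Random rnd = new Random(seed);
        int best = localSearch(start);
        for (int iter = 0; iter < maxIter; iter++) {
            int k = 1;
            while (k <= kMax) {
                // Random exploration of N_k: jump up to k away from the incumbent
                int shaken = best + rnd.nextInt(2 * k + 1) - k;
                int candidate = localSearch(shaken);
                if (f(candidate) < f(best)) { best = candidate; k = 1; } // improvement: restart
                else k++;                                                // otherwise: next N_k
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println("VNS found x = " + vns(100, 5, 10, 42));
    }
}
```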
176. Parallel Variable Neighborhood Search
75
• Explore all neighborhoods in parallel
• Select best neighbor
[Flowchart: start → N1, N2, ..., Nn explored in parallel, each followed by LS → improvement found? (yes: restart; no: end)]
See [ParallelVNS.java] (NeighborhoodExplorer)
179. [ParallelVNS.java]
• In the provided version, neighborhoods are explored in the
same thread
• Exercise: explore each neighborhood in a separate thread
• Hints:
• Use mExecutor
• See ParallelGRASP.java for reference
• Compare the speed-up for small and large instances
76
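One way the exercise could be approached is sketched below: submit the exploration of each neighborhood as its own task, then select the best candidate from the futures. All names and the toy cost function are illustrative, not from ParallelVNS.java.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelNeighborhoodsSketch {

    static int cost(int x) { return Math.abs(x - 10); } // toy cost function

    /** Explores each "neighborhood" (a set of candidates) in its own task. */
    public static int bestNeighbor(int current, List<int[]> neighborhoods) throws Exception {
        ExecutorService executor =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (int[] nbh : neighborhoods) {
            // One task per neighborhood: scan its candidates sequentially
            tasks.add(() -> {
                int localBest = current;
                for (int c : nbh) if (cost(c) < cost(localBest)) localBest = c;
                return localBest;
            });
        }
        // Select the best neighbor across all neighborhoods
        int best = current;
        for (Future<Integer> f : executor.invokeAll(tasks)) {
            if (cost(f.get()) < cost(best)) best = f.get();
        }
        executor.shutdown();
        return best;
    }

    public static void main(String[] args) throws Exception {
        List<int[]> nbhs = List.of(new int[]{1, 5, 9}, new int[]{2, 11, 20});
        System.out.println(bestNeighbor(0, nbhs));
    }
}
```

On small instances, the thread-creation and synchronization overhead can dominate the scan itself, which is why the slide suggests comparing the speed-up for small and large instances.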
180. Parallel Algorithms Classification
• Classification according to three dimensions (Crainic 2008)
• Search control cardinality
• 1-control / p-control
• Search control and communications
• Rigid / Knowledge synchronization
• Collegial / Knowledge Collegial
• Search differentiations
• Same initial point / Multiple initial point
• Same search strategy / Different search strategy
• Into which categories do ParallelGRASP and ParallelVNS fall?
77
191. Software Development For Research
82
Specifications: What do I need now? What will I need in the future? What may I need in the future?
Design: How to implement what I need, will need, and may need? How to ensure I will be able to reuse/extend my code?
Implementation: How to do what I need now?
Test: Is everything working as expected? Is something that worked before now broken?
Prototype
Final product
196. A Typical Design Problem
83
• Current need: a two-opt local search for the VRP
• Data model
• How to represent an instance, customer, solution, route?
• Optimization algorithm
• How to represent
• A local search?
• A neighborhood?
• A move?
• How to check the feasibility of a move?
197. A First Design
84
Instance
    int[] customers
    double[] demands
    double[][] distances
    int fleetSize
    double vehicleCapacity
Solution
    int[]<> routes
    double[] loads
TwoOpt
    twoOpt(Instance, Solution) {
        for each move:
            check if feasible
            evaluate
        return best move
    }
What if I now want to solve the VRPTW?
What if I want an Or-Opt local search?
200. Some Design Tips
• Identify what is reusable
• For instance, logic common to all neighborhoods
• Clearly separate responsibilities
• An instance stores the data
• A solution stores a solution
• An objective function evaluates a solution and moves
• A constraint evaluates the feasibility of a solution or move
• Keep in mind possible extensions
• What other problems may I have to solve?
• Warning: avoid over-designing
85
201. Flexible And Extensible Designs
86
Neighborhood
    Constraint<> constraints
    localSearch(instance, solution, objective) {
        for each move in listAllMoves(instance, solution):
            for each constraint in constraints:
                constraint.check(move)
            objective.evaluate(instance, solution, move)
        return best feasible move
    }
    abstract listAllMoves(instance, solution)
TwoOpt
    listAllMoves(instance, solution) { ... }
OrOpt
    listAllMoves(instance, solution) { ... }
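A minimal runnable rendering of this design is sketched below. Moves are plain ints to keep it tiny, and all class names (TwoOptLike, OrOptLike, DesignSketch) are illustrative stand-ins for the richer course classes: the shared local-search logic lives in the abstract class, and each neighborhood only overrides listAllMoves.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;
import java.util.function.IntUnaryOperator;

// Shared logic: enumerate moves, filter by constraints, keep the best.
abstract class Neighborhood {
    final List<IntPredicate> constraints = new ArrayList<>(); // feasibility checks

    Integer localSearch(IntUnaryOperator objective) {
        Integer best = null;
        for (int move : listAllMoves()) {
            boolean feasible = constraints.stream().allMatch(c -> c.test(move));
            if (feasible && (best == null
                    || objective.applyAsInt(move) < objective.applyAsInt(best))) {
                best = move;
            }
        }
        return best; // best feasible move, or null if none
    }

    /** Only this varies between neighborhoods. */
    abstract int[] listAllMoves();
}

class TwoOptLike extends Neighborhood {
    @Override int[] listAllMoves() { return new int[]{3, 8, 15}; }
}

class OrOptLike extends Neighborhood {
    @Override int[] listAllMoves() { return new int[]{4, 9}; }
}

public class DesignSketch {
    public static void main(String[] args) {
        Neighborhood n = new TwoOptLike();
        n.constraints.add(m -> m < 10); // treat moves >= 10 as infeasible
        System.out.println(n.localSearch(m -> Math.abs(m - 7)));
    }
}
```

Adding an Or-Opt-style neighborhood now costs one subclass with one method, and new side constraints plug in without touching the search loop; this is the reuse the slide is after.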
205. Designing Tools
88
• Create UML diagrams to model the organization of the code
• Generate code from a model
• Once the design is stable
• Generate all the code skeleton in one click
• Generate a model from code (hazardous)
• Examples
• Visual Paradigm (free community edition)
• Enterprise Architect ($$$)
• Check with your university / computer science department
206. Implementation
• Document your code and use coherent conventions
• Explain what the inputs, outputs, and main steps are
• Saves a lot of time when you have to come back to it
• Make your code reusable and extensible
• Use the benefits of object-oriented programming
• Spend time now, save time tomorrow
• Build on top of existing libraries
• Avoid reinventing the wheel
89
207. Testing
• Create simple test cases that check key functionalities
• Unit test cases
• E.g., check that the methods to manipulate a solution are
working
• More elaborate test cases
• E.g., solution found by a 2-Opt neighborhood
• Profile your code to detect bottlenecks and memory leaks
90
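As a hedged illustration of the unit-test bullet, using plain Java assertions (in a real project JUnit would be the idiomatic choice, and the Route class here is a made-up stand-in for a solution-manipulation API):

```java
import java.util.ArrayList;
import java.util.List;

public class RouteTest {

    /** Toy route: a sequence of customer ids with an insert operation. */
    static class Route {
        final List<Integer> customers = new ArrayList<>();
        void insert(int position, int customer) { customers.add(position, customer); }
        int size() { return customers.size(); }
    }

    public static void main(String[] args) {
        Route r = new Route();
        r.insert(0, 5);
        r.insert(1, 9);
        r.insert(1, 7); // insert in the middle
        // Unit checks: size and order after manipulation
        if (r.size() != 3) throw new AssertionError("unexpected size");
        if (!(r.customers.get(0) == 5 && r.customers.get(1) == 7 && r.customers.get(2) == 9))
            throw new AssertionError("unexpected order");
        System.out.println("All route tests passed");
    }
}
```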
208. Final product
Development Process
91
Define problem
Relaxation
(simplify the problem)
Select approach
Design & Implement
Test & Debug
Benchmark & Profile
Restore relaxation
Adjust parameters
Publish paper!
Prototype
214. VRPRep Instance Repository
97
• VRPREP Website: http://rhodes.ima.uco.fr/vrprep/web/home
• XML schema to describe most vehicle routing problems
• Easy to read for your program
• XML data binding: creates the objects for you
• Repository of existing instances
• Possibility to define your own problem
• Tool to generate a sample XML file
• Upload your instances
216. Today We Have Seen ...
99
• Introduction
• What is complexity? Why is it important?
• Data structures
• How to represent a solution efficiently?
• Algorithmic tricks
• What are the main bottlenecks and how to avoid them?
• Parallelization
• How does parallel computing work? Why, when, and how to
parallelize?
• Software engineering
• How to design flexible and reusable code?
• Resources
• How to avoid reinventing the wheel?
217. Take Away
1. Developing efficient optimization algorithms requires careful
software engineering
✓ Complexity of the problems at hand
✓ Efficient data structures, algorithmic tricks, parallelization
2. Invest in developing flexible and extensible code
✓ Detailed design, Documentation
✓ Will save you time later
3. Use existing libraries and share your code
✓ Do not reinvent the wheel
✓ Help others (good for your resumé too)
100
218. Discrete Optimization
101
• Pascal Van Hentenryck - http://www.coursera.org/course/optimization
• Online community of thousands of students
• Topics
• Dynamic programming
• Constraint programming
• Local search
• Linear programming
• Join the challenge to solve TSPs and VRPs!
219. 2nd International Optimisation Summer School
• 12th to 17th January 2014, Kioloa, NSW, Australia
• http://www.cse.unsw.edu.au/~tw/school/
• Lectures
• Constraint programming, Integer programming, Column generation
• Modelling
• Uncertainty
• Vehicle routing, Scheduling, Supply networks
• Research skills.
102
220. NICTA
[Infographic slide: NICTA is Australia's pre-eminent national ICT Research Centre of Excellence, dedicated to research excellence in ICT and wealth creation for Australia. Highlights: 700 scientists and students across Brisbane, Sydney, Canberra, and Melbourne; 17 partner universities, including UNSW, UoM, USYD, Monash, and ANU; about 25% of Australia's ICT PhD students (340 graduates, 260 enrolled); 11 spin-outs with 5 more in the pipeline, including Audinate, Open Kernel Labs, Saluda Medical, Opturion, and Yuruware; applications in infrastructure, finance, agriculture, transport, and medicine, such as fleet logistics saving 15% of transport costs for Tip Top and freight optimization across Australia. Contact: INFO@NICTA.COM.AU]
221. References
104
• Crainic, T., Parallel Solution Methods for Vehicle Routing Problems, The Vehicle Routing Problem: Latest Advances and New Challenges, Operations Research/Computer Science Interfaces Volume 43, 2008, pp. 171-198
• Groer, C.; Golden, B. & Wasil, E., A library of local search heuristics for the vehicle routing problem, Mathematical Programming Computation, Springer Berlin / Heidelberg, 2010, 2, 79-101
• Irnich, S., Funke, B., & Grünert, T. (2006). Sequential search and its application to vehicle-routing problems. Computers & Operations Research, 33(8), 2405-2429.
• Lysgaard, J. (2004). CVRPSP: A package of separation routines for the capacitated vehicle routing problem, Working Paper 03-04.
• Pillac, V.; Guéret, C. & Medaglia, A. L. (2012). A parallel matheuristic for the Technician Routing and Scheduling Problem, Optimization Letters, doi:10.1007/s11590-012-0567-4
• Savelsbergh, M. (1992). The vehicle routing problem with time windows: minimizing route duration. INFORMS Journal on Computing, 4(2):146-154, doi:10.1287/ijoc.4.2.146.
• Toth, P. and Vigo, D. (2003). The Granular Tabu Search and Its Application to the Vehicle-Routing Problem, INFORMS Journal on Computing, 15, 333-346.
• Zachariadis, E. E., & Kiranoudis, C. T. (2010). A strategy for reducing the computational complexity of local search-based methods for the vehicle routing problem. Computers & Operations Research, 37(12), 2089-2105.