Two methods are described for optimizing cognitive model parameters: differential evolution (DE) and high-throughput computing with HTCondor. DE is a genetic algorithm that uses a population of models to explore the parameter space in parallel. It is well-suited for models with few parameters or short run times. HTCondor allows running a population of models over a computer network, making it suitable for larger, more complex models or simulating many participants. Examples of using each method with an ACT-R paired associate model are provided.
Two methods for optimising cognitive model parameters
1. Two methods for optimising cognitive model parameters
David Peebles
9th August 2016
2. The problem
Searching parameter spaces can be laborious and time-consuming, particularly if done iteratively by hand
It's difficult to know whether you have found an optimal solution
Worse when the model has several interacting parameters, is non-differentiable, non-continuous, non-linear, or stochastic, or has many local optima
Guilt and shame < boredom and frustration.
3. One solution, two methods
Use a population of models to explore the parameter space.
Differential evolution (DE)
A genetic algorithm for multidimensional real-valued functions.
Ideal for optimising models with relatively few parameters or short run times on a single computer.
High-throughput computing with HTCondor
Open-source software for managing job scheduling on computer clusters or for cycle-scavenging idle computers.
Ideal for larger, more complex models or when simulating the behaviour of many participants.
Used by many universities and easy to set up on local networks (or so I'm told).
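For illustration, a minimal HTCondor submit description file for queueing many runs of a model, one per simulated participant, might look like the following sketch (the executable and file names are hypothetical, not taken from the slides):

```
# Hypothetical submit description file: queue 100 model runs,
# one per simulated participant. $(Process) is 0..99.
universe   = vanilla
executable = run-model.sh
arguments  = $(Process)
output     = model.$(Process).out
error      = model.$(Process).err
log        = model.log
queue 100
```

Each queued job receives a distinct `$(Process)` number, which the script can use to select a participant's data or a parameter vector.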
4. Differential evolution
An evolutionary strategy for real numbers used for multidimensional numerical optimisation [2, 3].
Attractions:
- Simplicity of the algorithm
- Only three control parameters
- Fast, accurate, and robust performance
- Wide applicability
Control parameters:
- NP: the population size
- F: a scale factor applied to the mutation process
- Cr: a constant that regulates the crossover process
5. The DE algorithm in a nutshell
DE applies repeated cycles of mutation, recombination, and
selection on an initial, randomly generated population of vectors to
create a single vector that produces the best solution to a problem.
Initialisation -> Mutation -> Recombination -> Selection
6. Initialisation
Create a population of real-valued vectors, one dimension for each of the model parameters to be optimised.
Initialise the vectors with uniformly distributed random numbers within maximum and minimum bounds set for each dimension.
(:rt -5.0 -0.01) (:lf 0.1 1.5) (:ans 0.1 0.8)
To create the next population of vectors, each vector i in the current population is selected in sequence, designated as the target vector, and subjected to a competitive process...
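The initialisation step can be sketched in Python (the talk's own code is Lisp; the function name is illustrative, and the bounds are the :rt, :lf and :ans ranges from the slide):

```python
import random

# Parameter bounds from the slide: (:rt -5.0 -0.01) (:lf 0.1 1.5) (:ans 0.1 0.8)
BOUNDS = [(-5.0, -0.01), (0.1, 1.5), (0.1, 0.8)]

def init_population(np_size, bounds):
    """Create NP vectors, one dimension per parameter, each value drawn
    uniformly at random within that dimension's bounds."""
    return [[random.uniform(lo, hi) for lo, hi in bounds]
            for _ in range(np_size)]
```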
7. Mutation
Mutation perturbs a vector with the weighted difference between two other vectors in the current population.
A mutated donor vector is created by randomly selecting three unique vectors, j, k and l, from the population and adding the difference between j and k (scaled by the F parameter) to l.
Donor vector = l + F * (j - k)
This ensures that differences in the scale and sensitivity of
different vector parameters are taken into account and that
the search space is explored equally on all dimensions.
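As a sketch (Python rather than the talk's Lisp; names are illustrative), the donor vector for a given target can be built as:

```python
import random

def mutate(population, target_idx, F=0.8):
    """Pick three distinct vectors j, k, l (none of them the target) and
    return the donor vector l + F * (j - k), computed per dimension."""
    candidates = [i for i in range(len(population)) if i != target_idx]
    j, k, l = random.sample(candidates, 3)
    dims = len(population[target_idx])
    return [population[l][d] + F * (population[j][d] - population[k][d])
            for d in range(dims)]
```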
8. Recombination
Cross the donor vector with the target vector to create the trial vector, allowing successful solutions from the previous generation to be carried forward.
[Diagram: crossover of the target vector and the donor vector produces the trial vector]
Achieved by a series of Bernoulli trials which determine, for each dimension, which parent will donate its value.
Moderated by the crossover rate parameter Cr (0 ≤ Cr ≤ 1).
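A sketch of this crossover in Python (names are illustrative; the forced dimension is the conventional DE detail guaranteeing the trial vector takes at least one value from the donor, which the slide does not spell out):

```python
import random

def crossover(target, donor, Cr=0.9):
    """Per-dimension Bernoulli trial: take the donor's value with
    probability Cr, otherwise keep the target's. One randomly chosen
    dimension always comes from the donor."""
    forced = random.randrange(len(target))
    return [donor[d] if (d == forced or random.random() < Cr) else target[d]
            for d in range(len(target))]
```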
9. Selection
Run the model with the parameter values from the trial vector.
If fitness(trial) ≥ fitness(target), the trial vector replaces the target vector in the next generation; otherwise the target vector is retained.
Repeat for each vector in the current population until the next population is created; the evolutionary process continues for a user-defined number of cycles.
Retain the vector with the highest fitness in each population.
The winning vector in the final population is considered the best solution to the problem.
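Putting the four stages together, a minimal DE loop might look like this in Python (a sketch, not the talk's actual Lisp implementation; the toy fitness function maximises -(x^2 + y^2)):

```python
import random

def differential_evolution(fitness, bounds, NP=20, F=0.8, Cr=0.9, cycles=50):
    """Minimal DE: initialise, then repeat mutation, crossover and greedy
    selection for a fixed number of cycles; return the fittest vector."""
    dims = len(bounds)
    pop = [[random.uniform(lo, hi) for lo, hi in bounds] for _ in range(NP)]
    for _ in range(cycles):
        new_pop = []
        for i, target in enumerate(pop):
            j, k, l = random.sample([x for x in range(NP) if x != i], 3)
            donor = [pop[l][d] + F * (pop[j][d] - pop[k][d]) for d in range(dims)]
            forced = random.randrange(dims)
            trial = [donor[d] if (d == forced or random.random() < Cr) else target[d]
                     for d in range(dims)]
            # Greedy selection: the trial replaces the target only if at least as fit
            new_pop.append(trial if fitness(trial) >= fitness(target) else target)
        pop = new_pop
    return max(pop, key=fitness)

# Toy problem: maximise -(x^2 + y^2), optimum at (0, 0)
best = differential_evolution(lambda v: -(v[0] ** 2 + v[1] ** 2),
                              [(-5.0, 5.0), (-5.0, 5.0)])
```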
10. Paired associate model example
1. Load (a) ACT-R, (b) the DE code file, and (c) the paired associate model.
2. Modify the output function to return a single correlation.
3. Define a function to run the task, suppressing the model output and any warnings resulting from poor parameter choices.
4. Specify the three parameters to adjust with their ranges and use the default DE configuration.
;; Fitness function: return a single correlation
(defun output-data (data n)
  (let ((probability (mapcar (lambda (x) (/ (first x) n)) data))
        (latency (mapcar (lambda (x) (/ (or (second x) 0) n)) data)))
    ;; (correlation latency *paired-latencies*)
    (correlation probability *paired-probability*)))

;; Run the task, discarding model output and warnings
(defun run-paired-with-no-output ()
  (let ((*standard-output* (make-string-output-stream)))
    (suppress-warnings (paired-experiment 100))))

;; Optimise :rt, :lf and :ans within the given bounds
(optim 'run-paired-with-no-output '((:rt -5.0 -0.01) (:lf 0.1 1.5) (:ans 0.1 0.8)))
11. HTCondor
DE is useful for models with relatively few parameters or short run times on a single computer.
DE is too slow for large, complex models or when simulating the behaviour of many participants.
[Diagram: users submit jobs to the HTCondor job manager, which schedules them across the HTCondor pool, HPC resources and cloud resources]
Run a population of models over a network using HTCondor.
HTCondor – open source, cross-platform software system for
managing and scheduling tasks over computer networks [1].
12. Running ACT-R models on HTCondor
1. Modify your model to allocate random values to parameters.
2. Create a submit description file specifying job details
(executable, platform, the model files, commands etc.).
requirements = (OpSys == "WINNT61" && Arch == "INTEL") || (OpSys == "WINDOWS" && Arch == "INTEL") || (OpSys == "WINDOWS" && Arch == "X86_64")
executable = actr-s-64.exe
arguments = "-l 'paired.lisp' -e '(collect-data 20)' -e '(quit)'"
transfer_executable = ALWAYS
transfer_input_files = paired.lisp
output = out.stdout.$(Cluster).$(Process)
error = out.err.$(Cluster).$(Process)
log = out.clog.$(Cluster).$(Process)
queue 100
13. Random number generation
A potential problem arises with how the pseudorandom
number generator (PRNG) seed is set.
Each model instance must be initialised with a unique seed.
Many Lisps (and the ACT-R random module) use the computer's clock to generate the seed, which is a problem if different instances are given the same clock time.
To avoid this, use an ACT-R PRNG that doesn't use the system clock but instead uses a random number generated by CCL (the Lisp used to create the MS Windows standalone ACT-R).
This method is employed in a new ACT-R
create-random-module which is loaded prior to loading the
model file.
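A simpler alternative, sketched here in Python and purely illustrative (the talk's actual fix is the CCL-seeded ACT-R module described above), is to derive each job's seed from the $(Cluster) and $(Process) numbers that HTCondor already assigns, so instances launched at the same clock time still get distinct seeds:

```python
import random

def job_seed(cluster, process):
    """Combine HTCondor's cluster and process IDs into a seed that is
    distinct for every job in a submission (assumes fewer than 100000
    processes per cluster)."""
    return cluster * 100000 + process

# Seed this instance's PRNG before running the model
random.seed(job_seed(1234, 7))
```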
14. Why not use MindModeling@Home?
NOT a toothbrush thing
MAYBE a lack of knowledge thing
MAYBE a World Community Grid thing (Zika, cancer cures).
Some models are small/trivial
Speed, convenience and flexibility
15. Try them out
Code and instructions for both methods, using the paired associate model from tutorial unit 4:
DE: https://github.com/peebz/actr-paired-de
HTCondor: https://github.com/peebz/actr-paired-htc
Thanks as always to Dan Bothell for his patience and advice.
16. References
[1] M. J. Litzkow, M. Livny, and M. W. Mutka. 'Condor—A Hunter of Idle Workstations'. In: Proceedings of the 8th International Conference on Distributed Computing Systems. IEEE, June 1988, pp. 104–111.
[2] R. Storn and K. Price. Differential Evolution: A Simple and Efficient Adaptive Scheme for Global Optimization over Continuous Spaces. Tech. rep. TR-95-012. Berkeley, CA: ICSI Berkeley, 1995.
[3] R. Storn and K. Price. 'Differential evolution: A simple and efficient heuristic for global optimization over continuous spaces'. In: Journal of Global Optimization 11.4 (1997), pp. 341–359.