The document explains how asymmetry helps load balancing by presenting the Always-Go-Left algorithm. It makes three main points:
1) The Always-Go-Left algorithm improves on the uniform greedy algorithm by partitioning the bins into groups and placing each ball into the least-loaded of its candidate bins, breaking ties with an asymmetric rule.
2) The maximum load achieved by Always-Go-Left is governed by generalized (d-ary) Fibonacci numbers and is significantly smaller than that of the uniform greedy algorithm.
3) The combination of partitioning the bins and an unfair tie-breaking rule is crucial: partitioning alone or asymmetric tie-breaking alone does not give the same benefit.
These slides cover the balls-and-bins model and how it is used to analyze the performance of randomized algorithms, with applications such as hashing with chaining, Bloom filters, random graphs, and the Hamiltonian cycle problem.
3. Formal task
• Given n balls and n bins
• The n balls are placed sequentially into the n bins at random
• Goal: minimize the maximum number of balls in any bin
• Balls are indistinguishable, and global knowledge of previously assigned balls is not available
4. Basic idea
• Each ball is placed into a bin chosen independently and uniformly at random from the set of all bins
• The expected maximum load is Θ(ln n / ln ln n)
• The proof is given later
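To make this concrete, here is a minimal simulation sketch (my own illustration, not code from the slides; the function name `single_choice_max_load` is made up) that throws n balls into n bins with a single uniform choice each and compares the observed maximum load with ln n / ln ln n:

```python
import math
import random

def single_choice_max_load(n: int, rng: random.Random) -> int:
    """Throw n balls into n bins, each bin chosen uniformly at random,
    and return the maximum load."""
    loads = [0] * n
    for _ in range(n):
        loads[rng.randrange(n)] += 1
    return max(loads)

n = 1_000_000
rng = random.Random(0)
print("observed max load:", single_choice_max_load(n, rng))
print("ln n / ln ln n   :", math.log(n) / math.log(math.log(n)))
```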
5. Better Idea
• Azar et al. [1994, 1999] suggest the uniform greedy algorithm
• For each ball, choose d >= 2 locations independently and uniformly at random from the set of bins
• Place the ball into the bin with the fewest balls
• Maximum load of only ln ln n / ln d + O(1)
• d = 2 yields a great improvement over one choice; larger d is only a constant factor better than two choices
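A sketch of the uniform greedy process under the same conventions (again my own illustration, not code from the paper); d = 1 degenerates to the single-choice process above:

```python
import random

def uniform_greedy_max_load(n: int, d: int, rng: random.Random) -> int:
    """For each of n balls, probe d bins chosen uniformly and independently
    at random, and place the ball into the least loaded probed bin."""
    loads = [0] * n
    for _ in range(n):
        candidates = [rng.randrange(n) for _ in range(d)]
        best = min(candidates, key=lambda b: loads[b])
        loads[best] += 1
    return max(loads)

rng = random.Random(0)
for d in (1, 2, 3):
    print(f"d={d}: max load {uniform_greedy_max_load(1_000_000, d, rng)}")
```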
6. Three types of selection
– (1) uniform and independent
– (2) nonuniform and independent
• A nonuniform algorithm may choose the first location from bins 0 to n/2 − 1 and the second location from bins n/2 to n − 1
– (3) nonuniform and dependent
• The second choice may depend on the first: e.g., if the first location is i, then the second location is (i + n/2) mod n
– Goal: improve on the uniform greedy algorithm
7. Three Classes of Algorithm
• Let [n] = {0, …, n−1} denote the set of bins. There are three classes of algorithms, depending on how the d locations are sampled from the probability space [n]^d:
• Class 1: Uniform and independent. Each of the d locations of a ball is chosen uniformly and independently at random from [n].
• Class 2: (Possibly) nonuniform and independent. For 1 <= i <= d, the i-th location of a ball is chosen independently at random from [n] as defined by a probability distribution D_i : [n] → [0, 1].
• Class 3: (Possibly) nonuniform and (possibly) dependent. The d locations of a ball are chosen at random from the set [n]^d as defined by a probability distribution D : [n]^d → [0, 1].
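To illustrate the taxonomy (a sketch of my own; the paper defines the classes only abstractly), the three classes differ solely in how the d-tuple of locations is drawn from [n]^d:

```python
import random

def class1_locations(n: int, d: int, rng: random.Random) -> list[int]:
    # Class 1: every location uniform and independent over [n].
    return [rng.randrange(n) for _ in range(d)]

def class2_locations(n: int, d: int, rng: random.Random) -> list[int]:
    # Class 2: independent, but the i-th location follows its own
    # distribution D_i; here D_i is uniform over the i-th block of n // d
    # bins (this is the sampling used by Always-Go-Left).
    block = n // d
    return [i * block + rng.randrange(block) for i in range(d)]

def class3_locations(n: int, d: int, rng: random.Random) -> list[int]:
    # Class 3: possibly dependent; here only the first location is random
    # and the others are fixed offsets of it, as in the slide's example.
    first = rng.randrange(n)
    return [(first + i * (n // d)) % n for i in range(d)]
```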
8. Always-Go-Left Algorithm
• A multiple-choice algorithm of class 2 that achieves a smaller maximum load than the uniform greedy algorithm of class 1
• Partition the bins into d groups of almost equal size
• For each ball, choose one location from each group
• The i-th location of each ball is chosen uniformly and independently at random from the i-th group
• Insert the ball into a bin with minimum load among the d locations
• Break ties with the asymmetric Always-Go-Left rule: among the minimum-load locations, take the one in the leftmost (lowest-numbered) group
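Putting the pieces together, here is a compact sketch of the Always-Go-Left process (my illustration; it assumes d divides n and numbers the groups left to right):

```python
import random

def always_go_left(n: int, d: int, balls: int, rng: random.Random) -> list[int]:
    """Partition [n] into d groups of n // d bins; for each ball draw one
    uniform location per group and place it into a minimum-load location,
    breaking ties in favour of the leftmost group."""
    assert n % d == 0, "sketch assumes d divides n"
    group = n // d
    loads = [0] * n
    for _ in range(balls):
        # Group i covers bins [i * group, (i + 1) * group).
        candidates = [i * group + rng.randrange(group) for i in range(d)]
        # min() keeps the first minimum while scanning left to right,
        # which is exactly the asymmetric Always-Go-Left tie-breaking rule.
        best = min(candidates, key=lambda b: loads[b])
        loads[best] += 1
    return loads

rng = random.Random(0)
print("max load:", max(always_go_left(1_000_000, 2, 1_000_000, rng)))
```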
10. d-ary Fibonacci numbers
• Define the d-ary Fibonacci numbers by F_d(k) = 0 for k <= 0, F_d(1) = 1, and F_d(k) = Σ_{i=1}^{d} F_d(k − i) for k >= 2
• For d = 2 this is the standard Fibonacci sequence
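A direct transcription of the recurrence (my sketch):

```python
from functools import lru_cache

def fib_d(d: int, k: int) -> int:
    """d-ary Fibonacci numbers: F_d(k) = 0 for k <= 0, F_d(1) = 1,
    F_d(k) = sum_{i=1}^{d} F_d(k - i) for k >= 2."""
    @lru_cache(maxsize=None)
    def f(j: int) -> int:
        if j <= 0:
            return 0
        if j == 1:
            return 1
        return sum(f(j - i) for i in range(1, d + 1))
    return f(k)

# d = 2 gives the standard Fibonacci sequence:
print([fib_d(2, k) for k in range(1, 8)])  # [1, 1, 2, 3, 5, 8, 13]
print([fib_d(3, k) for k in range(1, 8)])  # [1, 1, 2, 4, 7, 13, 24]
```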
11. Analysis of Always-Go-Left Algorithm
• The maximum load is related to the Fibonacci numbers
• Define φ_d = lim_{k→∞} F_d(k)^{1/k}; φ_2 ≈ 1.618 is the golden ratio
• In general, φ_{d−1} < φ_d < 2 for all d >= 2
• The maximum load of Always-Go-Left is ln ln n / (d ln φ_d) + O(1)
12. Analysis of Always-Go-Left Algorithm
• Since d ln φ_d > ln d, the bound ln ln n / (d ln φ_d) is smaller than the ln ln n / ln d bound of the uniform greedy algorithm
• Even for d = 2 there is a significant improvement: Always-Go-Left yields a maximum load of about 1.04 ln ln n instead of about 1.44 ln ln n
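The growth rates φ_d can be estimated numerically from the recurrence; a small sketch of my own (φ_2 should come out near the golden ratio 1.618, φ_3 near 1.839):

```python
import math

def phi_d(d: int, iterations: int = 80) -> float:
    """Estimate phi_d via the ratio F_d(k + 1) / F_d(k), which converges
    to lim_{k->inf} F_d(k)^(1/k)."""
    window = [0] * (d - 1) + [1]  # holds F_d(k - d + 1), ..., F_d(k)
    for _ in range(iterations):
        window = window[1:] + [sum(window)]
    return window[-1] / window[-2]

for d in (2, 3, 4):
    p = phi_d(d)
    print(f"d={d}: phi_d ~ {p:.4f}, "
          f"1 / (d ln phi_d) ~ {1 / (d * math.log(p)):.3f}")
```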
13. Analysis of Always-Go-Left Algorithm
• The uniform greedy scheme achieves the best load balancing among all algorithms of class 1
• This result holds regardless of the tie-breaking mechanism used, showing that tie-breaking is irrelevant in the uniform case
• Partitioning the bins while using a fair tie-breaking rule does not reduce the number of balls in the fullest bin below ln ln n / ln d − O(1)
• The combination of partitioning and unfair tie-breaking is crucial for the result
14. Is Further Improvement Possible?
• Can other kinds of choices for the d locations, or other schemes for deciding which of these locations receives the ball, improve the result?
• A theorem gives a negative answer
15. Conclusion
• By Theorems 1 and 2, apart from additive constants, the Always-Go-Left algorithm achieves the best possible maximum load among all sequential multiple-choice algorithms, namely ln ln n / (d ln φ_d) + O(1)
16. Generalization
• It is interesting to assume more balls than bins, or even an infinite sequence of insertions and deletions
• An oblivious adversary specifies a (possibly infinite) sequence of insertions and deletions of balls
• All requests arrive on-line: the sequence of insertions and deletions is presented one by one, without knowledge of future requests
• Time t denotes the time at which request t is presented but not yet served; a ball is said to exist at time t if it is stored in one of the bins at that time
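For intuition (my own sketch, not from the slides), the on-line model just interleaves insertions and deletions; each ball keeps the bin chosen at its insertion time, and a deletion frees that bin:

```python
import random

def online_uniform_greedy(n: int, d: int, requests, rng: random.Random):
    """Serve an on-line sequence of ('insert', ball_id) / ('delete', ball_id)
    requests with the uniform greedy rule."""
    loads = [0] * n
    placed = {}  # ball_id -> bin assigned at insertion time
    for op, ball in requests:
        if op == "insert":
            candidates = [rng.randrange(n) for _ in range(d)]
            b = min(candidates, key=lambda x: loads[x])
            loads[b] += 1
            placed[ball] = b
        else:  # "delete": the ball stops existing, freeing its bin
            loads[placed.pop(ball)] -= 1
    return loads

rng = random.Random(0)
reqs = [("insert", i) for i in range(100)] + [("delete", i) for i in range(50)]
print("max load:", max(online_uniform_greedy(100, 2, reqs, rng)))
```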
17. On-line Model conclusion
• The uniform greedy algorithm achieves a maximum load of ln ln n / ln d + O(1)
• Always-Go-Left achieves ln ln n / (d ln φ_d) + O(1)
• Multiple-choice processes are fundamentally different from the single-choice variant: the maximum load does not increase with the number of balls but depends only on n and d
18. Proof of the upper bounds
• Use a witness tree to upper-bound the probability of the event that a bin contains too many balls
• A witness tree is a rooted tree whose nodes represent balls whose randomly chosen locations are arranged in a bad fashion
• Simplifying assumptions:
• All the events are stochastically independent
• At most n balls exist at any time (h = 1)
• Finally, all these assumptions are removed
19. Witness Tree
• A bad event, in which the maximum load exceeds some threshold value, implies the "activation" of a witness tree
• The probability of the existence of an activated witness tree therefore upper-bounds the probability that a bad event occurs
• The proof shows that the activation of a witness tree is unlikely; consequently, the bad event witnessed by this structure is unlikely as well
20. Symmetric Witness Tree
• A symmetric witness tree of order L is a complete d-ary tree of height L with d^L leaf nodes
• Each node v represents a ball, but a ball may be represented by several nodes
• Not every assignment of balls to nodes gives a witness tree: each non-root node v with parent node u must exist at the insertion time of u's ball
• Each node of the witness tree describes an event that may or may not occur, depending on the random choices for the locations of the balls
21. Symmetric Witness Tree
• The edge events are defined in terms of the alternative locations of balls instead of their final resting places
• If all of its edges and all of its leaf nodes are activated, the tree is said to be activated
• The existence of a bin with more than L + 3 balls implies the existence of an activated witness tree of order L
24. Summary
• The Always-Go-Left process yields a smaller maximum load than the uniform greedy process
• Both the asymmetry and the partitioning of the set of bins are crucial
• Using an asymmetric tie-breaking mechanism without partitioning does not help
• Using partitioning with a fair tie-breaking rule does not help either
• Multiple-choice processes behave fundamentally differently from single-choice ones