This document discusses the hiring problem, a model of decision-making under uncertainty that is closely related to the secretary problem. Candidates are modeled as a sequence of random variables, and at each step the decision maker must either hire or discard the current candidate based only on that candidate's relative rank among the candidates seen so far. The goal is to hire candidates at a reasonable rate while improving the average quality of hires. The document discusses rank-based hiring strategies, whose decisions depend only on relative rank.
1. The Hiring Problem
Conrado Martínez
U. Politècnica Catalunya
Joint work with M. Archibald
July 2010
Univ. Cape Town, South Africa
2. The hiring problem
The hiring problem is a simple model of decision-making under
uncertainty
It is closely related to the well-known Secretary Problem:
A sequence of n candidates is to be interviewed to
fill a post. For each interviewed candidate we only
learn about his/her relative rank among the
candidates we've seen so far. After each interview,
hire and finish, or discard and interview a new
candidate. The nth candidate must be hired if we
have reached that far.
The goal: devise a strategy that maximizes the
probability of hiring the best of the n candidates.
3. The hiring problem
Originally introduced by Broder et al. (SODA 2008)
The candidates are modelled by a (potentially infinite)
sequence of i.i.d. random variables $Q_i$ uniformly distributed in
$[0, 1]$
At step $i$ you either hire or discard candidate $i$ with score $Q_i$
Decisions are irrevocable
Goals: hire candidates at some reasonable rate, improve the
mean quality of the company's staff
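The model on this slide can be sketched as a short simulation. The slide does not fix a hiring strategy, so the rule used here (hire the first candidate, then hire whenever the score exceeds the current mean of the staff) is only an illustrative assumption, and the names `simulate_hiring`, `n` and `seed` are hypothetical.

```python
import random


def simulate_hiring(n, seed=0):
    """Interview n candidates with i.i.d. Uniform[0, 1) scores.

    Assumed strategy (illustrative, not taken from the slides): hire the
    first candidate, then hire whenever the score beats the current mean
    quality of the staff.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    hired = []                 # scores of hired candidates, in hiring order
    total = 0.0                # running sum of the hired scores
    for _ in range(n):
        q = rng.random()       # score Q_i of the current candidate
        # The decision is irrevocable: it is taken once and never revisited.
        if not hired or q > total / len(hired):
            hired.append(q)
            total += q
    return hired


if __name__ == "__main__":
    n = 10_000
    staff = simulate_hiring(n)
    print("hiring rate:", len(staff) / n)
    print("mean quality:", sum(staff) / len(staff))
```

Under this assumed rule the mean quality of the staff never decreases, because every new hire lies above the current mean; the price is a hiring rate that slows down as the mean rises.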
4. The hiring problem
Our model: a permutation $\sigma$ of length $n$, candidate $i$ has score
$\sigma(i)$; the permutation is actually presented as a sequence of
unknown length $S = s_1, s_2, s_3, \ldots$ with $1 \le s_i \le i + 1$, $s_i$ is the
rank of the $i$th candidate relative to the candidates seen so far
($i$ included)
Example
$\sigma = 62817435$
$\sigma' =$
$S =$
5. The hiring problem
Example
$\sigma = 62817435$
$\sigma' = 1$
$S = 1$
6. The hiring problem
Example
= 62817435
0 = 21
S = 11
7. The hiring problem
Our model: a permutation of length n, candidate i has score
(i); the permutation is actually presented as a sequence of
unknown length S = s1; s2; s3; : : : with 1 si i + 1, si is the
rank of the ith candidate relative to the candidates seen so far
(i included)
Example
= 62817435
0 = 213
S = 113
8. The hiring problem
Our model: a permutation of length n, candidate i has score
(i); the permutation is actually presented as a sequence of
unknown length S = s1; s2; s3; : : : with 1 si i + 1, si is the
rank of the ith candidate relative to the candidates seen so far
(i included)
Example
= 62817435
0 = 3241
S = 1131
9. The hiring problem
Our model: a permutation of length n, candidate i has score
(i); the permutation is actually presented as a sequence of
unknown length S = s1; s2; s3; : : : with 1 si i + 1, si is the
rank of the ith candidate relative to the candidates seen so far
(i included)
Example
= 62817435
0 = 32514
S = 11314
10. The hiring problem
Our model: a permutation of length n, candidate i has score
(i); the permutation is actually presented as a sequence of
unknown length S = s1; s2; s3; : : : with 1 si i + 1, si is the
rank of the ith candidate relative to the candidates seen so far
(i included)
Example
= 62817435
0 = 426153
S = 113143
11. The hiring problem
Our model: a permutation of length n, candidate i has score
(i); the permutation is actually presented as a sequence of
unknown length S = s1; s2; s3; : : : with 1 si i + 1, si is the
rank of the ith candidate relative to the candidates seen so far
(i included)
Example
= 62817435
0 = 5271643
S = 1131433
12. The hiring problem
Our model: a permutation of length n, candidate i has score
(i); the permutation is actually presented as a sequence of
unknown length S = s1; s2; s3; : : : with 1 si i + 1, si is the
rank of the ith candidate relative to the candidates seen so far
(i included)
Example
= 62817435
0 = 62817435
S = 11314335
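The relative-rank sequence of the example can be recomputed mechanically. The following is a minimal Python sketch; the function name `relative_ranks` is hypothetical, and it assumes rank 1 means the lowest score seen so far, which is the convention the example values follow.

```python
def relative_ranks(scores):
    """Return S = s_1, s_2, ... where s_i is the rank (1 = lowest) of the
    i-th score among the first i scores (the i-th one included)."""
    ranks = []
    for i, q in enumerate(scores):
        # Rank = 1 + number of earlier candidates with a smaller score.
        ranks.append(1 + sum(1 for p in scores[:i] if p < q))
    return ranks


sigma = [6, 2, 8, 1, 7, 4, 3, 5]
print(relative_ranks(sigma))  # [1, 1, 3, 1, 4, 3, 3, 5]
```

The quadratic loop is chosen for readability; it is enough for checking small examples by hand.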
13. Rank-based hiring
A hiring strategy is rank-based if and only if it only depends on the
relative rank of the current candidate compared to the candidates
seen so far.
14. Rank-based hiring
Rank-based strategies model actual restrictions on measuring qualities
Many natural strategies are rank-based, e.g.,
above the best
above the mth best
above the median
above the P% best
Assume that only the relative ranks of candidates are known, as in the standard secretary problem
Some hiring strategies are not rank-based, e.g., above the average, above a threshold.
15. Intermezzo: A crash course on generating functions and the
symbolic method
Excerpts from my short course Analytic Combinatorics: A Primer
16. Two basic counting principles
Let $A$ and $B$ be two finite sets.
The Addition Principle
If $A$ and $B$ are disjoint then
$|A \cup B| = |A| + |B|$
The Multiplication Principle
$|A \times B| = |A| \cdot |B|$
17. Combinatorial classes
Definition
A combinatorial class is a pair $(\mathcal{A}, |\cdot|)$, where $\mathcal{A}$ is a finite or denumerable set of values (combinatorial objects, combinatorial structures), $|\cdot| : \mathcal{A} \to \mathbb{N}$ is the size function, and for all $n \ge 0$
$\mathcal{A}_n = \{x \in \mathcal{A} \mid |x| = n\}$ is finite
18. Combinatorial classes
Example
$\mathcal{A}$ = all finite strings from a binary alphabet; $|s|$ = the length of string $s$
$\mathcal{B}$ = the set of all permutations; $|\sigma|$ = the order of the permutation $\sigma$
$\mathcal{C}_n$ = the partitions of the integer n; $|p| = n$ if $p \in \mathcal{C}_n$
19. Labelled and unlabelled classes
In unlabelled classes, objects are made up of indistinguishable atoms; an atom is an object of size 1
In labelled classes, objects are made up of distinguishable atoms; in an object of size n, each of its n atoms bears a distinct label from $\{1, \ldots, n\}$
20. Counting generating functions
Definition
Let $a_n = \#\mathcal{A}_n$ = the number of objects of size n in $\mathcal{A}$. Then the formal power series
$A(z) = \sum_{n \ge 0} a_n z^n = \sum_{\alpha \in \mathcal{A}} z^{|\alpha|}$
is the (ordinary) generating function of the class $\mathcal{A}$.
The coefficient of $z^n$ in $A(z)$ is denoted $[z^n]A(z)$:
$[z^n]A(z) = [z^n] \sum_{n \ge 0} a_n z^n = a_n$
21. Counting generating functions
Ordinary generating functions (OGFs) are mostly used to
enumerate unlabelled classes.
Example
$\mathcal{L} = \{w \in (0 + 1)^* \mid w \text{ does not contain two consecutive 0's}\}$
$= \{\epsilon, 0, 1, 01, 10, 11, 010, 011, 101, 110, 111, \ldots\}$
$L(z) = z^{|\epsilon|} + z^{|0|} + z^{|1|} + z^{|01|} + z^{|10|} + z^{|11|} + \cdots$
$= 1 + 2z + 3z^2 + 5z^3 + 8z^4 + \cdots$
Exercise: Can you guess the value of $L_n = [z^n]L(z)$?
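One way to test a guess for the exercise is to enumerate the strings directly. This is a brute-force sketch in Python, not part of the original slides; the function name `count_no_double_zero` is hypothetical.

```python
from itertools import product


def count_no_double_zero(n):
    """Count binary strings of length n with no two consecutive 0's."""
    return sum(1 for w in product("01", repeat=n) if "00" not in "".join(w))


# First coefficients of L(z), to compare against a conjectured formula.
print([count_no_double_zero(n) for n in range(6)])  # [1, 2, 3, 5, 8, 13]
```

The enumeration visits all $2^n$ strings, so it is only meant for small n.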
22. Counting generating functions
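Definition
Let $a_n = \#\mathcal{A}_n$ = the number of objects of size n in $\mathcal{A}$. Then the formal power series
$\hat{A}(z) = \sum_{n \ge 0} a_n \frac{z^n}{n!} = \sum_{\alpha \in \mathcal{A}} \frac{z^{|\alpha|}}{|\alpha|!}$
is the exponential generating function of the class $\mathcal{A}$.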
23. Counting generating functions
Exponential generating functions (EGFs) are used to enumerate
labelled classes.
Example
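$\mathcal{C}$ = circular permutations
$= \{\epsilon, 1, 12, 123, 132, 1234, 1243, 1324, 1342, 1423, 1432, 12345, \ldots\}$
$\hat{C}(z) = \frac{1}{0!} + \frac{z}{1!} + \frac{z^2}{2!} + 2\frac{z^3}{3!} + 6\frac{z^4}{4!} + \cdots$
$c_n = n! \, [z^n]\hat{C}(z) = (n-1)!, \quad n > 0$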
24. Disjoint union
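Let $\mathcal{C} = \mathcal{A} + \mathcal{B}$, the disjoint union of the unlabelled classes $\mathcal{A}$ and $\mathcal{B}$ ($\mathcal{A} \cap \mathcal{B} = \emptyset$). Then
$C(z) = A(z) + B(z)$
And
$c_n = [z^n]C(z) = [z^n]A(z) + [z^n]B(z) = a_n + b_n$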
25. Cartesian product
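Let $\mathcal{C} = \mathcal{A} \times \mathcal{B}$, the Cartesian product of the unlabelled classes $\mathcal{A}$ and $\mathcal{B}$. The size of $(\alpha, \beta) \in \mathcal{C}$, where $\alpha \in \mathcal{A}$ and $\beta \in \mathcal{B}$, is the sum of sizes: $|(\alpha, \beta)| = |\alpha| + |\beta|$.
Then
$C(z) = A(z) \cdot B(z)$
Proof.
$C(z) = \sum_{\gamma \in \mathcal{C}} z^{|\gamma|} = \sum_{(\alpha, \beta) \in \mathcal{A} \times \mathcal{B}} z^{|\alpha| + |\beta|} = \sum_{\alpha \in \mathcal{A}} \sum_{\beta \in \mathcal{B}} z^{|\alpha|} \cdot z^{|\beta|}$
$= \left( \sum_{\alpha \in \mathcal{A}} z^{|\alpha|} \right) \cdot \left( \sum_{\beta \in \mathcal{B}} z^{|\beta|} \right) = A(z) \cdot B(z)$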
26. Cartesian product
The nth coefficient of the OGF for a Cartesian product is the convolution of the coefficients $\{a_n\}$ and $\{b_n\}$:
$c_n = [z^n]C(z) = [z^n]A(z) \cdot B(z) = \sum_{k=0}^{n} a_k \, b_{n-k}$
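The convolution rule is easy to try on truncated coefficient lists. A minimal Python sketch follows; the helper name `ogf_product` and the choice of example (pairs of binary strings) are illustrative assumptions, not taken from the slides.

```python
def ogf_product(a, b):
    """Coefficients of A(z) * B(z), truncated to the shorter input length.
    a[n] and b[n] are the numbers of objects of size n in each class."""
    n_terms = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(n_terms)]


# Binary strings: a_n = 2^n. Pairs of binary strings of total length n
# number (n + 1) * 2^n, since there are n + 1 ways to split the length.
binary = [1, 2, 4, 8]
print(ogf_product(binary, binary))  # [1, 4, 12, 32]
```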
27. Sequences
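Let $\mathcal{A}$ be a class without any empty object ($\mathcal{A}_0 = \emptyset$). The class $\mathcal{C} = \mathrm{Seq}(\mathcal{A})$ denotes the class of sequences of $\mathcal{A}$'s.
$\mathcal{C} = \{(\alpha_1, \ldots, \alpha_k) \mid k \ge 0, \alpha_i \in \mathcal{A}\}$
$= \{\epsilon\} + \mathcal{A} + (\mathcal{A} \times \mathcal{A}) + (\mathcal{A} \times \mathcal{A} \times \mathcal{A}) + \cdots = \{\epsilon\} + \mathcal{A} \times \mathcal{C}$
Then
$C(z) = \frac{1}{1 - A(z)}$
Proof.
$C(z) = 1 + A(z) + A^2(z) + A^3(z) + \cdots = 1 + A(z) \cdot C(z)$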
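The relation in the proof, $C(z) = 1 + A(z) C(z)$, translates into the recurrence $c_0 = 1$, $c_n = \sum_{k=1}^{n} a_k c_{n-k}$. The Python sketch below extracts coefficients this way; the function name `seq_coefficients` and the example class (sequences of blocks of size 1 or 2) are assumptions made for illustration.

```python
def seq_coefficients(a, n_terms):
    """First n_terms coefficients of C(z) = 1 / (1 - A(z)).
    a[k] is the number of objects of size k; a[0] must be 0."""
    if a and a[0] != 0:
        raise ValueError("Seq(A) requires a class with no object of size 0")
    c = [0] * n_terms
    if n_terms > 0:
        c[0] = 1  # the empty sequence
    for n in range(1, n_terms):
        # First component has size k, the rest is a sequence of size n - k.
        c[n] = sum(a[k] * c[n - k] for k in range(1, min(n, len(a) - 1) + 1))
    return c


# A(z) = z + z^2: sequences of blocks of size 1 or 2.
print(seq_coefficients([0, 1, 1], 6))  # [1, 1, 2, 3, 5, 8]
# A(z) = 2z: binary strings, so c_n = 2^n.
print(seq_coefficients([0, 2], 4))     # [1, 2, 4, 8]
```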
28. Labelled objects
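Disjoint unions of labelled classes are defined as for unlabelled classes and $\hat{C}(z) = \hat{A}(z) + \hat{B}(z)$, for $\mathcal{C} = \mathcal{A} + \mathcal{B}$. Also, $c_n = a_n + b_n$.
To define labelled products, we must take into account that for each pair $(\alpha, \beta)$ where $|\alpha| = k$ and $|\alpha| + |\beta| = n$, we construct $\binom{n}{k}$ distinct pairs by consistently relabelling the atoms of $\alpha$ and $\beta$:
$\alpha = (2, 1, 4, 3); \quad \beta = (1, 3, 2)$
$\alpha \times \beta = \{(2, 1, 4, 3, 5, 7, 6), (2, 1, 5, 3, 4, 7, 6), \ldots, (5, 4, 7, 6, 1, 3, 2)\}$
$\#(\alpha \times \beta) = \binom{7}{4} = 35$
The size of an element in $\alpha \times \beta$ is $|\alpha| + |\beta|$.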
29. Labelled products
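For a class $\mathcal{C}$ that is the labelled product of two labelled classes $\mathcal{A}$ and $\mathcal{B}$
$\mathcal{C} = \mathcal{A} \times \mathcal{B} = \bigcup_{\alpha \in \mathcal{A}, \, \beta \in \mathcal{B}} \alpha \times \beta$
the following relation holds for the corresponding EGFs
$\hat{C}(z) = \sum_{\gamma \in \mathcal{C}} \frac{z^{|\gamma|}}{|\gamma|!} = \sum_{\alpha \in \mathcal{A}} \sum_{\beta \in \mathcal{B}} \binom{|\alpha| + |\beta|}{|\alpha|} \frac{z^{|\alpha| + |\beta|}}{(|\alpha| + |\beta|)!}$
$= \sum_{\alpha \in \mathcal{A}} \sum_{\beta \in \mathcal{B}} \frac{1}{|\alpha|! \, |\beta|!} z^{|\alpha| + |\beta|} = \left( \sum_{\alpha \in \mathcal{A}} \frac{z^{|\alpha|}}{|\alpha|!} \right) \cdot \left( \sum_{\beta \in \mathcal{B}} \frac{z^{|\beta|}}{|\beta|!} \right)$
$= \hat{A}(z) \cdot \hat{B}(z)$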
30. Labelled products
The nth coefficient of $\hat{C}(z) = \hat{A}(z) \cdot \hat{B}(z)$ is also a convolution
$c_n = n! \, [z^n]\hat{C}(z) = \sum_{k=0}^{n} \binom{n}{k} a_k \, b_{n-k}$
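The binomial convolution can be checked on small counting sequences. The following Python sketch uses a hypothetical helper `egf_product`; the example (a labelled product of two permutations) is an illustration chosen here, not one from the slides.

```python
from math import comb


def egf_product(a, b):
    """Counting sequence of a labelled product: c_n = sum_k C(n, k) a_k b_{n-k}.
    a[n] and b[n] count labelled objects of size n; output is truncated
    to the shorter input length."""
    n_terms = min(len(a), len(b))
    return [
        sum(comb(n, k) * a[k] * b[n - k] for k in range(n + 1))
        for n in range(n_terms)
    ]


# Permutations: a_n = n!. An ordered pair of permutations on n labels in total
# is counted by sum_k C(n, k) k! (n - k)! = (n + 1)!.
perms = [1, 1, 2, 6]
print(egf_product(perms, perms))  # [1, 2, 6, 24]
```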
31. Sequences
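Sequences of labelled objects are defined as in the case of unlabelled objects. The construction $\mathcal{C} = \mathrm{Seq}(\mathcal{A})$ is well defined if $\mathcal{A}_0 = \emptyset$.
If $\mathcal{C} = \mathrm{Seq}(\mathcal{A}) = \{\epsilon\} + \mathcal{A} \times \mathcal{C}$ then
$\hat{C}(z) = \frac{1}{1 - \hat{A}(z)}$
Example
Permutations are labelled sequences of atoms, $\mathcal{P} = \mathrm{Seq}(\mathcal{Z})$. Hence,
$\hat{P}(z) = \frac{1}{1 - z} = \sum_{n \ge 0} z^n$
$n! \, [z^n]\hat{P}(z) = n!$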
32. A dictionary of admissible unlabelled operators
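Class | OGF | Name
$\epsilon$ | $1$ | Epsilon
$\mathcal{Z}$ | $z$ | Atomic
$\mathcal{A} + \mathcal{B}$ | $A(z) + B(z)$ | Disjoint union
$\mathcal{A} \times \mathcal{B}$ | $A(z) \cdot B(z)$ | Product
$\mathrm{Seq}(\mathcal{A})$ | $\frac{1}{1 - A(z)}$ | Sequence
$\Theta\mathcal{A}$ | $\Theta A(z) = z A'(z)$ | Marking
$\mathrm{MSet}(\mathcal{A})$ | $\exp\left( \sum_{k \ge 1} A(z^k)/k \right)$ | Multiset
$\mathrm{PSet}(\mathcal{A})$ | $\exp\left( \sum_{k \ge 1} (-1)^{k-1} A(z^k)/k \right)$ | Powerset
$\mathrm{Cycle}(\mathcal{A})$ | $\sum_{k \ge 1} \frac{\phi(k)}{k} \ln \frac{1}{1 - A(z^k)}$ | Cycle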
33. A dictionary of admissible labelled operators
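Class | EGF | Name
$\epsilon$ | $1$ | Epsilon
$\mathcal{Z}$ | $z$ | Atomic
$\mathcal{A} + \mathcal{B}$ | $\hat{A}(z) + \hat{B}(z)$ | Disjoint union
$\mathcal{A} \times \mathcal{B}$ | $\hat{A}(z) \cdot \hat{B}(z)$ | Product
$\mathrm{Seq}(\mathcal{A})$ | $\frac{1}{1 - \hat{A}(z)}$ | Sequence
$\Theta\mathcal{A}$ | $\Theta\hat{A}(z) = z \hat{A}'(z)$ | Marking
$\mathrm{Set}(\mathcal{A})$ | $\exp(\hat{A}(z))$ | Set
$\mathrm{Cycle}(\mathcal{A})$ | $\ln \frac{1}{1 - \hat{A}(z)}$ | Cycle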
34. Bivariate generating functions
We often need to study some characteristic of combinatorial structures, e.g., the number of left-to-right maxima in a permutation, the height of a rooted tree, the number of complex components in a graph, etc.
Suppose $X : \mathcal{A} \to \mathbb{N}$ is a characteristic under study. Let
$a_{n,k} = \#\{\alpha \in \mathcal{A} \mid |\alpha| = n, X(\alpha) = k\}$
We can view the restriction $X_n : \mathcal{A}_n \to \mathbb{N}$ as a random variable. Then under the usual uniform model
$P[X_n = k] = \frac{a_{n,k}}{a_n}$
36. Bivariate generating functions
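We can also define
$B(z, u) = \sum_{n, k \ge 0} P[X_n = k] \, z^n u^k = \sum_{\alpha \in \mathcal{A}} P[\alpha] \, z^{|\alpha|} u^{X(\alpha)}$
and thus $B(z, u)$ is a generating function whose coefficient of $z^n$ is the probability generating function of the r.v. $X_n$
$B(z, u) = \sum_{n \ge 0} P_n(u) z^n$
$P_n(u) = [z^n]B(z, u) = \sum_{k \ge 0} P[X_n = k] \, u^k$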
37. Bivariate generating functions
Proposition
If $P(u)$ is the probability generating function of a random variable $X$ then
$P(1) = 1$,
$P'(1) = E[X]$,
$P''(1) = E[X^{\underline{2}}] = E[X(X - 1)]$,
$V[X] = P''(1) + P'(1) - (P'(1))^2$
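As a small worked instance of the proposition, the mean and variance of a fair six-sided die can be computed from $P'(1)$ and $P''(1)$. This Python sketch uses exact rationals; the helper name `pgf_moments` and the die example are assumptions added for illustration.

```python
from fractions import Fraction


def pgf_moments(pmf):
    """pmf[k] = P[X = k]. Return (E[X], V[X]) computed from the derivatives
    of the probability generating function at u = 1."""
    p1 = sum(k * p for k, p in enumerate(pmf))            # P'(1)  = E[X]
    p2 = sum(k * (k - 1) * p for k, p in enumerate(pmf))  # P''(1) = E[X(X-1)]
    return p1, p2 + p1 - p1 ** 2


# Fair die: P[X = k] = 1/6 for k = 1..6, and P[X = 0] = 0.
die = [Fraction(0)] + [Fraction(1, 6)] * 6
mean, variance = pgf_moments(die)
print(mean, variance)  # 7/2 35/12
```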
38. Bivariate generating functions
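We can study the moments of $X_n$ by successive differentiation of $B(z, u)$ (or $A(z, u)$). For instance,
$B(z) = \sum_{n \ge 0} E[X_n] z^n = \left. \frac{\partial B}{\partial u} \right|_{u=1}$
For the rth factorial moments of $X_n$
$B^{(r)}(z) = \sum_{n \ge 0} E[X_n^{\underline{r}}] z^n = \left. \frac{\partial^r B}{\partial u^r} \right|_{u=1}$
$X_n^{\underline{r}} = X_n (X_n - 1) \cdots (X_n - r + 1)$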
39. The number of left-to-right maxima in a permutation
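Consider the following specification for permutations
$\mathcal{P} = \{\emptyset\} + \mathcal{P} \times \mathcal{Z}$
The BGF for the probability that a random permutation of size n has k left-to-right maxima is
$M(z, u) = \sum_{\sigma \in \mathcal{P}} \frac{z^{|\sigma|}}{|\sigma|!} u^{X(\sigma)}$,
where $X(\sigma)$ = # of left-to-right maxima in $\sigma$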
40. The number of left-to-right maxima in a permutation
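With the recursive decomposition of permutations, and since the last element of a permutation of size n is a left-to-right maximum iff its label is n,
$M(z, u) = 1 + \sum_{\sigma \in \mathcal{P}} \sum_{1 \le j \le |\sigma| + 1} \frac{z^{|\sigma| + 1}}{(|\sigma| + 1)!} u^{X(\sigma) + [\![ j = |\sigma| + 1 ]\!]}$
$[\![ P ]\!] = 1$ if $P$ is true, $[\![ P ]\!] = 0$ otherwise. The leading 1 accounts for the empty permutation.
41. The number of left-to-right maxima in a permutation
$M(z, u) = 1 + \sum_{\sigma \in \mathcal{P}} \frac{z^{|\sigma| + 1}}{(|\sigma| + 1)!} u^{X(\sigma)} \sum_{1 \le j \le |\sigma| + 1} u^{[\![ j = |\sigma| + 1 ]\!]}$
$= 1 + \sum_{\sigma \in \mathcal{P}} \frac{z^{|\sigma| + 1}}{(|\sigma| + 1)!} u^{X(\sigma)} (|\sigma| + u)$
Taking derivatives w.r.t. z
$\frac{\partial}{\partial z} M = \sum_{\sigma \in \mathcal{P}} \frac{z^{|\sigma|}}{|\sigma|!} u^{X(\sigma)} (|\sigma| + u) = z \frac{\partial}{\partial z} M + u M$
Hence,
$(1 - z) \frac{\partial}{\partial z} M(z, u) - u M(z, u) = 0$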
42. The number of left-to-right maxima in a permutation
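Solving, since $M(0, u) = 1$
$M(z, u) = \left( \frac{1}{1 - z} \right)^u = \sum_{n, k \ge 0} {n \brack k} \frac{z^n}{n!} u^k$
where ${n \brack k}$ denote the (signless) Stirling numbers of the first kind, also called Stirling cycle numbers.
Taking the derivative w.r.t. u and setting u = 1
$m(z) = \left. \frac{\partial}{\partial u} M(z, u) \right|_{u=1} = \frac{1}{1 - z} \ln \frac{1}{1 - z}$
Thus the average number of left-to-right maxima in a random permutation of size n is
$[z^n] m(z) = E[X_n] = H_n = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} = \ln n + \gamma + O(1/n)$
since
$\frac{1}{1 - z} \ln \frac{1}{1 - z} = \sum_{\ell \ge 0} z^\ell \sum_{m > 0} \frac{z^m}{m} = \sum_{n \ge 0} z^n \sum_{k=1}^{n} \frac{1}{k}$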
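The harmonic-number result can be confirmed by exhaustive enumeration for small n. The Python sketch below is an independent check, not the derivation used in the talk; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def left_to_right_maxima(perm):
    """Number of entries larger than every entry before them."""
    count, best = 0, 0
    for x in perm:
        if x > best:
            count, best = count + 1, x
    return count


def average_maxima(n):
    """Exact mean number of left-to-right maxima over all n! permutations."""
    perms = list(permutations(range(1, n + 1)))
    return Fraction(sum(left_to_right_maxima(p) for p in perms), len(perms))


def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum((Fraction(1, k) for k in range(1, n + 1)), Fraction(0))


for n in range(1, 7):
    assert average_maxima(n) == harmonic(n)
print(average_maxima(4))  # 25/12
```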
44. Rank-based hiring
The recursive decomposition of permutations
$\mathcal{P} = \epsilon + \mathcal{P} \times \mathcal{Z}$
is the natural choice for the analysis of rank-based strategies, with $\times$ denoting the labelled product.
For each $\sigma$ in $\mathcal{P}$, $\{\sigma\} \times \mathcal{Z}$ is the set of $|\sigma| + 1$ permutations
$\{\sigma \circ 1, \sigma \circ 2, \ldots, \sigma \circ (n + 1)\}, \quad n = |\sigma|$
$\sigma \circ j$ denotes the permutation one gets after relabelling $j, j + 1, \ldots, n = |\sigma|$ in $\sigma$ to $j + 1, j + 2, \ldots, n + 1$ and appending $j$ at the end
Example
$32451 \circ 3 = 425613$
$32451 \circ 2 = 435612$
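The relabel-and-append operation of this slide is short to state in code. A minimal Python sketch, with the hypothetical name `extend`, reproduces both examples.

```python
def extend(perm, j):
    """Relabel the entries j, j+1, ..., n of perm to j+1, ..., n+1 and
    append j at the end, giving a permutation of size n + 1."""
    n = len(perm)
    if not 1 <= j <= n + 1:
        raise ValueError("j must lie between 1 and len(perm) + 1")
    return [x + 1 if x >= j else x for x in perm] + [j]


print(extend([3, 2, 4, 5, 1], 3))  # [4, 2, 5, 6, 1, 3]
print(extend([3, 2, 4, 5, 1], 2))  # [4, 3, 5, 6, 1, 2]
```

Applying `extend` with every j from 1 to n + 1 yields the n + 1 distinct ways of growing a permutation by one candidate.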
45. Rank-based hiring
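$\mathcal{H}(\sigma)$ = the set of candidates hired in permutation $\sigma$
$h(\sigma) = \#\mathcal{H}(\sigma)$
Let $X_j(\sigma) = 1$ if a candidate with score j is hired after $\sigma$ and $X_j(\sigma) = 0$ otherwise.
$h(\sigma \circ j) = h(\sigma) + X_j(\sigma)$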
46. Rank-based hiring
Theorem
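Let $H(z, u) = \sum_{\sigma \in \mathcal{P}} \frac{z^{|\sigma|}}{|\sigma|!} u^{h(\sigma)}$.
Then
$(1 - z) \frac{\partial}{\partial z} H(z, u) - H(z, u) = (u - 1) \sum_{\sigma \in \mathcal{P}} \mathcal{X}(\sigma) \frac{z^{|\sigma|}}{|\sigma|!} u^{h(\sigma)}$,
where $\mathcal{X}(\sigma)$ is the number of j such that $X_j(\sigma) = 1$.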
47. Rank-based hiring
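We can write $h(\sigma) = 0$ if $\sigma$ is the empty permutation and $h(\sigma \circ j) = h(\sigma) + X_j(\sigma)$.
$H(z, u) = \sum_{\sigma \in \mathcal{P}} \frac{z^{|\sigma|}}{|\sigma|!} u^{h(\sigma)}$
$= 1 + \sum_{n > 0} \sum_{\sigma \in \mathcal{P}_n} \frac{z^{|\sigma|}}{|\sigma|!} u^{h(\sigma)}$
$= 1 + \sum_{n > 0} \sum_{1 \le j \le n} \sum_{\sigma \in \mathcal{P}_{n-1}} \frac{z^{|\sigma \circ j|}}{|\sigma \circ j|!} u^{h(\sigma \circ j)}$
$= 1 + \sum_{n > 0} \sum_{1 \le j \le n} \sum_{\sigma \in \mathcal{P}_{n-1}} \frac{z^{|\sigma| + 1}}{(|\sigma| + 1)!} u^{h(\sigma) + X_j(\sigma)}$
$= 1 + \sum_{n > 0} \sum_{\sigma \in \mathcal{P}_{n-1}} \frac{z^{|\sigma| + 1}}{(|\sigma| + 1)!} u^{h(\sigma)} \sum_{1 \le j \le n} u^{X_j(\sigma)}.$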
48. Rank-based hiring
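Since $X_j(\sigma)$ is either 0 or 1 for all j and all $\sigma$, we have
$\sum_{1 \le j \le n} u^{X_j(\sigma)} = (|\sigma| + 1 - \mathcal{X}(\sigma)) + u \mathcal{X}(\sigma)$,
where $\mathcal{X}(\sigma) = \sum_{1 \le j \le |\sigma| + 1} X_j(\sigma)$.
$H(z, u) = 1 + \sum_{n > 0} \sum_{\sigma \in \mathcal{P}_{n-1}} \frac{z^{|\sigma| + 1}}{(|\sigma| + 1)!} u^{h(\sigma)} \left( (|\sigma| + 1 - \mathcal{X}(\sigma)) + u \mathcal{X}(\sigma) \right).$
The theorem follows after differentiation and a few additional algebraic manipulations.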
49. Pragmatic strategies
A hiring strategy is pragmatic if and only if
Whenever it would hire a candidate with score j, it would also hire a candidate with a larger score
$X_j(\sigma) = 1 \implies X_{j'}(\sigma) = 1$ for all $j' \ge j$
The number of scores it would potentially hire increases by at most one, and only if the candidate in the previous step was hired
$\mathcal{X}(\sigma \circ j) \le \mathcal{X}(\sigma) + X_j(\sigma)$
50. Pragmatic strategies
The first condition is very natural and reasonable; the second one is technically necessary for several results we discuss later
Above the best, above the mth best, above the P% best, ... are all pragmatic
51. Pragmatic strategies
Theorem
For any pragmatic hiring strategy and any permutation $\sigma$, the $\mathcal{X}(\sigma)$ best candidates of $\sigma$ have been hired (and possibly others).
53. Pragmatic strategies
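Let $r_n$ denote the rank of the last hired candidate in a random permutation;
$g_n = 1 - \frac{r_n}{n}$
is called the gap.
Theorem
For any pragmatic hiring strategy,
$E[g_n] = \frac{1}{2n} (E[\mathcal{X}_n] - 1)$,
where $E[\mathcal{X}_n] = [z^n] \sum_{\sigma \in \mathcal{P}} \mathcal{X}(\sigma) \, z^{|\sigma|} / |\sigma|!$.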
54. Hiring above the maximum
Candidate i is hired if and only if her score is above the score of the best currently hired candidate.
$\mathcal{X}(\sigma) = 1$
$\mathcal{H}(\sigma) = \{i : i \text{ is a left-to-right maximum}\}$
$E[h_n] = [z^n] \left. \frac{\partial H}{\partial u} \right|_{u=1} = \ln n + O(1)$
Variance of $h_n$ is also $\ln n + O(1)$ and after proper normalization $h_n^*$ converges to $N(0, 1)$
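Hiring above the best is simple to simulate directly. The Python sketch below is an illustration with hypothetical function names; it also computes the exact mean number of hires for small n by brute force, which coincides with the mean number of left-to-right maxima.

```python
from fractions import Fraction
from itertools import permutations


def hire_above_best(scores):
    """Return the 1-based positions of the candidates hired when a candidate
    is hired iff her score beats every previously hired score."""
    hired, best = [], None
    for i, q in enumerate(scores, start=1):
        if best is None or q > best:  # the first candidate is always hired
            hired.append(i)
            best = q
    return hired


def exact_mean_hires(n):
    """Average number of hires over all n! permutations (small n only)."""
    perms = list(permutations(range(1, n + 1)))
    return Fraction(sum(len(hire_above_best(p)) for p in perms), len(perms))


print(hire_above_best([6, 2, 8, 1, 7, 4, 3, 5]))  # [1, 3]
print(exact_mean_hires(5))                        # 137/60
```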
55. Hiring above the mth best
Candidate i is hired if and only if her score is above the score of the mth best currently hired candidate.
$\mathcal{X}(\sigma) = |\sigma| + 1$ if $|\sigma| < m$; $\mathcal{X}(\sigma) = m$ if $|\sigma| \ge m$
$E[h_n] = [z^n] \left. \frac{\partial H}{\partial u} \right|_{u=1} = m \ln n + O(1)$ for fixed m
Variance of $h_n$ is also $m \ln n + O(1)$ and after proper normalization $h_n^*$ converges to $N(0, 1)$
The case of arbitrary m can be studied by introducing $H(z, u, v) = \sum_{m \ge 1} v^m H^{(m)}(z, u)$, where $H^{(m)}(z, u)$ is the GF that corresponds to a given particular m.
We can show that $E[h_n] = m (H_n - H_m + 1) \sim m \ln(n/m) + m + O(1)$, with $H_n$ the nth harmonic number
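The closed form $m(H_n - H_m + 1)$ can be compared with an exhaustive average for small n. This Python sketch is a check under one assumption stated in the code: while fewer than m candidates have been hired, every candidate is hired. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def hires_above_mth_best(scores, m):
    """Number of hires when a candidate is hired iff her score beats the
    m-th best hired score. Assumption: with fewer than m hires so far,
    the candidate is always hired."""
    hired = []
    for q in scores:
        if len(hired) < m or q > sorted(hired, reverse=True)[m - 1]:
            hired.append(q)
    return len(hired)


def exact_mean_hires(n, m):
    """Average number of hires over all n! permutations (small n only)."""
    perms = list(permutations(range(1, n + 1)))
    return Fraction(sum(hires_above_mth_best(p, m) for p in perms), len(perms))


def harmonic(n):
    return sum((Fraction(1, k) for k in range(1, n + 1)), Fraction(0))


def closed_form(n, m):
    """m (H_n - H_m + 1), valid for n >= m."""
    return m * (harmonic(n) - harmonic(m) + 1)


print(exact_mean_hires(6, 2), closed_form(6, 2))  # 39/10 39/10
```

The agreement reflects that, after the first m candidates, the ith candidate is hired exactly when she is among the m best of the first i, which happens with probability m/i.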
56. Hiring above the median
Candidate i is hired if and only if her score is above the score of the median of the scores of currently hired candidates.
$\mathcal{X}(\sigma) = \lceil (h(\sigma) + 1)/2 \rceil$
$\sqrt{\frac{n}{\pi}} \, (1 + O(n^{-1})) \le E[h_n] \le 3 \sqrt{\frac{n}{\pi}} \, (1 + O(n^{-1}))$
This result follows easily by using the previous theorem with $\mathcal{X}_L(\sigma) = (h(\sigma) + 1)/2$ and $\mathcal{X}_U(\sigma) = (h(\sigma) + 3)/2$ to lower and upper bound
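A simulation of hiring above the median might look like the following Python sketch. It is not the code behind the talk's experiments. One assumption is labelled in the code: with h hired candidates, the threshold is the ⌈(h + 1)/2⌉-th best hired score (the lower median), chosen so that the number of hireable ranks matches the formula on this slide. The function names and the seed are hypothetical.

```python
import bisect
import random


def hire_above_median(scores):
    """Return the hired scores in hiring order. Assumption: with h hired
    candidates the threshold is the ceil((h + 1) / 2)-th best hired score,
    i.e. index (h - 1) // 2 of the hired scores sorted in ascending order."""
    hired_sorted = []  # hired scores, ascending
    hired_order = []   # hired scores, in order of arrival
    for q in scores:
        h = len(hired_sorted)
        if h == 0 or q > hired_sorted[(h - 1) // 2]:
            bisect.insort(hired_sorted, q)
            hired_order.append(q)
    return hired_order


def mean_hires(n, trials, seed=0):
    """Monte Carlo estimate of E[h_n] over random permutations of size n."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        scores = list(range(1, n + 1))
        rng.shuffle(scores)
        total += len(hire_above_median(scores))
    return total / trials


print(hire_above_median([6, 2, 8, 1, 7, 4, 3, 5]))  # [6, 8, 7]
```

Calling `mean_hires(n, 100)` for several values of n gives estimates that can be set against the square-root growth stated in the bounds.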
57. Hiring above the median
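$n \in \{1000, \ldots, 10000\}$, M = 100 random permutations for each n
[Plot: number of hires (vertical axis, ticks from 25 to 175) against n (horizontal axis, ticks from 2 to 10, in units of $10^3$)]
In red: lower bound (using $\mathcal{X}_L$); in green: upper bound (using $\mathcal{X}_U$); in yellow: simulation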
58. Final remarks
Other quantities, e.g. the time of the last hiring, can also be analyzed using techniques from analytic combinatorics
We have also analyzed hiring above the P% best candidate with the same machinery; actually, we have explicit solutions for $H(z, u)$
We have extensions of these results to cope with randomized hiring strategies
Many variants of the problem are interesting and natural; for instance, including firing policies