Evolving Universal Hash Function using Genetic Algorithms
Slides presented at the International Conference on Future Computer and Communication, 2009, in Kuala Lumpur, Malaysia. They cover the early work done in the project "Evolving Universal Hash Functions using Genetic Algorithms". A revised version of this project was presented at GECCO 2009.

Presentation Transcript

  • Evolving Universal Hash Functions Using Genetic Algorithms Ramprasad Joshi, Mustafa Safdari 2009 INTERNATIONAL CONFERENCE ON FUTURE COMPUTER AND COMMUNICATION BIRLA INSTITUTE OF TECHNOLOGY AND SCIENCE, PILANI GOA CAMPUS
  • Outline
    • Introduction
    • Implementation of Genetic Algorithms
    • Simulation and Result
    • Conclusion and future work
  • Introduction
    • Universal Hash Functions
    • Selecting h randomly
  • Universal Hash Functions
    • Mapping integers in the range [0,M-1] to [0,N-1]
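    • A set H of hash functions is universal if, for any two distinct keys j and k and a hash function h chosen at random from H, Pr[h(j) = h(k)] ≤ 1/N
    • The expected number of collisions for any key is then at most n/N, where n is the number of keys stored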
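The n/N figure follows from linearity of expectation. As a brief sketch, let S be the set of n stored keys and let X_{jk} be the indicator that keys j and k collide (S and X_{jk} are notation introduced here for illustration, not taken from the slides). Under a universal family each pair collides with probability at most 1/N, and the sum has fewer than n terms:

```latex
\mathbb{E}\Bigl[\sum_{k \in S,\ k \neq j} X_{jk}\Bigr]
  = \sum_{k \in S,\ k \neq j} \Pr[h(j) = h(k)]
  \le \frac{n}{N}
```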
  • Selecting h randomly
    • One such type of hash function: h(k) = ((a·k + b) mod p) mod N
    • p is a prime number,
    • a, b are any two random integers,
    • How do we select a, b, p ?
    • Minimize collisions as much as possible
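A minimal sketch of this kind of hash function and of the quantity being minimized, assuming the standard family h(k) = ((a·k + b) mod p) mod N. The function names, the ranges 0 < a < p and 0 ≤ b < p, and the definition of a collision (a key landing in an already occupied bucket) are assumptions made for illustration.

```python
import random


def make_hash(a, b, p, n_buckets):
    """Return h(k) = ((a*k + b) mod p) mod n_buckets."""
    def h(key):
        return ((a * key + b) % p) % n_buckets
    return h


def count_collisions(h, keys):
    """Count keys that land in a bucket already holding another key."""
    occupied = set()
    collisions = 0
    for key in keys:
        bucket = h(key)
        if bucket in occupied:
            collisions += 1
        else:
            occupied.add(bucket)
    return collisions


def random_hash(p, n_buckets, rng=random):
    """Select h at random; the ranges 0 < a < p and 0 <= b < p are assumed."""
    a = rng.randrange(1, p)
    b = rng.randrange(0, p)
    return make_hash(a, b, p, n_buckets)
```

For instance, with a = 3, b = 2, p = 11 and N = 10, the keys 0 to 9 fall into ten different buckets, so `count_collisions(make_hash(3, 2, 11, 10), range(10))` gives 0; adding the key 10 produces one collision.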
  • Implementation of GA
    • Chromosome, Fitness Function, Crossover, Mutation
    • p_values, p_Array
  • Elements of the GA
    • Chromosome:
    • Fitness function:
    • Crossover types: single point, 2 point, midway and random
    • Mutation: single point, multi point
    • Roulette Wheel Selection
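The chromosome encoding and fitness formula are not preserved in this transcript, so the following is only a sketch of the named operators under stated assumptions: a chromosome is taken to be a and b packed into one integer with a hypothetical fixed width of `BITS` bits each, and fitness is taken to be a non-negative number where larger is better.

```python
import random

BITS = 16  # assumed width of each gene; not given in the slides


def encode(a, b):
    """Pack a and b into one integer chromosome (assumed encoding)."""
    return (a << BITS) | b


def decode(chromosome):
    """Recover (a, b) from a packed chromosome."""
    return chromosome >> BITS, chromosome & ((1 << BITS) - 1)


def single_point_crossover(c1, c2, point):
    """Exchange the low `point` bits of two chromosomes."""
    mask = (1 << point) - 1
    child1 = (c1 & ~mask) | (c2 & mask)
    child2 = (c2 & ~mask) | (c1 & mask)
    return child1, child2


def mutate(chromosome, positions):
    """Flip the bit at each listed position (one position = single point)."""
    for pos in positions:
        chromosome ^= 1 << pos
    return chromosome


def roulette_wheel_select(population, fitnesses, rng=random):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    if total <= 0:
        return rng.choice(population)
    pick = rng.random() * total
    running = 0.0
    for individual, fitness in zip(population, fitnesses):
        running += fitness
        if pick < running:
            return individual
    return population[-1]  # guard against floating-point round-off
```

Two-point crossover can be built from two calls to `single_point_crossover` at different points; the midway and random variants presumably differ only in how `point` is chosen.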
  • p_values, p_Array
    • p is any prime number such that M ≤ p < 2M. An array called p_values keeps track of the allowable values of p so that they can be used in the above steps. p_values can be constructed and populated using any sieve algorithm (from primality testing) that finds the prime numbers within a range. Our implementation uses the Sieve of Eratosthenes.
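One way p_values might be populated is sketched below with a plain Sieve of Eratosthenes. The function names are illustrative and are not taken from the authors' implementation.

```python
def primes_in_range(low, high):
    """Return all primes p with low <= p < high (Sieve of Eratosthenes)."""
    if high <= 2:
        return []
    is_prime = [True] * high
    is_prime[0] = is_prime[1] = False
    i = 2
    while i * i < high:
        if is_prime[i]:
            # Cross out every multiple of i, starting at i*i.
            for multiple in range(i * i, high, i):
                is_prime[multiple] = False
        i += 1
    return [n for n in range(max(low, 2), high) if is_prime[n]]


def build_p_values(m):
    """Allowable primes p with M <= p < 2M."""
    return primes_in_range(m, 2 * m)
```

For M = 10 this yields the primes 11, 13, 17 and 19.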
  • p_Array
    • For every chromosome of ( a, b) there is an associated value for p such that
    • To store this information in the chromosome, we create a separate array called p_Array which stores for each chromosome, the index of the prime number present in p_values. For example, if a chromosome in the population has a=9, b=7, p=4, it means that the value of p assigned for this chromosome is the one found in p_values at index 4.
    • Index values of p don’t undergo crossover/mutation. Only a, b do. But after each such operation, a suitable p is found for the new resultant a, b pair if the one associated with the parent chromosome doesn’t satisfy (1).
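The repair step after crossover or mutation could look roughly like the sketch below. Condition (1) is not preserved in this transcript, so the code assumes it requires both a and b to be smaller than p; `repair_p_index` is a hypothetical helper name.

```python
from bisect import bisect_right


def repair_p_index(a, b, p_index, p_values):
    """Return an index into p_values whose prime exceeds both a and b.

    Assumes condition (1) means a < p and b < p, which is a guess.
    p_values must be sorted in increasing order.
    """
    largest = max(a, b)
    if largest < p_values[p_index]:
        return p_index  # the parent's prime is still valid
    # Index of the first prime strictly greater than max(a, b).
    new_index = bisect_right(p_values, largest)
    if new_index == len(p_values):
        raise ValueError("no allowable prime exceeds max(a, b)")
    return new_index
```

With p_values = [11, 13, 17, 19], a child with a = 12, b = 7 that inherited index 0 (p = 11) would be moved to index 1 (p = 13).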
  • Simulations and Results
    • Simulation settings
    • Results
  • Simulation Settings
    • No. of generations = 30
    • Size of populations = 50
    • p_c = 0.8, p_m = 0.01
    • Input set of keys N
      • Uniformly Randomly Generated in (0, 50000)
      • Different sets of size 10, 100, 1000, 10000
      • Taking N as prime gives better results
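A small sketch of how such input sets and a prime bucket count might be produced. The helper names are hypothetical, the half-open range [0, upper) is an assumption about how the interval was sampled, and choosing the next prime at or above n is only one reading of "taking N as prime".

```python
import random


def generate_keys(n_keys, upper=50000, seed=None):
    """Draw n_keys integers uniformly at random from [0, upper)."""
    rng = random.Random(seed)
    return [rng.randrange(upper) for _ in range(n_keys)]


def is_prime(n):
    """Trial-division primality check, adequate for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def next_prime_at_least(n):
    """Smallest prime greater than or equal to n."""
    while not is_prime(n):
        n += 1
    return n
```

For 10, 20 and 100 keys this rule gives 11, 23 and 101 buckets.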
  • Results
    TABLE I. RESULTS OF RUNNING THE ALGORITHM FOR RANDOM INPUT DISTRIBUTIONS
    | Sr. No. | Range of input | Crossover type* | Mutation type* | No. of keys n | No. of buckets N | No. of initial collisions | n_collisions | n_filled | p | a | b |
    | 1 | 0-10 | 1 | 2 | 10 | 10 | 0 | 0 | 10 | 11 | 3 | 2 |
    | 2 | 0-500 | 1 | 2 | 10 | 11 | 1 | 4 | 6 | 701 | 67 | 452 |
    | 3 | 0-600 | 1 | 2 | 20 | 23 | 2 | 2 | 18 | 1013 | 626 | 635 |
    | 4 | 0-100 | 1 | 1 | 100 | 100 | 0 | 0 | 100 | 179 | 109 | 114 |
    | 5 | 0-50000 | 1 | 2 | 100 | 101 | 8 | 21 | 79 | 98869 | 54339 | 35059 |
    | 6 | 0-1000 | 1 | 2 | 500 | 499 | 0 | 1 | 499 | 1823 | 747 | 581 |
    | 7 | 0-50000 | 1 | 2 | 500 | 499 | 37 | 108 | 392 | 69313 | 46631 | 9950 |
    | 8 | | 1 | 2 | 10000 | 10000 | 0 | 0 | 10000 | 14153 | 9347 | 517 |
    | 9 | | 1 | 2 | 10000 | 10000 | 0 | 0 | 10000 | 57203 | 25869 | 37769 |
    | 10 | 0-50000 | 1 | 2 | 10000 | 10000 | 911 | 2397 | 6692 | 79063 | 33068 | 31178 |
    * Indices refer to the crossover and mutation types listed in the previous section.
  • Case 1
    • Two-point mutation gave much better results in fewer generations than single-point mutation or mutation at more than two points. Single Point Random crossover was found to produce much better results.
    • In every case the algorithm converged within 7-8 generations at worst.
    • In some cases, where the range of the input distribution was very large and did not coincide with [0, N-1], the number of collisions was relatively high. However, it dropped sharply when N was taken to be a nearby prime number.
  • Case 2 (Comparative Runs)
    • In the next type of simulation, the algorithm was tested against randomly selecting h.
    • The algorithm performed better than random selection, giving fewer collisions on every input file tested.
    Table 2. Results of Comparative Run 1
    | Input file | n_collisions by random selection | n_collisions by GA-generated function |
    | 1 | 286 | 251 |
    | 2 | 273 | 256 |
    | 3 | 267 | 245 |
    | 4 | 285 | 244 |
    | 5 | 285 | 255 |
    | 6 | 285 | 262 |
    | 7 | 281 | 259 |
    | 8 | 273 | 255 |
    | 9 | 273 | 258 |
    | 10 | 304 | 259 |
    Setting for GA: P = 100, N = 1423, p_c = 0.75 (1), p_m = 0.01 (1)
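A rough end-to-end sketch of a comparison of this kind follows. It is not the authors' implementation and rests on several assumptions, each labeled in the code: p is held fixed rather than tracked per chromosome, fitness is 1/(1 + collisions), crossover and mutation act on the bits of a and b, out-of-range genes are folded back below p, the best individual seen so far is remembered, and P is read as the population size. The random baseline is simply the first member of the random initial population.

```python
import random


def collisions(a, b, p, n_buckets, keys):
    """Keys landing in an occupied bucket under h(k) = ((a*k + b) mod p) mod N."""
    return len(keys) - len({((a * k + b) % p) % n_buckets for k in keys})


def evolve(keys, p, n_buckets, pop_size=100, generations=30,
           p_c=0.75, p_m=0.01, seed=0):
    """Return (best (a, b), its collision count, collisions of one random h)."""
    rng = random.Random(seed)
    bits = p.bit_length()

    def cost(ind):
        return collisions(ind[0], ind[1], p, n_buckets, keys)

    def fold(value, lowest):
        # Assumed repair rule: map an out-of-range gene back into [lowest, p).
        if lowest <= value < p:
            return value
        return lowest + value % (p - lowest)

    population = [(rng.randrange(1, p), rng.randrange(p)) for _ in range(pop_size)]
    baseline = cost(population[0])  # one randomly selected h
    best = min(population, key=cost)
    for _ in range(generations):
        # Assumed fitness: fewer collisions means a larger selection weight.
        weights = [1.0 / (1 + cost(ind)) for ind in population]
        next_gen = []
        while len(next_gen) < pop_size:
            # Roulette wheel selection of two parents.
            (a1, b1), (a2, b2) = rng.choices(population, weights=weights, k=2)
            if rng.random() < p_c:  # single-point crossover on each gene
                mask = (1 << rng.randrange(1, bits)) - 1
                a1, a2 = (a1 & ~mask) | (a2 & mask), (a2 & ~mask) | (a1 & mask)
                b1, b2 = (b1 & ~mask) | (b2 & mask), (b2 & ~mask) | (b1 & mask)
            for a, b in ((a1, b1), (a2, b2)):
                if rng.random() < p_m:  # single-point mutation
                    a ^= 1 << rng.randrange(bits)
                if rng.random() < p_m:
                    b ^= 1 << rng.randrange(bits)
                next_gen.append((fold(a, 1), fold(b, 0)))
        population = next_gen[:pop_size]
        best = min(population + [best], key=cost)  # remember the best so far
    return best, cost(best), baseline
```

Calling `evolve(keys, 69313, 1423)` on a list of integer keys returns the best (a, b) pair found together with its collision count and the baseline count. Because the best individual is tracked from the initial population onward, the returned count can never exceed the baseline; how large the gap is depends on the keys and the seed.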
  • In the End…
    • Conclusion
    • Future Work
    • Acknowledgement
  • Conclusion
    • The proposed algorithm produces an efficient universal hash function for a given distribution of keys, resulting in relatively few collisions.
    • The problem of clustering is avoided by generating the hash function with a metaheuristic, in this case a genetic algorithm.
    • It performs better than random selection of h.
    • The algorithm is well suited to scenarios where the input distribution changes frequently and the hash function must be changed dynamically to rehash the input.
  • Future Work
    • The scope for future work on this algorithm includes:
      • selection of an efficient sieve algorithm
      • an efficient encoding of the chromosome
      • understanding the effect of the various types of crossover and mutation on the result
      • a better design of the fitness function so that the few exceptional cases are also handled
      • testing the algorithm against some standard hash functions
  • Acknowledgment
    • My sincere thanks to Mr. Ramprasad Joshi, my mentor and guide for this project.
    • I also thank my colleague Miss Joanna Mary Oommen for assistance with the paper and presentation.
  • Thank You!
    • Any Questions?