1. Improved File Accesses with Neural Networks
Hemanth Mantri
Makarand Damle
Department of Computer Science
2. Motivation
• Processor and Memory Speeds
– Very High
• Disk access speeds
– Bottleneck
• How to overcome?
– Caching
• Replacement (LRU, LFU)
– Prefetching
• Need prediction
3. Current Approach
• Linux Read Ahead
– Limited to file level
– Sequential / Random
• Application has limited say
– Hints to the kernel
– posix_fadvise(), madvise()
– Pattern hints limited to sequential or random
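A minimal sketch of how an application might pass such a hint, assuming a Linux system where Python exposes `os.posix_fadvise` (the wrapper for posix_fadvise(2)). The helper name `advise_file` is hypothetical and not part of the original slides.

```python
import os

def advise_file(path, sequential=True):
    """Read `path` after hinting the kernel about the expected access pattern.

    Returns the file contents; the hint is advisory and never changes the data.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # posix_fadvise is only exposed on platforms that support it.
        if hasattr(os, "posix_fadvise"):
            advice = os.POSIX_FADV_SEQUENTIAL if sequential else os.POSIX_FADV_RANDOM
            try:
                # offset=0 and len=0 apply the advice to the whole file.
                os.posix_fadvise(fd, 0, 0, advice)
            except OSError:
                pass  # advice is best-effort; fall back to default read-ahead
        chunks = []
        while True:
            chunk = os.read(fd, 4096)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)
```

The application can only choose between the two coarse patterns here, which is the limitation this slide points out.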
4. Types of Predictors
• Access Tree based
– Track the processes along with file accesses
• Hint Based (Applications)
– find, grep, etc.
6. Perceptron Approach
• Neural network inputs
– File IDs of the last 6 observed successors of the requested file
• Outputs:
– Probability of each access type
• Sequential (1, 2, 3, 4, 5)
• Alternating (1, 3, 5, 7, 9)
• Every third (1, 4, 7, 10, 13)
• Random (10, 4, 11, 6, 2)
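The four access types differ only in the stride between consecutive file IDs. As an illustration, here is a rule-based sketch that labels a sequence by its stride. This is not the perceptron from the slides, only a way to make the class definitions concrete; the names `STRIDES` and `classify_by_stride` are hypothetical.

```python
# Stride between consecutive IDs for each deterministic pattern on the slide.
STRIDES = {"sequential": 1, "alternating": 2, "every_third": 3}

def classify_by_stride(ids):
    """Label a sequence of file IDs by its constant stride, else 'random'."""
    diffs = {b - a for a, b in zip(ids, ids[1:])}
    if len(diffs) == 1:
        step = next(iter(diffs))
        for kind, stride in STRIDES.items():
            if stride == step:
                return kind
    # No constant stride of 1, 2 or 3: treat the sequence as random.
    return "random"
```

For example, `classify_by_stride([1, 4, 7, 10, 13])` yields `"every_third"`. A learned predictor would be expected to tolerate noise in the sequence, which this fixed rule does not.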
7. The Neural Network
[Figure: the file IDs of the last 6 accessed files are fed to a 4-layer MLP neural network, which outputs the access-type probabilities P(Seq), P(Alt), P(E3), and P(Rand).]
8. Characteristics
• 4-layer backpropagation NN
• Tunable number of hidden units
• Input unit: Identity function
• Hidden unit: Sigmoid function
• Used FANN Library
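To make the layer structure concrete, here is a sketch of the forward pass in plain Python rather than FANN. It assumes two hidden layers of 8 units each, sigmoid output units, and small random initial weights; none of these details are given on the slides, and training by backpropagation is omitted.

```python
import math
import random

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def init_layer(n_in, n_out, rng):
    """One weight row per output unit; the last entry of each row is the bias."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)] for _ in range(n_out)]

def build_network(n_hidden=8, seed=0):
    """4 layers: 6 inputs -> hidden -> hidden -> 4 outputs (sizes are assumptions)."""
    rng = random.Random(seed)
    sizes = [6, n_hidden, n_hidden, 4]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

def forward(network, inputs):
    """Return the 4 output activations for 6 input file IDs."""
    activations = list(inputs)  # input units apply the identity function
    for layer in network:
        activations = [
            # zip stops before the bias entry, which is added separately
            sigmoid(sum(w * a for w, a in zip(row, activations)) + row[-1])
            for row in layer
        ]
    return activations
```

With untrained weights the four outputs carry no meaning; after training they would be read as scores for the sequential, alternating, every-third, and random classes.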
9. Job of the NN
• Predict the type of access pattern
• Let the file system know of it before every read
• Evaluated on an NN simulator
– Generated 0.3 million traces (inode numbers)
– Training set: 0.3 million
– Testing set: 0.3 million
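A sketch of how such an evaluation could be set up: generate labelled windows of synthetic inode numbers, split them into training and testing sets, and measure the fraction of correct predictions. The window length of 6 follows the network input; the ID range, the 50/50 split, and all function names are assumptions for illustration only.

```python
import random

LABELS = ["sequential", "alternating", "every_third", "random"]

def make_example(label, rng, window=6, max_id=10_000):
    """Return (ids, label), where ids is a window of synthetic inode numbers."""
    if label == "random":
        return [rng.randint(1, max_id) for _ in range(window)], label
    step = LABELS.index(label) + 1  # strides 1, 2, 3 for the first three labels
    start = rng.randint(1, max_id)
    return [start + i * step for i in range(window)], label

def make_dataset(n, seed=0):
    """Generate n labelled windows with labels drawn uniformly at random."""
    rng = random.Random(seed)
    return [make_example(rng.choice(LABELS), rng) for _ in range(n)]

def split(dataset, train_fraction=0.5):
    """Split into (training set, testing set) at the given fraction."""
    cut = int(len(dataset) * train_fraction)
    return dataset[:cut], dataset[cut:]

def accuracy(predict, examples):
    """Fraction of examples for which predict(ids) equals the true label."""
    if not examples:
        return 0.0
    correct = sum(1 for ids, label in examples if predict(ids) == label)
    return correct / len(examples)
```

Any predictor that maps a list of IDs to one of the four labels can be passed as `predict`, so the same harness could score a trained network or a simple baseline.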
10. Sample Application
• 4 Steps
– Read files in a directory
– Update the access sequence
– Invoke the NN predictor
– Hint the kernel about the access pattern
• Informed prefetching
– Reduces unnecessary reads
– Decreased read response time
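The four steps can be sketched as a single loop. The predictor and the hinting mechanism are passed in as caller-supplied functions because the slides do not show either interface; `read_directory`, `predict`, and `hint` are hypothetical names, and the inode number stands in for the file ID.

```python
import os
from collections import deque

def read_directory(path, predict, hint, history_len=6):
    """Read every regular file in `path`, hinting the access type before each read.

    predict(history) returns an access-type label; hint(fd, label) passes it on.
    Both are stand-ins supplied by the caller. Returns the total bytes read.
    """
    history = deque(maxlen=history_len)
    total = 0
    for entry in sorted(os.scandir(path), key=lambda e: e.name):
        if not entry.is_file():
            continue
        history.append(entry.inode())          # update the access sequence
        label = predict(list(history))         # invoke the predictor
        fd = os.open(entry.path, os.O_RDONLY)
        try:
            hint(fd, label)                    # hint the kernel
            while chunk := os.read(fd, 65536): # read the file
                total += len(chunk)
        finally:
            os.close(fd)
    return total
```

On a stock Linux kernel, `hint` could map a sequential prediction to `POSIX_FADV_SEQUENTIAL` and everything else to `POSIX_FADV_RANDOM`; conveying the alternating and every-third patterns would need kernel support that the standard interface does not offer.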
15. Work In Progress
• Benchmarking
– Evaluation metrics such as EMR
– Different workloads
– Fine-grained profiling
• Learn block level access patterns
– Per file
• Bigger data set