This document discusses using Q-learning to optimize neural network hyperparameters. It first introduces neural networks and two hyperparameters of interest: the regularization constant and the hidden-layer size. Q-learning maintains a Q-matrix and iteratively selects hyperparameter values, using a reward signal derived from bias, variance, and F-score. In tests on mushroom classification datasets, Q-learning reduced computation time compared with random hyperparameter selection while maintaining accuracy. Future work could extend the approach to additional hyperparameters and datasets to further validate it. Optimizing hyperparameters with reinforcement learning could reduce the need for manual tuning and help maximize neural network performance.
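The selection loop described above can be sketched as a minimal, single-state Q-learning procedure over a discrete hyperparameter grid. Everything here is illustrative: the grid values, the `dummy_reward` function (a stand-in for the bias/variance/F-score reward, which in the real method would come from training and evaluating a network), and the learning-rate and exploration parameters are all assumptions, not taken from the document.

```python
import random

# Hypothetical discrete hyperparameter grid (illustrative values only).
REG_CONSTANTS = [0.001, 0.01, 0.1]
HIDDEN_SIZES = [8, 16, 32]
ACTIONS = [(r, h) for r in REG_CONSTANTS for h in HIDDEN_SIZES]

def dummy_reward(reg, hidden):
    # Stand-in for the F-score-based reward described in the text;
    # a real version would train a network with (reg, hidden) and score it.
    return 1.0 / (1.0 + abs(reg - 0.01) + abs(hidden - 16) / 16.0)

def q_learn(episodes=200, alpha=0.5, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {a: 0.0 for a in ACTIONS}  # Q-table: one entry per hyperparameter pair
    for _ in range(episodes):
        if rng.random() < epsilon:          # explore: random hyperparameters
            a = rng.choice(ACTIONS)
        else:                               # exploit: current best estimate
            a = max(q, key=q.get)
        r = dummy_reward(*a)
        q[a] += alpha * (r - q[a])          # incremental Q-value update
    return max(q, key=q.get)

best_reg, best_hidden = q_learn()
print(best_reg, best_hidden)
```

Because exploration is concentrated around actions whose estimated reward is high, far fewer reward evaluations (i.e., network trainings) are wasted on poor configurations than with purely random selection, which is the source of the computation-time savings the document reports.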