This document discusses using machine learning algorithms to predict traffic flow and control traffic lights. Specifically, it explores using deep reinforcement learning (DRL) with Q-learning. The DRL agent is trained in the SUMO traffic simulation environment to optimize traffic light timing and reduce overall vehicle wait times at intersections. The agent represents traffic light states as actions and vehicle positions/speeds as states. It is rewarded based on decreasing total wait times. Through experience replay and training on historical state-action-reward data, the agent learns which traffic light patterns minimize congestion. The experiments showed the DRL approach improved traffic flow compared to traditional reinforcement learning methods for high-dimensional problems like city-wide traffic control.