This document discusses the use of deep reinforcement learning to optimize the contention window (CW) in IEEE 802.11ax networks. It introduces reinforcement learning as a machine learning paradigm in which an agent learns by trial and error, guided by a reward function. It then models CW optimization as a Markov decision process: an agent at the access point observes the collision rate and selects a CW value as its action, with the goal of maximizing throughput. The proposed approach, centralized contention window optimization with deep reinforcement learning, was evaluated in ns-3 simulations and showed throughput improvements of up to 40% over static CW settings, especially in dynamic network topologies.
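The Markov decision process described above (state: observed collision rate, action: CW value, reward: throughput) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the document: `ToyCwEnv` is a hypothetical slotted-contention model standing in for ns-3, the reward is the per-slot success probability used as a rough throughput proxy, the seven actions map to CW values 15 through 1023, and tabular Q-learning replaces the deep network so the example stays small and dependency-free.

```python
import random

N_ACTIONS = 7    # assumption: actions 0..6 map to CW = 15, 31, ..., 1023
N_STATES = 10    # assumption: collision rate discretized into 10 buckets


def cw_from_action(action: int) -> int:
    """Map a discrete action to a contention window size, CW = 2^(a+4) - 1."""
    return 2 ** (action + 4) - 1


def bucket(collision_rate: float) -> int:
    """Discretize a collision rate in [0, 1] into a state index."""
    return min(int(collision_rate * N_STATES), N_STATES - 1)


class ToyCwEnv:
    """Hypothetical analytic stand-in for a simulator (not ns-3)."""

    def __init__(self, n_stations: int):
        self.n = n_stations

    def step(self, action: int):
        cw = cw_from_action(action)
        tau = 2.0 / (cw + 1)  # rough per-slot transmit probability per station
        p_idle = (1.0 - tau) ** self.n
        p_success = self.n * tau * (1.0 - tau) ** (self.n - 1)
        p_collision = max(0.0, 1.0 - p_idle - p_success)
        collision_rate = p_collision / (1.0 - p_idle)  # among busy slots
        reward = p_success  # proxy for normalized throughput
        return collision_rate, reward


def train(env, steps=50_000, alpha=0.1, gamma=0.5, epsilon=0.3, seed=0):
    """Tabular Q-learning with an epsilon-greedy policy."""
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    state = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(N_ACTIONS)  # explore
        else:
            action = max(range(N_ACTIONS), key=lambda a: q[state][a])  # exploit
        collision_rate, reward = env.step(action)
        next_state = bucket(collision_rate)
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


def greedy_cw(q, env, rollout=10):
    """Follow the greedy policy for a few steps and return the CW it settles on."""
    state, action = 0, 0
    for _ in range(rollout):
        action = max(range(N_ACTIONS), key=lambda a: q[state][a])
        collision_rate, _ = env.step(action)
        state = bucket(collision_rate)
    return cw_from_action(action)


if __name__ == "__main__":
    for n in (2, 30):
        env = ToyCwEnv(n)
        print(n, "stations -> learned CW", greedy_cw(train(env), env))
```

In this toy model the per-slot success probability peaks at CW = 15 for 2 stations and at CW = 63 for 30 stations, so a single static CW cannot suit both cases. That is the intuition behind learning the CW from observed collision rates, although the numbers here say nothing about the 802.11ax results reported in the document.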