This document discusses non-stationary multi-armed bandit problems and algorithms for addressing them. It introduces the Thompson Sampling algorithm and its ability to handle uncertainty through posterior sampling. For non-stationary environments, where reward distributions change over time, it reviews adaptive approaches such as Exp3, UCB-V, CUSUM-UCB, and Dynamic Thompson Sampling. It focuses in particular on the Change Detection based UCB (CD-UCB) algorithm, which uses a change detector to identify shifts in the reward distributions and adjusts its exploration rate accordingly. The document then describes simulations that evaluate the proposed Thompson Sampling with Change Detection (TS-CD) algorithms on multi-connectivity and radio access technology selection problems in wireless networks.
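To make the combination of Thompson Sampling and change detection concrete, here is a minimal, hypothetical sketch: Bernoulli Thompson Sampling with a simple sliding-window drift check that resets an arm's Beta posterior when its recent empirical mean departs from its long-run mean. The class names, the `window` and `threshold` parameters, and the detector itself are illustrative assumptions, not the actual TS-CD algorithm, which uses a more principled change-detection test.

```python
import random


class BernoulliTSArm:
    """Beta posterior over one Bernoulli arm's success probability."""

    def __init__(self):
        self.alpha = 1.0  # prior successes + 1
        self.beta = 1.0   # prior failures + 1

    def sample(self):
        # Draw a plausible success probability from the posterior.
        return random.betavariate(self.alpha, self.beta)

    def update(self, reward):
        # Bayesian update for a 0/1 reward.
        self.alpha += reward
        self.beta += 1 - reward

    def reset(self):
        # Forget the posterior, e.g. after a detected change.
        self.alpha = self.beta = 1.0


class TSWithChangeDetection:
    """Thompson Sampling with a naive sliding-window change detector.

    Illustrative stand-in for TS-CD: when an arm's mean over the last
    `window` pulls drifts from its overall mean by more than `threshold`,
    the arm's posterior is reset so exploration resumes.
    """

    def __init__(self, n_arms, window=50, threshold=0.3):
        self.arms = [BernoulliTSArm() for _ in range(n_arms)]
        self.history = [[] for _ in range(n_arms)]
        self.window = window
        self.threshold = threshold

    def select_arm(self):
        # Play the arm whose posterior sample is largest.
        samples = [arm.sample() for arm in self.arms]
        return samples.index(max(samples))

    def update(self, arm_idx, reward):
        arm = self.arms[arm_idx]
        hist = self.history[arm_idx]
        hist.append(reward)
        arm.update(reward)
        # Crude drift check: compare recent mean to long-run mean.
        if len(hist) >= 2 * self.window:
            recent = sum(hist[-self.window:]) / self.window
            overall = sum(hist) / len(hist)
            if abs(recent - overall) > self.threshold:
                arm.reset()   # discard the stale posterior
                del hist[:]   # restart statistics after the change
```

In a stationary stretch the agent behaves like ordinary Thompson Sampling; the reset only fires when the detector flags a distribution change, which is the mechanism CD-UCB and TS-CD share.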