This document proposes using deep learning approaches like convolutional neural networks to detect suspicious activities from surveillance videos. It aims to overcome limitations of existing depth sensors by using neural networks for more accurate human pose estimation. The system would monitor public areas through video surveillance and categorize human activities as usual or suspicious in real-time. It addresses gaps in previous research which focused on images rather than videos. The proposed methodology includes collecting and preprocessing surveillance video data, designing and training deep learning models to recognize and classify activities, and integrating the system with existing surveillance infrastructure for deployment.