This document describes a voice-assisted remote surveillance system built around an ESP32-CAM module. A passive infrared (PIR) sensor detects motion and triggers the camera to capture images and video. Face recognition, implemented in Python with TensorFlow, then determines whether the captured person is known or unknown. For an unknown person, the system generates a voice alert using text-to-speech and can also send an email notification to the owner. The system offers real-time video streaming, remote monitoring, and voice alerts for intrusion detection at low cost. Image processing and the GUI are written in Python, and the ESP32-CAM module is programmed with the Arduino IDE.
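The decision flow described above (recognize the face, then alert only for unknown persons) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the system's actual code: the `Recognition` structure, the 0.6 confidence threshold, the `speak` and `send_email` callbacks, and the example addresses are all hypothetical names introduced here. The face-recognition model itself is out of scope, and only its result is modeled.

```python
from dataclasses import dataclass
from email.message import EmailMessage
from typing import Callable, Optional


@dataclass
class Recognition:
    """Result of a face-recognition step (hypothetical structure)."""
    name: Optional[str]   # None when no known face matched
    confidence: float     # recognizer score, assumed to lie in [0, 1]


def is_unknown(result: Recognition, threshold: float = 0.6) -> bool:
    """Treat a missing or low-confidence match as an unknown person.

    The 0.6 threshold is an assumed value, not one taken from the document.
    """
    return result.name is None or result.confidence < threshold


def build_alert_email(owner: str, sender: str, image_bytes: bytes) -> EmailMessage:
    """Build the notification email with the captured frame attached."""
    msg = EmailMessage()
    msg["Subject"] = "Surveillance alert: unknown person detected"
    msg["From"] = sender
    msg["To"] = owner
    msg.set_content("Motion was detected and the captured face was not recognized.")
    msg.add_attachment(image_bytes, maintype="image", subtype="jpeg",
                       filename="capture.jpg")
    return msg


def handle_motion_event(
    result: Recognition,
    image_bytes: bytes,
    speak: Callable[[str], None],
    send_email: Callable[[EmailMessage], None],
    owner: str = "owner@example.com",          # placeholder address
    sender: str = "camera@example.com",        # placeholder address
) -> str:
    """Run once per PIR-triggered capture; alert only for unknown persons."""
    if not is_unknown(result):
        return f"known:{result.name}"
    speak("Alert. Unknown person detected.")   # voice alert via text-to-speech
    send_email(build_alert_email(owner, sender, image_bytes))
    return "unknown"


if __name__ == "__main__":
    # Stand-in callbacks: print instead of speaking or sending mail.
    outcome = handle_motion_event(
        Recognition(name=None, confidence=0.0),
        image_bytes=b"\xff\xd8\xff",           # dummy bytes, not a real JPEG
        speak=print,
        send_email=lambda m: print("would send:", m["Subject"]),
    )
    print(outcome)
```

Passing `speak` and `send_email` in as callbacks keeps the decision logic testable without audio hardware or a mail server. In a real deployment, `send_email` could wrap `smtplib.SMTP.send_message`, and `speak` could wrap whichever text-to-speech library the project uses.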