Speech recognition starts by recording a spoken request, filtering out noise, and dividing the recording into short frames. An acoustic model maps the sounds in each frame to phonemes, and a language model estimates which word sequences are probable given those phonemes; the system then outputs the most likely sequence. Modern systems use machine learning to take context into account, can reach over 90% accuracy, and recognize a variety of accents. Google, Yandex, and Microsoft take different approaches, but all three have achieved high recognition quality for the languages they support.
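
The framing and "most likely sequence" steps described above can be illustrated with a small sketch. This is a toy illustration, not how any of the named systems is implemented: the function names (split_into_frames, pick_best_sequence), the lm_weight parameter, the frame sizes, and all probabilities are assumptions made up for the example. It assumes the acoustic and language models have already produced a log-probability for each candidate word sequence and simply picks the candidate with the highest combined score.

```python
import math


def split_into_frames(samples, frame_len, hop):
    """Split a list of audio samples into overlapping fixed-length frames.

    frame_len and hop are measured in samples; trailing samples that do not
    fill a whole frame are dropped.
    """
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames


def pick_best_sequence(candidates, acoustic_logp, lm_logp, lm_weight=1.0):
    """Return the candidate word sequence with the highest combined score.

    The score is the acoustic log-probability plus a weighted language-model
    log-probability (hypothetical inputs, supplied by the caller).
    """
    def score(words):
        return acoustic_logp[words] + lm_weight * lm_logp[words]

    return max(candidates, key=score)


# Toy "recording": ten samples, framed with length 4 and hop 2.
samples = list(range(10))
frames = split_into_frames(samples, frame_len=4, hop=2)

# Two hypothetical transcriptions that sound alike (made-up probabilities).
candidates = [
    ("recognize", "speech"),
    ("wreck", "a", "nice", "beach"),
]
acoustic_logp = {
    candidates[0]: math.log(0.4),    # acoustics slightly favor the second
    candidates[1]: math.log(0.6),
}
lm_logp = {
    candidates[0]: math.log(0.05),   # language model strongly favors the first
    candidates[1]: math.log(0.001),
}

best = pick_best_sequence(candidates, acoustic_logp, lm_logp)
print(len(frames), "frames; best sequence:", " ".join(best))
```

In this toy case the acoustic scores alone would pick the second candidate, but adding the language-model score makes the first one win, which is the role the language model plays in the description above. Real systems search over a very large space of sequences rather than a fixed list of two.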