The document proposes an optimized audio classification and segmentation algorithm that segments audio streams into four types - pure speech, music, environment sound, and silence - using ensemble methods. It uses a hybrid classification approach of bagged support vector machines and artificial neural networks. The algorithm aims to accurately segment audio with minimum misclassification and requires less training data, making it suitable for real-time applications. It segments non-speech portions into music or environment sound and further divides speech into silence and pure speech. The algorithm achieves approximately 98% accurate segmentation.