The document describes a method for pruning convolutional neural networks so they run efficiently under resource-constrained inference. The method uses a first-order Taylor expansion of the network's loss to estimate the saliency of each unit, then prunes the units whose removal is predicted to change the loss the least. Experiments on networks such as VGG-16 and AlexNet show that the method substantially reduces the number of operations with little loss in accuracy, and a layer-wise analysis shows how much each layer contributes to the overall network.
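To make the criterion concrete, here is a minimal sketch of Taylor-expansion saliency in plain Python. The first-order Taylor estimate of the loss change from removing a unit is |activation × gradient|, averaged over the unit's elements; units with the lowest score are pruned first. The function names (`taylor_saliency`, `prune_least_salient`) and the list-based representation are illustrative assumptions, not the paper's implementation, which operates on feature maps inside a deep-learning framework.

```python
def taylor_saliency(activations, gradients):
    # First-order Taylor criterion: the estimated change in loss from
    # zeroing a unit is |a * dL/da|, averaged over the unit's elements.
    n = len(activations)
    return sum(abs(a * g) for a, g in zip(activations, gradients)) / n

def prune_least_salient(units, k):
    # units: list of (activations, gradients) pairs, one pair per unit.
    # Returns the indices of the k units with the lowest saliency,
    # i.e. those whose removal should least affect the loss.
    scores = [taylor_saliency(a, g) for a, g in units]
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return sorted(order[:k])

# Hypothetical usage: three units with increasing activation*gradient
# products; the first unit has zero gradient, so it is pruned first.
units = [
    ([1.0, 1.0], [0.0, 0.0]),  # saliency 0.0
    ([1.0, 1.0], [1.0, 1.0]),  # saliency 1.0
    ([2.0, 2.0], [1.0, 1.0]),  # saliency 2.0
]
print(prune_least_salient(units, 1))  # → [0]
```

In practice the activations and gradients come from a forward and backward pass over training data, and saliency is accumulated over many mini-batches before each pruning step.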