Structurally Prune Anything: Any Architecture, Any Framework, Any Time
Abstract
Neural network pruning is a key technique for improving the efficiency of deep learning models. Unlike unstructured pruning, which only sets individual parameters to zero, structured pruning removes entire channels, yielding direct computational and storage savings. However, the diverse patterns that couple parameters (such as residual connections and group convolutions), the variety of deep learning frameworks, and the different stages at which pruning can be applied make existing pruning methods hard to adapt across architectures, frameworks, and pruning criteria.
To address this, we introduce Structurally Prune Anything (SPA), a versatile structured pruning framework that can prune neural networks with any architecture, from any framework, and at any stage of training. SPA leverages a standardized computational graph and ONNX representation to prune diverse neural network architectures without the need for manual intervention.
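To make the graph-based idea concrete, here is a minimal, illustrative sketch (not the SPA implementation) of how an ONNX graph can expose coupled layers: it exports a standard ResNet-18, then finds the nodes whose outputs meet at a residual `Add`, since the channels flowing into such an `Add` must be pruned together. The file name `resnet18.onnx` and the grouping logic are assumptions for illustration; a full implementation would trace back through element-wise ops to the actual Conv/Linear weights.

```python
# Illustrative sketch: locate channel groups coupled by residual Add nodes
# in an ONNX graph. Requires torch, torchvision, and onnx to be installed.
import onnx
import torch
import torchvision

# Export a standard ResNet-18 so we have a graph with residual connections.
model = torchvision.models.resnet18(weights=None).eval()
torch.onnx.export(model, torch.randn(1, 3, 224, 224), "resnet18.onnx")

graph = onnx.load("resnet18.onnx").graph

# Map each tensor name to the node that produces it.
producer = {out: node for node in graph.node for out in node.output}

# Every residual Add ties its input branches together: the producing nodes
# (and the convolutions behind them) form one coupled pruning group.
for node in graph.node:
    if node.op_type == "Add":
        group = [producer[i].op_type + ":" + producer[i].name
                 for i in node.input if i in producer]
        if len(group) > 1:
            print("coupled group:", group)
```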
Key Contributions
- Universal Framework: SPA works across any neural network architecture and deep learning framework
- Group-Level Importance Estimation: Groups dependent computational operators and estimates their importance jointly for structured pruning (see the sketch after this list)
- Flexible Timing: Supports pruning before training, after training with fine-tuning, or after training without fine-tuning
- OBSPA Algorithm: Introduces Optimal Brain SPA for state-of-the-art post-training pruning results without requiring fine-tuning or calibration data
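The following sketch illustrates the idea of group-level importance estimation under a simple assumed criterion (summed per-channel L2 norms); it is not SPA's exact scoring rule, and the helper names are hypothetical.

```python
# Hypothetical group-level importance scoring: channels coupled across layers
# are scored together, and the lowest-scoring indices are selected for removal.
import torch

def group_channel_importance(weights):
    """weights: list of tensors shaped (out_channels, ...) sharing out_channels."""
    scores = torch.zeros(weights[0].shape[0])
    for w in weights:
        # Add the L2 norm of each output channel's weight slice.
        scores += w.flatten(1).norm(p=2, dim=1)
    return scores

def channels_to_prune(weights, ratio=0.5):
    scores = group_channel_importance(weights)
    k = int(ratio * scores.numel())
    return torch.argsort(scores)[:k]  # indices of the least important channels

# Example: two convolutions whose 64 output channels are tied by a residual add.
conv_a = torch.randn(64, 32, 3, 3)
conv_b = torch.randn(64, 32, 3, 3)
print(channels_to_prune([conv_a, conv_b], ratio=0.25))
```

Scoring the coupled channels jointly, rather than per layer, keeps the pruned shapes consistent across every layer in the group.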
Impact
This work addresses a fundamental challenge in neural network compression by providing the first truly universal structured pruning solution. The framework's ability to work across different architectures and frameworks without manual intervention makes it highly practical for real-world deployment scenarios.
Recommended citation: Wang, X., et al. (2024). "Structurally Prune Anything: Any Architecture, Any Framework, Any Time." arXiv preprint arXiv:2403.18955.
Download Paper