Torch-Pruning
Torch-Pruning is a framework for structural pruning of deep neural networks, including Large Language Models, Vision Foundation Models, and other architectures. It is built on the DepGraph algorithm (CVPR 2023). Unlike masking approaches, which zero out weights but leave tensor shapes unchanged, it identifies coupled parameter groups and physically removes them, so the pruned model is actually smaller and faster. The toolkit supports off-the-shelf models from libraries such as Hugging Face, Torchvision, and Timm. Supported LLMs include Llama, Qwen, Phi, and DeepSeek variants; other supported models include SAM, Diffusion Models, Vision Transformers, Swin Transformers, BERT, YOLOv7/v8, ResNet, and FasterRCNN. Recent updates add Isomorphic Pruning for modern CNNs and Transformers, along with examples for DeepSeek-R1-Distill. Torch-Pruning is compatible with PyTorch 2.0 and later and is intended as a general-purpose solution for researchers and developers.
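To make the difference between masking and structural pruning concrete, here is a minimal sketch in plain PyTorch. It does not use Torch-Pruning's API; the helper names (`prune_coupled_channels`, `count_params`) and the toy Conv-BN-ReLU-Conv network are assumptions made up for illustration. It removes three output channels from the first convolution together with the parameters coupled to them (the matching BatchNorm entries and the matching input channels of the next convolution), which is the kind of group a dependency graph is meant to discover automatically.

```python
import copy

import torch
import torch.nn as nn


def count_params(module):
    """Number of learnable parameters (buffers are not counted)."""
    return sum(p.numel() for p in module.parameters())


def prune_coupled_channels(conv1, bn, conv2, drop):
    """Hypothetical helper: drop output channels of conv1 plus everything coupled to them."""
    dropped = set(drop)
    keep = torch.tensor([i for i in range(conv1.out_channels) if i not in dropped])
    n = len(keep)

    new_conv1 = nn.Conv2d(conv1.in_channels, n, conv1.kernel_size,
                          stride=conv1.stride, padding=conv1.padding,
                          bias=conv1.bias is not None)
    new_bn = nn.BatchNorm2d(n, eps=bn.eps, momentum=bn.momentum)
    new_conv2 = nn.Conv2d(n, conv2.out_channels, conv2.kernel_size,
                          stride=conv2.stride, padding=conv2.padding,
                          bias=conv2.bias is not None)

    with torch.no_grad():
        # conv1: keep the selected output channels (dim 0 of the weight).
        new_conv1.weight.copy_(conv1.weight[keep])
        if conv1.bias is not None:
            new_conv1.bias.copy_(conv1.bias[keep])
        # BatchNorm: one entry per channel, so the same indices apply.
        for name in ("weight", "bias", "running_mean", "running_var"):
            getattr(new_bn, name).copy_(getattr(bn, name)[keep])
        # conv2: the coupled dimension is its input channels (dim 1).
        new_conv2.weight.copy_(conv2.weight[:, keep])
        if conv2.bias is not None:
            new_conv2.bias.copy_(conv2.bias)
    return new_conv1, new_bn, new_conv2


torch.manual_seed(0)
dense = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
    nn.Conv2d(8, 4, 3, padding=1),
).eval()
drop = [1, 4, 6]  # arbitrary channels chosen for the demo

# Masking: zero the channels in place; every tensor keeps its shape.
masked = copy.deepcopy(dense)
with torch.no_grad():
    masked[0].weight[drop] = 0.0
    masked[0].bias[drop] = 0.0
    masked[1].weight[drop] = 0.0
    masked[1].bias[drop] = 0.0

# Structural pruning: rebuild the layers without those channels.
c1, bn, c2 = prune_coupled_channels(dense[0], dense[1], dense[3], drop)
pruned = nn.Sequential(c1, bn, nn.ReLU(), c2).eval()

x = torch.randn(2, 3, 16, 16)
with torch.no_grad():
    y_masked, y_pruned = masked(x), pruned(x)

print("dense :", count_params(dense))
print("masked:", count_params(masked))
print("pruned:", count_params(pruned))
print("same output:", torch.allclose(y_masked, y_pruned, atol=1e-5))
```

In this sketch the masked and pruned networks compute the same function, but only the pruned one has fewer parameters. Doing this by hand requires knowing which layers share a channel dimension; that bookkeeping gets hard with residual connections, concatenations, and attention heads, which is the problem dependency-graph-based tooling addresses.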