intel/auto-round
intel/auto-round: A SOTA quantization algorithm for high-accuracy low-bit LLM inference, seamlessly optimized for CPU/XPU/CUDA, with multi-datatype support and full compatibility with vLLM, SGLang, and Transformers. License: apache-2.0. Hugging Bay hosted rele
- License: apache-2.0
- Scan status: pending
- Hosting status: external
- Upstream: intel/auto-round