Igniting 2025 with tons of INT4 Quantizations!

With 2025 just ignited and 2024 behind us, I am proud to share that I have successfully uploaded over 230 quantized SLM/LLM models to my HuggingFace account. These models were quantized entirely on the computational resources of my homelab, which delivers approximately 72 TFLOPS of performance, powered solely by "domestic" hardware.
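To give a flavour of what INT4 weight-only quantization does, here is a minimal sketch in plain PyTorch. The symmetric per-group scheme and the group size of 128 are illustrative assumptions on my part, not necessarily the exact recipe behind these uploads.

```python
# A minimal sketch of group-wise, symmetric INT4 weight-only quantization
# in plain PyTorch. Group size 128 and the symmetric scheme are assumptions
# for illustration only.
import torch

def quantize_int4(weight: torch.Tensor, group_size: int = 128):
    """Quantize a 2-D weight matrix to INT4 with one scale per group."""
    out_features, in_features = weight.shape
    w = weight.reshape(out_features, in_features // group_size, group_size)
    # One scale per group: map the group's max magnitude to the INT4 extreme (7).
    scales = w.abs().amax(dim=-1, keepdim=True).clamp(min=1e-8) / 7.0
    # Round to the nearest INT4 level and clamp to the signed range [-8, 7].
    q = torch.clamp(torch.round(w / scales), -8, 7).to(torch.int8)
    return q.reshape(out_features, in_features), scales

def dequantize_int4(q: torch.Tensor, scales: torch.Tensor, group_size: int = 128):
    """Recover an approximate float weight matrix from INT4 values and scales."""
    out_features, in_features = q.shape
    w = q.reshape(out_features, in_features // group_size, group_size).float()
    return (w * scales).reshape(out_features, in_features)

# Quick round-trip check on a random weight matrix.
w = torch.randn(4096, 4096)
q, s = quantize_int4(w)
err = (w - dequantize_int4(q, s)).abs().mean()
print(f"mean absolute round-trip error: {err:.5f}")
```

In practice the scales (and optionally zero-points) are stored alongside the packed INT4 weights, which is what cuts the model's memory footprint roughly fourfold compared to FP16.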
Continue Reading >>>
Advanced Weight-only Quantization Technique on CPU

When LLMs began spreading at the end of 2022, it seemed truly impossible: training, or even just fine-tuning, a model on modest consumer-grade hardware was pure fantasy.
Now, in mid-2024, thanks to intensive scientific research, considerable investment, open governance, open collaboration, and a good dose of human ingenuity, we can fine-tune models directly on our own devices. Incredible!
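As one concrete illustration of what fine-tuning on modest hardware can look like today, here is a minimal LoRA setup using the Hugging Face peft library. The "gpt2" base model and the hyperparameters below are placeholder assumptions chosen so the snippet runs on CPU; they are not the configuration from the full article.

```python
# A minimal LoRA (low-rank adapter) fine-tuning setup with the peft library.
# Model choice and hyperparameters are illustrative assumptions only.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2")

# Inject small low-rank adapter matrices into the attention projection;
# only the adapters are trained, the base weights stay frozen.
config = LoraConfig(
    r=8,                        # rank of the low-rank update
    lora_alpha=16,              # scaling factor applied to the update
    target_modules=["c_attn"],  # GPT-2's fused QKV projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)

# Reports how few parameters are actually trainable (well under 1%).
model.print_trainable_parameters()
```

Because only the adapter matrices receive gradients, optimizer state and activation memory shrink dramatically, which is exactly what puts fine-tuning within reach of a CPU-only homelab.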
Continue Reading >>>