bare-metal.ai Articles

Generative AI on prem: secure, ethical, and accessible.

Quantization

Igniting 2025 with tons of INT4 Quantizations!



With 2024 behind us and 2025 just under way, I am proud to share that I have uploaded over 230 quantized SLM/LLM models to my HuggingFace account. All of them were quantized on my homelab, which delivers approximately 72 TFLOPS of performance, powered solely by "domestic" hardware.


Advanced Weight-only Quantization Technique on CPU


When LLMs started spreading at the end of 2022, it sounded truly impossible: training, or even just fine-tuning, a model on modest consumer-grade hardware was a fantasy.

Now, in the middle of 2024, thanks to intensive scientific research, considerable investment, open governance, open collaboration, and a good dose of human ingenuity, we can fine-tune models directly on our devices. Incredible!

