Hacker News

Because it takes time for the quantization code to become bug-free after a new architecture is released. If a model was quantized while the quantizer had a known bug, those quantized versions are effectively buggy themselves, and they need to be requantized with a newer version of llama.cpp that has the fix.
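To see why a quantizer bug gets baked into the artifact, here is a minimal, hypothetical sketch of symmetric 8-bit block quantization (this is not llama.cpp's actual code). A subtle bug in computing the scale, such as using the block maximum instead of the maximum absolute value, produces a quantized file whose stored values are wrong, and no later inference-side fix can recover them; only requantizing from the original weights helps:

```python
import numpy as np

def quantize_q8(block, buggy=False):
    # Symmetric 8-bit quantization: scale by the max absolute value.
    # The "buggy" path uses block.max() instead of abs(block).max(),
    # which breaks for blocks whose largest-magnitude value is negative.
    if buggy:
        scale = block.max() / 127.0
    else:
        scale = np.abs(block).max() / 127.0
    q = np.clip(np.round(block / scale), -128, 127).astype(np.int8)
    return q, scale  # this (q, scale) pair is what gets written to disk

def dequantize(q, scale):
    return q.astype(np.float32) * scale

weights = np.array([-3.0, 1.0, 0.5, -0.25], dtype=np.float32)

q_good, s_good = quantize_q8(weights)
q_bad, s_bad = quantize_q8(weights, buggy=True)

err_good = np.abs(dequantize(q_good, s_good) - weights).max()
err_bad = np.abs(dequantize(q_bad, s_bad) - weights).max()
# err_bad is far larger: the corruption lives in the stored file,
# so fixing the quantizer later does not repair already-quantized models.
```

The same logic applies to real quantizer bugs: once the buggy (quantized values, scales) pair is serialized, downloading a fixed binary is not enough; the model file itself must be regenerated.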


