Support for multiple LLMs (currently LLAMA, BLOOM, OPT) at various model sizes (up to 170B); support for a wide range of consumer-grade Nvidia GPUs; tiny and easy-to-use codebase, mostly in Python (<500 ...