Meta’s Llama 2 is a powerful language model that enables individuals, creators, researchers, and businesses to experiment, innovate, and scale their ideas. This release includes model weights and starting code for pretrained and fine-tuned Llama language models ranging from 7 billion to 70 billion parameters. As with any AI technology, Llama 2 carries potential risks with use, and developers have a responsibility to address those risks.
Users can download Llama 2’s model weights and tokenizer by visiting the Meta website and accepting the License. They will then receive a signed URL by email, which is required when running the download.sh script from the cloned repository. Meta’s llama-recipes repository provides detailed examples of using the models with Hugging Face.
Llama 2’s models come in two types: pretrained models and fine-tuned chat models. Pretrained models are not fine-tuned for chat or Q&A, so they should be prompted in such a way that the expected answer is the natural continuation of the prompt. Fine-tuned models, on the other hand, are trained for dialogue applications. To get the expected features and performance from the fine-tuned models, a specific formatting defined in chat_completion must be followed, including the [INST] and <<SYS>> tags, the BOS and EOS tokens, and the whitespace and line breaks in between. Users can also deploy additional classifiers to filter out inputs and outputs deemed unsafe; the Llama 2 repository provides examples of how to add a safety checker to the inputs and outputs of your inference code.
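The chat formatting described above can be sketched as follows. This mirrors the tag convention used by chat_completion, but the helper function and constant names here are illustrative, not the official implementation:

```python
# Illustrative sketch of the Llama 2 chat prompt format, assuming the
# tag convention described above. Constant and function names are
# hypothetical; the BOS/EOS tokens (<s>, </s>) are typically added by
# the tokenizer rather than written into the string by hand.

B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

def format_chat_prompt(system: str, user: str) -> str:
    """Fold the system prompt into the first user turn, as the
    fine-tuned chat models expect."""
    return f"{B_INST} {B_SYS}{system}{E_SYS}{user} {E_INST}"

prompt = format_chat_prompt(
    "You are a helpful assistant.",
    "What is Llama 2?",
)
```

Deviating from this exact layout, including the whitespace around the tags, can noticeably degrade the chat models’ responses, which is why the repository centralizes it in chat_completion rather than leaving it to callers.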
Meta’s Llama 2 is licensed for both researchers and commercial entities and upholds the principles of openness. To report software bugs or other problems with the models, users can submit feedback through the GitHub repository or Meta’s feedback pages, and report security concerns through Facebook’s Whitehat program.