Llama-3-Swallow-8B-Instruct-v0.1-kokoroe

Built with Meta Llama 3

Llama-3-Swallow-8B-Instruct-v0.1-kokoroe is a large language model fine-tuned to follow instructions in Japanese, with additional safety tuning applied to improve the appropriateness of its responses. It is based on tokyotech-llm/Llama-3-Swallow-8B-Instruct-v0.1.

Model Details

Model Description

Uses

This section describes two ways to use the model:

  • Huggingface/transformers library
  • vLLM library

Huggingface/transformers Usage

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "retrieva-jp/Llama-3-Swallow-8B-Instruct-v0.1-kokoroe"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype=torch.bfloat16)

# Chat messages in Japanese: the system prompt says "You are a sincere and excellent
# Japanese assistant." and the user asks "What is natural language processing?"
chat = [
    {"role": "system", "content": "あなたは誠実で優秀な日本人のアシスタントです。"},
    {"role": "user", "content": "自然言語処理とは何か"},
]

# Apply the Llama 3 chat template and move the token ids to the model's device.
tokenized_input = tokenizer.apply_chat_template(chat, add_generation_prompt=True, tokenize=True, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(
        tokenized_input,
        max_new_tokens=100,
        do_sample=True,
        top_p=0.9,
        temperature=0.6,
    )[0]
print(tokenizer.decode(output))
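
The decoded output above includes the prompt and special tokens. As a minimal follow-up sketch, continuing the variables from the example above, the newly generated tokens can be sliced off and decoded on their own:

# Decode only the tokens generated after the prompt, dropping special tokens.
generated_tokens = output[tokenized_input.shape[-1]:]
print(tokenizer.decode(generated_tokens, skip_special_tokens=True))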

vLLM Usage

$ vllm serve retrieva-jp/Llama-3-Swallow-8B-Instruct-v0.1-kokoroe --port 8000
$ curl -X POST http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "retrieva-jp/Llama-3-Swallow-8B-Instruct-v0.1-kokoroe",
        "messages": [
            {"role": "system", "content": "あなたは誠実で優秀な日本人のアシスタントです。"},
            {"role": "user", "content": "自然言語処理とは何か"}
        ],
        "max_tokens": 100,
        "top_p": 0.9,
        "temperature": 0.6
    }'
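
Because vllm serve exposes an OpenAI-compatible API, the same request can also be sent from Python with the openai client. This is a minimal sketch, assuming the server above is running on localhost:8000 and the openai package is installed; the api_key value is a placeholder, since a local vLLM server does not require one by default.

from openai import OpenAI

# Point the client at the local vLLM server (the api_key is a placeholder).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="retrieva-jp/Llama-3-Swallow-8B-Instruct-v0.1-kokoroe",
    messages=[
        {"role": "system", "content": "あなたは誠実で優秀な日本人のアシスタントです。"},
        {"role": "user", "content": "自然言語処理とは何か"},
    ],
    max_tokens=100,
    top_p=0.9,
    temperature=0.6,
)
print(response.choices[0].message.content)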

Model Card Authors

Satoru Katsumata

Model Card Contact

pr[at]retrieva.jp
