Inquiry Regarding ProtGPT2 Prompt and Output Length
Dear Noelia,
I'd like to express my appreciation for your efforts in developing the ProtGPT2 model. It's an excellent resource for the research community, and I'm excited to work with it.
I have a few questions regarding the correct usage of prompts and output length when working with ProtGPT2. In the example you provide, I noticed that you use <|endoftext|> as the prompt:
from transformers import pipeline
protgpt2 = pipeline('text-generation', model="nferruz/ProtGPT2")
sequences = protgpt2("<|endoftext|>", max_length=100, do_sample=True, top_k=950, repetition_penalty=1.2, num_return_sequences=10, eos_token_id=0)
for seq in sequences:
    print(seq)
Is it necessary to include the <|endoftext|> as a prompt? I experimented with using "MKK" directly as my prompt and the model returned results without any errors. However, I'm concerned about the accuracy of the results when using this approach. Could you please clarify whether the <|endoftext|> token is required for accurate output?
For example, is <|endoftext|>MKK a more correct approach?
Additionally, does ProtGPT2 default to starting with "M" as the beginning of the generated sequence when provided with an <|endoftext|> prompt only?
My second question pertains to the max_length parameter. In your example, you set max_length=100, but the generated output can exceed this length. Is this an expected behavior of the model?
Your guidance on these matters would be greatly appreciated. I look forward to hearing from you and gaining a deeper understanding of the ProtGPT2 model.
Sincerely,
Littleworth
Hi Littleworth,
I put <|endoftext|> in the documentation as an example so that the model starts with a de novo sequence. But you can start with any sequence seed you want, like the one you mention. Or one could also leave it empty.
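To make the two prompting options concrete, here is a minimal sketch. The helper names `build_prompt` and `generate` are hypothetical and not part of ProtGPT2 or Transformers, and upper-casing the seed is an assumption based on amino acids being written as uppercase letters. The sampling parameters are copied from the example in the question.

```python
def build_prompt(seed: str = "", start_token: str = "<|endoftext|>") -> str:
    """Return the seed if one is given, otherwise the start token (de novo)."""
    # Assumption: seeds are amino-acid letters, so normalise them to uppercase.
    seed = seed.strip().upper()
    return seed if seed else start_token


def generate(seed: str = "", **overrides):
    """Sample sequences from ProtGPT2 (downloads the model on first call)."""
    # Imported lazily so the prompt helper works without transformers installed.
    from transformers import pipeline

    protgpt2 = pipeline("text-generation", model="nferruz/ProtGPT2")
    # Sampling parameters taken from the example quoted in the question.
    params = dict(max_length=100, do_sample=True, top_k=950,
                  repetition_penalty=1.2, num_return_sequences=10,
                  eos_token_id=0)
    params.update(overrides)
    return protgpt2(build_prompt(seed), **params)


print(build_prompt())       # de novo: the prompt is the start token
print(build_prompt("mkk"))  # seeded: the prompt is the seed itself
```

Calling `generate("MKK")` would then continue from that seed, while `generate()` would start a new sequence from `<|endoftext|>`.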
The model will often generate an 'M' after <|endoftext|>, but not always. It reproduces the distribution shown in the training set, and since some natural sequences don't start with 'M', sometimes it generates other amino acids too.
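One way to check this on your own outputs is to tally the first amino acid of each generated sequence. The sketch below is only an illustration: the helper names are hypothetical, and the sample strings are made up rather than real model output. It assumes the generated text may contain the `<|endoftext|>` marker and line breaks, which are skipped.

```python
from collections import Counter


def first_residue(generated_text: str, special_token: str = "<|endoftext|>") -> str:
    """Return the first amino-acid letter, skipping the special token and whitespace."""
    cleaned = generated_text.replace(special_token, "")
    for ch in cleaned:
        if ch.isalpha():
            return ch
    return ""  # no residues found


def first_residue_counts(texts):
    """Count how often each amino acid appears in first position."""
    return Counter(r for r in map(first_residue, texts) if r)


# Made-up strings standing in for the 'generated_text' field of pipeline results.
samples = ["<|endoftext|>\nMKKL", "<|endoftext|>\nSTAV", "MAGT"]
counts = first_residue_counts(samples)
print(counts)
```

With real outputs, you would pass `[s["generated_text"] for s in sequences]` instead of `samples`.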
The max_length param refers to the number of tokens, not the number of amino acids. Each token has an average length of 4 amino acids, so I'd expect that a max_length of 100 gives sequences of anywhere from 0 to 500 amino acids.
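The token-versus-residue arithmetic can be sketched as follows. The figure of 4 amino acids per token is the average quoted above, so individual sequences will deviate from it. The helper names are hypothetical, and the sample string is made up rather than real model output. The counting helper assumes the generated text may contain the `<|endoftext|>` marker and line breaks, neither of which is a residue.

```python
AVG_RESIDUES_PER_TOKEN = 4  # average quoted in the reply; actual tokens vary in length


def expected_residues(max_length: int,
                      avg_per_token: float = AVG_RESIDUES_PER_TOKEN) -> float:
    """Rough expected sequence length in amino acids for a given token budget."""
    return max_length * avg_per_token


def residue_count(generated_text: str, special_token: str = "<|endoftext|>") -> int:
    """Count amino-acid letters, ignoring the special token and any whitespace."""
    cleaned = generated_text.replace(special_token, "")
    return sum(1 for ch in cleaned if ch.isalpha())


sample = "<|endoftext|>\nMKKLLA\nVGT"  # made-up text, not real model output
print(expected_residues(100))  # token budget of 100 times the average of 4
print(residue_count(sample))   # letters only: MKKLLA + VGT
```

Comparing `residue_count` on real outputs with `max_length` shows why a sequence can be longer than 100 characters while still respecting the 100-token limit.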
Hope this helps
Noelia
@nferruz Thanks so much for your clarification.