Instructions to use MiniMaxAI/MiniMax-M1-40k-hf with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use MiniMaxAI/MiniMax-M1-40k-hf with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("text-generation", model="MiniMaxAI/MiniMax-M1-40k-hf") messages = [ {"role": "user", "content": "Who are you?"}, ] pipe(messages)# Load model directly from transformers import AutoTokenizer, AutoModelForCausalLM tokenizer = AutoTokenizer.from_pretrained("MiniMaxAI/MiniMax-M1-40k-hf") model = AutoModelForCausalLM.from_pretrained("MiniMaxAI/MiniMax-M1-40k-hf") messages = [ {"role": "user", "content": "Who are you?"}, ] inputs = tokenizer.apply_chat_template( messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt", ).to(model.device) outputs = model.generate(**inputs, max_new_tokens=40) print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:])) - Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use MiniMaxAI/MiniMax-M1-40k-hf with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "MiniMaxAI/MiniMax-M1-40k-hf" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "MiniMaxAI/MiniMax-M1-40k-hf", "messages": [ { "role": "user", "content": "What is the capital of France?" } ] }'Use Docker
docker model run hf.co/MiniMaxAI/MiniMax-M1-40k-hf
- SGLang
How to use MiniMaxAI/MiniMax-M1-40k-hf with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "MiniMaxAI/MiniMax-M1-40k-hf" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "MiniMaxAI/MiniMax-M1-40k-hf", "messages": [ { "role": "user", "content": "What is the capital of France?" } ] }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "MiniMaxAI/MiniMax-M1-40k-hf" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "MiniMaxAI/MiniMax-M1-40k-hf", "messages": [ { "role": "user", "content": "What is the capital of France?" } ] }' - Docker Model Runner
How to use MiniMaxAI/MiniMax-M1-40k-hf with Docker Model Runner:
docker model run hf.co/MiniMaxAI/MiniMax-M1-40k-hf
Incompatible model.safetensors.index.json
Hi,
First of all, thank you for making hf weights public! I just tried to download the model, but noticed that model.safetensors.index.json contains references to model-00XXX-of-00414.safetensors while the files in the repo are named model-00XXX-of-00413.safetensors, which makes the download/loading of the model fail. Similarly, the file 00413 is referenced in the json, but does not exist in the repo.
I guess that something with computing tensor locations for the model.safetensors.index.json went wrong. Is there any chance to correct this? Thanks!
I will check the problem. You can download the model from https://huggingface.co/MiniMaxAI/MiniMax-M1-40k first, it should be the same.
Thank you so much for your feedback! We’ve identified the issue and have already fixed it. Please try again, and feel free to let us know if you encounter any further problems.