---
license: apache-2.0
base_model:
- deepseek-ai/DeepSeek-V3.2-Exp
pipeline_tag: text-generation
library_name: transformers
---
## Changes from the Original DeepSeek-V3.2-Exp
- Dequantized the indexer weights from FP8 to bfloat16 (a conversion sketch follows this list)
- Compatible with the transformers library (load with `trust_remote_code=True`)
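
The conversion script itself isn't reproduced here; below is a minimal sketch of block-wise FP8-to-bfloat16 dequantization, assuming DeepSeek-style 128×128 block scales stored as `*_scale_inv` tensors alongside the `float8_e4m3fn` weights. The function name and `BLOCK` constant are illustrative, not part of this repo.

```python
import torch

BLOCK = 128  # assumed tile size for DeepSeek-style FP8 block scales

def dequantize_fp8_block(weight: torch.Tensor, scale_inv: torch.Tensor) -> torch.Tensor:
    """Upcast an FP8 weight to bfloat16 by scaling each BLOCK x BLOCK tile.

    `weight` is a 2-D float8_e4m3fn tensor; `scale_inv` holds one scale per
    tile, with shape (ceil(rows / BLOCK), ceil(cols / BLOCK)).
    """
    w = weight.to(torch.float32)
    rows, cols = w.shape
    # Expand the per-tile scales to the full weight shape, then trim padding.
    scale = scale_inv.repeat_interleave(BLOCK, dim=0)[:rows]
    scale = scale.repeat_interleave(BLOCK, dim=1)[:, :cols]
    return (w * scale).to(torch.bfloat16)
```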
## Test Code
```python
import time

from transformers import AutoModelForCausalLM, AutoTokenizer

# trust_remote_code=True is required because the repo ships custom modeling code.
model = AutoModelForCausalLM.from_pretrained(
    "kishizaki-sci/DeepSeek-V3.2-Exp-FP8",
    trust_remote_code=True,
    dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("kishizaki-sci/DeepSeek-V3.2-Exp-FP8")

# Chat example copied from https://huggingface.co/docs/transformers/model_doc/deepseek_v3
chat = [
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
    {"role": "user", "content": "I'd like to show off how chat templating works!"},
]
inputs = tokenizer.apply_chat_template(
    chat, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate a short reply and report the wall-clock time it took.
start = time.time()
outputs = model.generate(inputs, max_new_tokens=50)
print(tokenizer.batch_decode(outputs))
print(time.time() - start)
```