Traceback (most recent call last):
  File "/Users/taozeyu/codes/test/tiny_llm/main.py", line 13, in <module>
    main()
  File "/Users/taozeyu/codes/test/tiny_llm/main.py", line 4, in main
    tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/taozeyu/codes/test/tiny_llm/.venv/lib/python3.12/site-packages/transformers/models/auto/tokenization_auto.py", line 738, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/taozeyu/codes/test/tiny_llm/.venv/lib/python3.12/site-packages/transformers/tokenization_utils_base.py", line 2017, in from_pretrained
    return cls._from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/taozeyu/codes/test/tiny_llm/.venv/lib/python3.12/site-packages/transformers/tokenization_utils_base.py", line 2249, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/taozeyu/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b/bf0f5cfb575eebebf9b655c5861177acfee03f16/tokenization_chatglm.py", line 196, in __init__
    super().__init__(
  File "/Users/taozeyu/codes/test/tiny_llm/.venv/lib/python3.12/site-packages/transformers/tokenization_utils.py", line 367, in __init__
    self._add_tokens(
  File "/Users/taozeyu/codes/test/tiny_llm/.venv/lib/python3.12/site-packages/transformers/tokenization_utils.py", line 467, in _add_tokens
    current_vocab = self.get_vocab().copy()
                    ^^^^^^^^^^^^^^^^
  File "/Users/taozeyu/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b/bf0f5cfb575eebebf9b655c5861177acfee03f16/tokenization_chatglm.py", line 248, in get_vocab
    vocab = {self._convert_id_to_token(i): i for i in range(self.vocab_size)}
                                                            ^^^^^^^^^^^^^^^
  File "/Users/taozeyu/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b/bf0f5cfb575eebebf9b655c5861177acfee03f16/tokenization_chatglm.py", line 244, in vocab_size
    return self.sp_tokenizer.num_tokens
           ^^^^^^^^^^^^^^^^^
AttributeError: 'ChatGLMTokenizer' object has no attribute 'sp_tokenizer'. Did you mean: '_tokenize'?
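The call chain in the traceback (`super().__init__` -> `_add_tokens` -> `get_vocab` -> `vocab_size`) suggests an initialization-order problem: the base constructor calls an overridden method that reads `sp_tokenizer` before the subclass has assigned it. That reading is an assumption based only on the traceback, not on the actual `tokenization_chatglm.py` source. The sketch below reproduces the same failure mode with hypothetical stand-in classes (`BaseTokenizer`, `SpVocabTokenizer`, `BrokenTokenizer`, `FixedTokenizer`); none of them are part of transformers or ChatGLM.

```python
class BaseTokenizer:
    """Hypothetical stand-in for a library base class (not the transformers API)."""

    def __init__(self):
        # The base constructor calls a method that subclasses override,
        # mirroring __init__ -> _add_tokens -> get_vocab in the traceback.
        self.initial_vocab = self.get_vocab().copy()

    def get_vocab(self):
        return {}


class SpVocabTokenizer(BaseTokenizer):
    """Shared vocabulary logic that depends on self.sp_tokenizer being set."""

    @property
    def vocab_size(self):
        return len(self.sp_tokenizer)

    def get_vocab(self):
        return {self.sp_tokenizer[i]: i for i in range(self.vocab_size)}


class BrokenTokenizer(SpVocabTokenizer):
    def __init__(self):
        # get_vocab() runs inside super().__init__() ...
        super().__init__()
        # ... so this assignment comes too late.
        self.sp_tokenizer = ["<pad>", "<unk>"]


class FixedTokenizer(SpVocabTokenizer):
    def __init__(self):
        # Assign the attribute first, then let the base class use it.
        self.sp_tokenizer = ["<pad>", "<unk>"]
        super().__init__()


try:
    BrokenTokenizer()
except AttributeError as exc:
    print("broken:", exc)

print("fixed:", FixedTokenizer().initial_vocab)
```

If the real tokenizer follows this pattern, the error would come from the interaction between the cached remote tokenizer code and the installed transformers version rather than from `main.py` itself.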
protobuf
transformers==4.34.1
cpm_kernels
torch>=1.10
gradio
mdtex2html
sentencepiece
accelerate
from transformers import AutoTokenizer, AutoModel


def main():
    tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
    model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
    model = model.eval()
    # Prompt: "Hello"
    response, history = model.chat(tokenizer, "你好", history=[])
    print(response)
    # Prompt: "What should I do if I can't sleep at night?"
    response, history = model.chat(tokenizer, "晚上睡不着应该怎么办", history=history)
    print(response)


if __name__ == "__main__":
    main()
- OS: macOS 15.3.1
- Python: 3.12.7
- Transformers: 4.34.1
- PyTorch: 2.6.0
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) : False
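Separate from the tokenizer error: since CUDA is reported as unavailable here, the `.cuda()` call in `main.py` would likely be the next thing to fail once the tokenizer loads. A minimal sketch of choosing a device from the available backends is shown below. `pick_device` is a hypothetical helper, and whether ChatGLM-6B actually runs well on the chosen device is not something this report establishes.

```python
def pick_device(cuda_available: bool, mps_available: bool = False) -> str:
    """Return a device string, preferring CUDA, then Apple MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


try:
    import torch

    # torch.backends.mps may be missing on older PyTorch builds, so guard the lookup.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps_backend is not None and mps_backend.is_available())
    device = pick_device(torch.cuda.is_available(), mps_ok)
except ImportError:
    # PyTorch is not installed; fall back to CPU for the purpose of this sketch.
    device = pick_device(False)

print(device)
```

The result could then be passed to `.to(device)` in place of the hard-coded `.cuda()` call.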
Is there an existing issue for this?
Current Behavior
Running the example code fails immediately with the traceback shown at the top of this report.
Expected Behavior
No response
Steps To Reproduce
Contents of requirements.txt
Code in main.py
Environment
Anything else?
No response