Running the README example code with transformers==4.34.1 on macOS raises an error #1505

@Moskize91

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

Running the example code fails immediately with the following error:

Traceback (most recent call last):
  File "/Users/taozeyu/codes/test/tiny_llm/main.py", line 13, in <module>
    main()
  File "/Users/taozeyu/codes/test/tiny_llm/main.py", line 4, in main
    tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/taozeyu/codes/test/tiny_llm/.venv/lib/python3.12/site-packages/transformers/models/auto/tokenization_auto.py", line 738, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/taozeyu/codes/test/tiny_llm/.venv/lib/python3.12/site-packages/transformers/tokenization_utils_base.py", line 2017, in from_pretrained
    return cls._from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/taozeyu/codes/test/tiny_llm/.venv/lib/python3.12/site-packages/transformers/tokenization_utils_base.py", line 2249, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/taozeyu/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b/bf0f5cfb575eebebf9b655c5861177acfee03f16/tokenization_chatglm.py", line 196, in __init__
    super().__init__(
  File "/Users/taozeyu/codes/test/tiny_llm/.venv/lib/python3.12/site-packages/transformers/tokenization_utils.py", line 367, in __init__
    self._add_tokens(
  File "/Users/taozeyu/codes/test/tiny_llm/.venv/lib/python3.12/site-packages/transformers/tokenization_utils.py", line 467, in _add_tokens
    current_vocab = self.get_vocab().copy()
                    ^^^^^^^^^^^^^^^^
  File "/Users/taozeyu/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b/bf0f5cfb575eebebf9b655c5861177acfee03f16/tokenization_chatglm.py", line 248, in get_vocab
    vocab = {self._convert_id_to_token(i): i for i in range(self.vocab_size)}
                                                            ^^^^^^^^^^^^^^^
  File "/Users/taozeyu/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b/bf0f5cfb575eebebf9b655c5861177acfee03f16/tokenization_chatglm.py", line 244, in vocab_size
    return self.sp_tokenizer.num_tokens
           ^^^^^^^^^^^^^^^^^
AttributeError: 'ChatGLMTokenizer' object has no attribute 'sp_tokenizer'. Did you mean: '_tokenize'?
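
Reading the traceback bottom-up, the failure looks like an initialization-order problem: `tokenization_chatglm.py` calls `super().__init__(...)` at line 196, the base class then calls `_add_tokens` and `get_vocab()`, and `get_vocab()` reads `self.sp_tokenizer` before it has been assigned. The sketch below reproduces that pattern with toy classes. `BaseTokenizer`, `ToyVocabMixin`, `BrokenTokenizer`, and `FixedTokenizer` are hypothetical stand-ins, not the real transformers or ChatGLM code, and the "fixed" ordering is an assumption about what a remedy could look like rather than the maintainers' patch.

```python
class BaseTokenizer:
    """Hypothetical stand-in for the library base class."""

    def __init__(self):
        # Mirrors the call chain in the traceback:
        # base __init__ -> _add_tokens -> get_vocab
        self._add_tokens()

    def _add_tokens(self):
        self.current_vocab = self.get_vocab().copy()

    def get_vocab(self):
        raise NotImplementedError


class ToyVocabMixin:
    """Vocabulary helpers that depend on self.sp_tokenizer being set."""

    @property
    def vocab_size(self):
        return len(self.sp_tokenizer)

    def get_vocab(self):
        return {self.sp_tokenizer[i]: i for i in range(self.vocab_size)}


class BrokenTokenizer(ToyVocabMixin, BaseTokenizer):
    def __init__(self):
        super().__init__()              # get_vocab() runs in here ...
        self.sp_tokenizer = ["a", "b"]  # ... before this attribute exists


class FixedTokenizer(ToyVocabMixin, BaseTokenizer):
    def __init__(self):
        self.sp_tokenizer = ["a", "b"]  # assign first (assumed remedy)
        super().__init__()              # get_vocab() can now succeed


if __name__ == "__main__":
    try:
        BrokenTokenizer()
    except AttributeError as exc:
        print("broken ordering:", exc)
    print("fixed ordering:", FixedTokenizer().current_vocab)
```

If this reading is right, the possible remedies would be either a tokenizer file that sets `sp_tokenizer` before calling the base constructor, or a transformers release whose base constructor does not call `get_vocab()` during `__init__`. Neither has been verified in this environment.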

Expected Behavior

No response

Steps To Reproduce

Contents of requirements.txt:

protobuf
transformers==4.34.1
cpm_kernels
torch>=1.10
gradio
mdtex2html
sentencepiece
accelerate

Code in main.py:

from transformers import AutoTokenizer, AutoModel

def main():
  tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
  model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
  model = model.eval()
  response, history = model.chat(tokenizer, "你好", history=[])
  print(response)
  response, history = model.chat(tokenizer, "晚上睡不着应该怎么办", history=history)
  print(response)

if __name__ == "__main__":
  main()

Environment

- OS: macOS 15.3.1
- Python: 3.12.7
- Transformers: 4.34.1
- PyTorch: 2.6.0
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) : False
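
Separately from the tokenizer error, the environment above reports CUDA as unavailable, so the `.half().cuda()` call in main.py would presumably fail next even once the tokenizer loads. A minimal sketch of choosing a device from availability flags follows. `pick_device_and_dtype` is a hypothetical helper, and the pairing of half precision with GPU backends and float32 with CPU is an assumption, not something confirmed in this issue.

```python
def pick_device_and_dtype(cuda_available: bool, mps_available: bool) -> tuple[str, str]:
    """Return a (device, dtype) pair based on which backends are available.

    Hypothetical helper: prefers CUDA, then Apple's MPS backend, then CPU.
    Half precision is assumed for GPU backends and float32 for CPU.
    """
    if cuda_available:
        return "cuda", "float16"
    if mps_available:
        return "mps", "float16"
    return "cpu", "float32"


# Possible usage with PyTorch (not executed here):
#   import torch
#   device, dtype = pick_device_and_dtype(
#       torch.cuda.is_available(),
#       torch.backends.mps.is_available(),
#   )
#   model = model.to(device=device, dtype=getattr(torch, dtype))
```

On the machine described here (CUDA unavailable), the helper would return either the MPS or the CPU pair, depending on whether the MPS backend is available.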

Anything else?

No response
