A repository dedicated to using simple techniques to tease out what language models actually represent as learned knowledge and what they merely memorize from disparate training sources.
All included source code is written and tested in a Python 3.11 environment, though older versions of Python 3 may still function.
All packages imported by the various scripts are listed in requirements.txt and can be installed via `pip3 install -r requirements.txt`.
This repository is designed to work with locally run LMs via the Ollama Python library. Switching to the OpenAI API/Python library (which Ollama also implements, so it can still interface with your local models) is a consideration for the future.
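For reference, querying a locally served model through the Ollama Python library looks roughly like the sketch below. The model name and prompt are placeholders, not the repository's actual configuration.

```python
import ollama

# Ask a locally served model a simple arithmetic question.
# The model name is a placeholder -- substitute whatever model
# you have pulled with `ollama pull <model>`.
response = ollama.chat(
    model="llama3",
    messages=[{"role": "user", "content": "What is 12345 + 67890? Reply with only the number."}],
)
print(response["message"]["content"])
```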
For tokenization we use GPT-2's tokenizer via the HuggingFace Transformers library (it is common to OpenAI models, LLaMA models, and several others; however, it may not be the tokenizer of your model, so update as needed).
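As a rough illustration (not the exact call sites in the scripts), loading GPT-2's tokenizer through Transformers and inspecting how a prompt is split might look like this:

```python
from transformers import AutoTokenizer

# Load GPT-2's tokenizer from the HuggingFace hub.
tokenizer = AutoTokenizer.from_pretrained("gpt2")

prompt = "12345 + 67890 ="
ids = tokenizer.encode(prompt)
print(ids)                                   # token ids
print(tokenizer.convert_ids_to_tokens(ids))  # how the prompt is split into tokens
```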
In the future, better tokenization control (including the option to disable tokenization entirely) will be implemented.
All scripts support command line argument parsing, so each script's specific execution details can be viewed via `python3 <script.py> --help`.
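The argument handling follows the usual argparse pattern; the sketch below is illustrative only, and the flag names shown are hypothetical. Consult each script's `--help` output for the real ones.

```python
import argparse

def parse_args():
    # Hypothetical flags for illustration; the real flags differ per script.
    parser = argparse.ArgumentParser(description="Probe a local LM on arithmetic prompts.")
    parser.add_argument("--model", default="llama3", help="name of the locally served Ollama model")
    parser.add_argument("--trials", type=int, default=100, help="number of prompts to send")
    return parser.parse_args()

if __name__ == "__main__":
    args = parse_args()
    print(args.model, args.trials)
```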
- `lm_math.py` performs the primary investigation using locally run language models.
- `plots.py` produces figures using the outputs generated by `lm_math.py`.
- `special_operation.py` defines an arbitrary function for `lm_math.py` to use as an un-memorizable input to the LM (an illustrative sketch follows below).
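To make the idea of an "un-memorizable" input concrete, such a function might look like the sketch below. This is an invented example, not the function actually defined in `special_operation.py`.

```python
def special_operation(a: int, b: int) -> int:
    """An arbitrary, made-up binary operation on positive integers.

    Because this operation does not appear in any training corpus, a model
    can only answer questions about it by following the stated rule, not by
    recalling a memorized fact.  Illustrative only -- the real definition
    lives in special_operation.py.
    """
    return (3 * a + b) % (a + b + 1)
```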