This repository provides the code, model, and test dataset needed to reproduce the results in the paper "Mol-LLM: Multimodal Generalist Molecular LLM with Improved Graph Utilization", and supports the rebuttal process by providing detailed information.
For reproducibility, the model checkpoints and test dataset are available via GDrive. The corresponding model card and dataset cards are available on Huggingface, although only the test dataset can be downloaded there. After acceptance, the model checkpoints and the train and test datasets will be released via Huggingface.
- Mol-LLM [GDrive][Huggingface]
- Mol-LLM (w/o Graph) [GDrive][Huggingface]
- Testset [GDrive][Huggingface]
After downloading the checkpoints and test dataset, adjust the path to each file by following the instructions in the Installation section.
For easy and fast reproduction, all environments are built with Docker and a Makefile.
- Build the `docker` image using the `Makefile`: `make build-image`
- Before initializing the `docker` container, set the following volume mounting paths in the `Makefile`:
  - `REPO_PATH=/home/{user_name}/text-mol`: The path of the repository
  - `CACHE_PATH=/home/{user_name}/.cache`: Huggingface cache path
  - `IMAGE_NAME_TAG={user_name}/mol-llm:v1`: The name of the built docker image
- Finally, initialize the `docker` container using the `Makefile`: `make init-container`
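As an illustration of the `Makefile` settings above, the following minimal sketch fills in the `{user_name}` placeholder and prints the three assignments. The user name `alice` and the shell variable `makefile_vars` are hypothetical and not part of the repository; substitute your own login name and copy the printed lines into the `Makefile`.

```shell
# Assumption: "alice" is a stand-in for your own user name.
user_name="alice"

# Fill the {user_name} placeholder in the three values listed in this README.
makefile_vars="REPO_PATH=/home/${user_name}/text-mol
CACHE_PATH=/home/${user_name}/.cache
IMAGE_NAME_TAG=${user_name}/mol-llm:v1"

# Print the assignments so they can be copied into the Makefile.
printf '%s\n' "$makefile_vars"
```

Once the `Makefile` holds these values, `make build-image` and `make init-container` are run as described above.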
- To reproduce the performance of `Mol-LLM` in Main Tables 1-4, run the following command: `bash /text-mol/Mol-LLM/bashes/mol-llm_test.sh "'{your_gpu_devices}'"`
  - For example, if you want to run evaluation with `GPU=0,1`, then input `your_gpu_devices=0,1`
- To reproduce the performance of `Mol-LLM (w/o Graph)` in Main Tables 1-4, run the following command: `bash /text-mol/Mol-LLM/bashes/mol-llm_wo_graph_test.sh "'{your_gpu_devices}'"`
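The nested quoting in the commands above is easy to get wrong, so here is a minimal sketch of how the placeholder expands for GPUs 0 and 1. The shell variable `gpu_arg` is introduced only for illustration; the snippet prints the two commands rather than running them, since the scripts exist only inside the container.

```shell
# Assumption: GPUs 0 and 1 are the devices you want to evaluate on.
your_gpu_devices="0,1"

# The README commands wrap the device list in single quotes inside double quotes,
# so the argument passed to the script is '0,1' with the single quotes included.
gpu_arg="'${your_gpu_devices}'"

# Print both evaluation commands with the placeholder filled in.
echo "bash /text-mol/Mol-LLM/bashes/mol-llm_test.sh \"${gpu_arg}\""
echo "bash /text-mol/Mol-LLM/bashes/mol-llm_wo_graph_test.sh \"${gpu_arg}\""
```

Inside the container, the printed lines can be pasted and run as they are.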