[📖Paper]   [🤗ReasonMed Dataset]
[🤗ReasonMed-7B model]   [🤗CoTMed-7B model]   [🤗ResponseMed-7B model]
Table of Contents
- Introduction
- Installation
- Modules
- 3.1 Generate CoTs
- 3.2 Evaluate CoTs
- 3.3 Quality Ranker
- 3.4 Error Refiner
- 3.5 Diff Optimizer
- 3.6 Response Summarizer
- 3.7 Score Evaluator
- Example Pipeline
- 4.1 Easy Pipeline
- 4.2 Medium Pipeline
- 4.3 Difficult Pipeline
- Conclusion
ReasonMed is a comprehensive multi-agent-generated dataset designed to advance medical reasoning capabilities. It ships with tools and modules for generating, validating, optimizing, ranking, summarizing, and evaluating Chain-of-Thought (CoT) responses in the medical domain. ReasonMed's goal is to help researchers and practitioners improve and assess medical reasoning for clinical decision-making.
This README provides an overview of ReasonMed's core functionality, installation instructions, usage examples, and how to integrate each module into your medical reasoning workflow.
To get started, clone the ReasonMed repository and set up the environment:

```bash
git clone https://github.com/YuSun-Work/ReasonMed.git
cd ReasonMed
conda create -n reasonmed python=3.11 -y
conda activate reasonmed
pip install -r requirements.txt
```

Note: Ensure that you have access to the models or endpoints mentioned for inference in each script.
This module generates multiple Chain-of-Thought (CoT) responses from three different models, each generating three CoTs for a given question.
```bash
python generate_9cot.py \
  --data_path /path/to/question.json \
  --model_path1 /path/to/model1 \
  --model_path2 /path/to/model2 \
  --model_path3 /path/to/model3 \
  --json_path /path/to/save_cot.json
```

Example input (`question.json`):

```json
[
  {
    "question": "Chronic urethral obstruction due to benign prostatic hyperplasia can lead to the following change in kidney parenchyma",
    "options": [
      "Hyperplasia",
      "Hypertrophy",
      "Atrophy",
      "Dysplasia"
    ]
  }
]
```

- `data_path`: Path to the JSON file containing clinical questions and multiple-choice options.
- `model_path1`, `model_path2`, `model_path3`: Paths to the three models used for generating CoTs.
- `json_path`: Path to save the generated CoTs.
The generated CoTs will be saved in a specified JSON file.
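The exact output schema may vary between versions; assuming each entry pairs a question with its list of generated CoTs (the `question` and `cots` field names below are illustrative, not taken from the script), the saved file can be round-tripped and inspected like this:

```python
import json

# Illustrative record shaped like the generator's output; the real field
# names may differ -- check the save_cot.json produced by generate_9cot.py.
records = [
    {
        "question": "Chronic urethral obstruction ... change in kidney parenchyma",
        "cots": [f"CoT {i} ..." for i in range(9)],  # 3 models x 3 CoTs each
    }
]

# Write and re-read to mimic the save/load round trip.
with open("save_cot.json", "w") as f:
    json.dump(records, f, indent=2)

with open("save_cot.json") as f:
    loaded = json.load(f)

for entry in loaded:
    print(entry["question"][:40], "->", len(entry["cots"]), "CoTs")
```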
The intermediate.json file stores results after each stage, which is useful for debugging or troubleshooting.
This script validates the generated CoTs by verifying their correctness against clinical reasoning.
```bash
python verifier.py \
  --input_json /path/to/save_cot.json \
  --model_path /path/to/eval_model
```

- `input_json`: Path to the JSON file containing generated CoTs.
- `model_path`: Path to the model used for evaluating the CoTs.
Validates the CoTs and outputs a verdict (e.g., Correct, Error).
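Downstream stages can branch on this verdict, sending correct CoTs to the ranker and erroneous ones to the error refiner. A minimal sketch (the `verdict` key on each record is an assumption about the verifier's output format, not confirmed by the script):

```python
# Split verified CoTs by verdict: "Correct" ones proceed to ranking,
# "Error" ones go to the error refiner. The "verdict" key is an
# assumption about verifier.py's output, used here for illustration.
cots = [
    {"cot": "Step-by-step reasoning A ...", "verdict": "Correct"},
    {"cot": "Step-by-step reasoning B ...", "verdict": "Error"},
    {"cot": "Step-by-step reasoning C ...", "verdict": "Correct"},
]

correct = [c for c in cots if c["verdict"] == "Correct"]
to_refine = [c for c in cots if c["verdict"] == "Error"]

print(len(correct), "correct,", len(to_refine), "to refine")
```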
The quality ranker ranks the CoTs generated for each clinical question, keeping the top two most valid CoTs.
```bash
python quality_ranker.py \
  --input_json /path/to/save_cot.json \
  --model_path /path/to/eval_model \
  --intermediate_file /path/to/intermediate.json \
  --final_output /path/to/final_results.json
```

- `input_json`: The JSON file with the CoTs to be evaluated and ranked.
- `model_path`: Path to the model used for ranking.
- `intermediate_file`: Path to save intermediate ranking results.
- `final_output`: Path to save the final ranked CoTs.
Ranks the CoTs and saves the best two CoTs per clinical question.
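The "keep the best two" step amounts to sorting candidates by the ranker's score and truncating. A sketch with an illustrative `score` field (the actual ranking signal used by quality_ranker.py may differ):

```python
def top_two(cots):
    """Keep the two highest-scoring CoTs (ties broken by list order)."""
    return sorted(cots, key=lambda c: c["score"], reverse=True)[:2]

# Hypothetical ranker output for one clinical question.
candidates = [
    {"cot": "A", "score": 7.5},
    {"cot": "B", "score": 9.0},
    {"cot": "C", "score": 8.2},
]

best = top_two(candidates)
print([c["cot"] for c in best])  # -> ['B', 'C']
```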
This module refines CoTs that have errors or incomplete reasoning by leveraging error feedback to improve the reasoning process.
```bash
python error_refiner_openai.py \
  --input_json /path/to/save_cot.json \
  --api_key /path/to/api_key \
  --azure_endpoint /path/to/azure_endpoint \
  --model /path/to/refine_model \
  --output_json /path/to/refined_output.json
```

- `input_json`: Path to the JSON file with generated CoTs to be refined.
- `api_key`: Your Azure OpenAI API key.
- `azure_endpoint`: The endpoint for the Azure OpenAI API.
- `model`: Path to the refinement model.
- `output_json`: Path to save the refined CoTs.
Refined CoTs that incorporate error corrections from previous iterations.
This module performs advanced optimizations on the CoTs using the Azure OpenAI API. It focuses on deep reasoning improvements based on detailed feedback.
```bash
python diff_opti.py \
  --input_json /path/to/save_cot.json \
  --output_json /path/to/optimized_cot.json \
  --api_key /path/to/api_key \
  --azure_endpoint /path/to/azure_endpoint \
  --model /path/to/optimize_model
```

- `input_json`: Path to the JSON file with CoTs to be optimized.
- `output_json`: Path to save the optimized CoTs.
- `api_key`: Your Azure OpenAI API key.
- `azure_endpoint`: The Azure OpenAI endpoint.
- `model`: The model used for deep optimization.
Optimized CoTs that have undergone deeper analysis and improvement.
This module generates concise summaries for each CoT, transforming verbose reasoning into a one-sentence explanation.
```bash
python response_summarizer.py \
  --input_json /path/to/save_cot.json \
  --model /path/to/summary_model \
  --azure_endpoint /path/to/azure_endpoint \
  --api_key /path/to/api_key \
  --results_file /path/to/summaries.json
```

- `input_json`: Path to the JSON file with CoTs to be summarized.
- `model`: The model used for summarization.
- `azure_endpoint`: The endpoint for the Azure OpenAI API.
- `api_key`: Your Azure OpenAI API key.
- `results_file`: Path to save the summarized CoTs.
A JSON file with concise summaries for each CoT.
This module evaluates the clinical accuracy of CoTs based on multiple criteria and generates scores for each CoT.
```bash
python score_evaluator.py \
  --input_jsons /path/to/save_cot.json \
  --model /path/to/score_model \
  --azure_endpoint /path/to/azure_endpoint \
  --api_key /path/to/api_key \
  --final_output /path/to/scores.json
```

- `input_jsons`: The input JSON file with CoTs to be evaluated.
- `model`: Path to the model used for scoring.
- `azure_endpoint`: The endpoint for the Azure OpenAI API.
- `api_key`: Your Azure OpenAI API key.
- `final_output`: Path to save the final evaluation scores.
Scores for each CoT based on clinical accuracy, reasoning, and completeness.
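Aggregating the per-criterion scores into one number per CoT might look like the following sketch; the criterion names and the equal-weight average are assumptions for illustration, not the script's actual rubric:

```python
def aggregate(scores):
    """Average per-criterion scores into a single quality score.

    Equal weighting is an assumption; score_evaluator.py may weight
    criteria differently.
    """
    return sum(scores.values()) / len(scores)

# Hypothetical per-criterion scores for one CoT.
evaluation = {
    "clinical_accuracy": 9.0,
    "reasoning": 8.0,
    "completeness": 7.0,
}

overall = aggregate(evaluation)
print(f"overall score: {overall:.2f}")  # -> overall score: 8.00
```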
Easy pipeline — generate and evaluate CoTs:

```bash
python generate_9cot.py --data_path /path/to/question.json --model_path1 /path/to/model1 --model_path2 /path/to/model2 --model_path3 /path/to/model3 --json_path /path/to/save_cot.json
python verifier.py --input_json /path/to/save_cot.json --model_path /path/to/eval_model
```

Medium pipeline — generate, evaluate, rank, and refine CoTs:

```bash
python generate_9cot.py --data_path /path/to/question.json --model_path1 /path/to/model1 --model_path2 /path/to/model2 --model_path3 /path/to/model3 --json_path /path/to/save_cot.json
python verifier.py --input_json /path/to/save_cot.json --model_path /path/to/eval_model
python quality_ranker.py --input_json /path/to/save_cot.json --model_path /path/to/eval_model --intermediate_file /path/to/intermediate.json --final_output /path/to/final_results.json
python error_refiner_openai.py --input_json /path/to/save_cot.json --api_key /path/to/api_key --azure_endpoint /path/to/azure_endpoint --model /path/to/refine_model --output_json /path/to/refined_output.json
```

Difficult pipeline — adds advanced optimization:

```bash
python generate_9cot.py --data_path /path/to/question.json --model_path1 /path/to/model1 --model_path2 /path/to/model2 --model_path3 /path/to/model3 --json_path /path/to/save_cot.json
python verifier.py --input_json /path/to/save_cot.json --model_path /path/to/eval_model
python quality_ranker.py --input_json /path/to/save_cot.json --model_path /path/to/eval_model --intermediate_file /path/to/intermediate.json --final_output /path/to/final_results.json
python error_refiner_openai.py --input_json /path/to/save_cot.json --api_key /path/to/api_key --azure_endpoint /path/to/azure_endpoint --model /path/to/refine_model --output_json /path/to/refined_output.json
python diff_opti.py --input_json /path/to/save_cot.json --output_json /path/to/optimized_cot.json --api_key /path/to/api_key --azure_endpoint /path/to/azure_endpoint --model /path/to/optimize_model
```

Stay tuned
ReasonMed provides an integrated framework for generating, optimizing, validating, and evaluating medical Chain-of-Thought responses. This comprehensive pipeline is crucial for advancing AI-powered clinical reasoning and decision-making models.
```bibtex
@misc{sun2025reasonmed370kmultiagentgenerated,
      title={ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning},
      author={Yu Sun and Xingyu Qian and Weiwen Xu and Hao Zhang and Chenghao Xiao and Long Li and Yu Rong and Wenbing Huang and Qifeng Bai and Tingyang Xu},
      year={2025},
      eprint={2506.09513},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2506.09513},
}
```