CogStack

CogStack and related projects

Unlock the power of healthcare data with CogStack AI.


Website · Documentation · Community Forum

This GitHub organization hosts the open-source code for CogStack AI, including our core NLP pipelines, data workflows, and research tools.

  • Improve productivity and reduce clinical risk through better data with Deep Phenotypes and Search
  • Free up staff time through health data insights from CogStack Language Models and Generative AI
  • Works with any digital health record system and interoperates with international data standards

Key Projects

CogStack NLP: NLP pipelines for healthcare, including the Medical Concept Annotation Tool (MedCAT) for entity recognition and linking.
It powers downstream analytics by structuring unstructured clinical data.
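
To make "entity recognition and linking" concrete, here is a minimal, self-contained sketch of the idea: find mentions in free text and link each one to a concept identifier, producing structured records. This is not MedCAT's API or algorithm. The vocabulary, the concept IDs (`C0001`, `C0002`), and the names `VOCAB`, `Annotation`, and `annotate` are all hypothetical placeholders chosen for illustration, and the matching is a simple dictionary lookup.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical mini-vocabulary: surface form -> (concept ID, preferred name).
# The IDs are made-up placeholders, not real terminology codes.
VOCAB = {
    "heart attack": ("C0001", "Myocardial infarction"),
    "myocardial infarction": ("C0001", "Myocardial infarction"),
    "high blood pressure": ("C0002", "Hypertension"),
    "hypertension": ("C0002", "Hypertension"),
}


@dataclass(frozen=True)
class Annotation:
    start: int           # character offset where the mention begins
    end: int             # character offset just past the mention
    text: str            # the mention exactly as it appears in the note
    concept_id: str      # linked concept identifier (placeholder)
    preferred_name: str  # canonical name of the linked concept


def annotate(text: str) -> List[Annotation]:
    """Find vocabulary terms in `text` and link each to its concept."""
    annotations: List[Annotation] = []
    # Try longer terms first so multi-word mentions win over shorter overlaps.
    for term in sorted(VOCAB, key=len, reverse=True):
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            # Skip any span that overlaps an already accepted mention.
            if any(m.start() < a.end and a.start < m.end() for a in annotations):
                continue
            concept_id, name = VOCAB[term]
            annotations.append(
                Annotation(m.start(), m.end(), m.group(0), concept_id, name)
            )
    # Return mentions in reading order.
    return sorted(annotations, key=lambda a: a.start)


note = "Patient has a history of hypertension and a previous heart attack."
for ann in annotate(note):
    print(ann.start, ann.end, ann.text, ann.concept_id, ann.preferred_name)
```

Each printed row is a structured record (offsets, mention, concept) that downstream analytics can query, which is the sense in which unstructured text becomes structured. A real pipeline would also need to handle ambiguity, negation, and spelling variation, none of which this sketch attempts.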


CogStack-NiFi: A collection of example workflows using Apache NiFi as the core orchestration engine.
These recipes connect NLP components and data services, supporting tasks such as text extraction, natural language processing, and secure routing of clinical documents.


CogStack ModelServe (CMS) is a model-serving and model-governance system created for a range of CogStack NLP tasks. Targeting language models with NER and entity-linking capabilities, CMS provides a single place for serving and fine-tuning models, managing the training lifecycle, and monitoring with end-to-end observability.


Foresight: Research framework for deep generative modeling of patient timelines using Electronic Health Records (EHRs).
Enables exploration of predictive modeling and risk stratification through modern generative techniques.


OpenGPT: Framework for building grounded instruction-based datasets and training domain-specific LLMs.
Focused on safe, explainable, and healthcare-aligned conversational AI.


🤝 Get Involved

We welcome contributions from clinicians, researchers, and developers passionate about healthcare AI.

  • Explore our repos and open issues
  • Share your ideas via discussions or feature requests
  • Join the mission to make healthcare data truly interoperable and intelligent

📄 License

Each project has its own license. Please check the relevant repository for details.

Pinned

  1. CogStack-NiFi (Public)

    Building data processing pipelines for document processing with NLP using Apache NiFi and related services

    Python · 54 stars · 19 forks

  2. MedCAT (Public archive)

    Medical Concept Annotation Tool

    Python · 511 stars · 112 forks

  3. MedCATtrainer (Public archive)

    A simple interface to inspect, improve and add concepts to biomedical NER+L -> MedCAT.

    Python · 85 stars · 36 forks

  4. Foresight (Public)

    Deep Generative Modelling of Patient Timelines using Electronic Health Records

    Jupyter Notebook · 64 stars · 11 forks

  5. OpenGPT (Public)

    A framework for creating grounded instruction-based datasets and training conversational domain expert Large Language Models (LLMs).

    Jupyter Notebook · 360 stars · 44 forks

Repositories

Showing 10 of 38 repositories
  • CogStack-NiFi (Public)

    Building data processing pipelines for document processing with NLP using Apache NiFi and related services

    Python · 54 stars · 19 forks · 0 issues · 0 pull requests · Updated Oct 9, 2025
  • cogstack-jupyter-hub (Public)

    Custom Jupyter Hub Docker image with example notebooks

    Jupyter Notebook · 3 stars · 1 fork · 0 issues · 0 pull requests · Updated Oct 8, 2025
  • cogstack-nlp (Public)

    CogStack NLP, including the Medical Concept Annotation Tool MedCAT

    HTML · 12 stars · Apache-2.0 · 3 forks · 0 issues · 7 pull requests · Updated Oct 8, 2025
  • CogStack-ModelServe (Public)

    A model serving and governance system for CogStack NLP solutions

    Python · 1 star · Apache-2.0 · 2 forks · 1 issue (1 issue needs help) · 3 pull requests · Updated Oct 6, 2025
  • cogstack-platform-toolkit (Public)

    CogStack Platform-wide code, documentation and tooling.

    HCL · 0 stars · 0 forks · 0 issues · 0 pull requests · Updated Sep 23, 2025
  • cogstack-model-gateway (Public)

    A gateway for accessing CogStack ModelServe instances

    Python · 2 stars · Apache-2.0 · 0 forks · 0 issues · 0 pull requests · Updated Sep 22, 2025
  • working_with_cogstack (Public)

    Basic setup and easy-to-follow templates for data analysts to interact with and search CogStack

    Python · 12 stars · Apache-2.0 · 8 forks · 0 issues · 2 pull requests · Updated Sep 18, 2025
  • .github (Public)

    0 stars · 0 forks · 0 issues · 0 pull requests · Updated Sep 5, 2025
  • MedCATtrainer (Public archive)

    A simple interface to inspect, improve and add concepts to biomedical NER+L -> MedCAT.

    Python · 85 stars · Apache-2.0 · 36 forks · 18 issues · 5 pull requests · Updated Sep 4, 2025
  • ocr-service (Public)

    Python · 4 stars · 2 forks · 0 issues · 0 pull requests · Updated Aug 28, 2025
