Retico is an open-source framework for building state-of-the-art incremental processing systems. This Python package bundles the functionality of the major supported Retico modules and makes them easily accessible.
"Incremental" means that the system processes inputs (e.g., speech recognition) at a fine-grained level, typically word-by-word. Large language models like ChatGPT allow a user to type a full input before submitting it to the chatbot, but when humans speak or write to each other, they produce and comprehend language incrementally, word-by-word. Not all practical systems need to function incrementally, but some would benefit from word-level processing. For example, a recent NSF-sponsored workshop report on spoken interaction with robots recommends that systems and modules should work in real-time to enable them to be more natural and responsive.
Retico is based on the Incremental Unit model of incremental dialogue processing. You can read about the basics of incremental processing. A typical Retico system is made up of processing modules. For example, a system could consist of five modules: a speech recognizer, a language understander, a dialogue manager, a natural language generator, and a speech synthesizer. An "Incremental Unit" (IU) is a piece of information that passes between modules. For example, the speech recognizer packages each recognized word as an IU and outputs it to the language understander. The language understander takes in the speech recognition IU, interprets the user's intent, and sends an IU containing that intent to the dialogue manager, and so on.
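The flow described above can be sketched in a few lines of plain Python. This is only a toy illustration of modules passing IUs along a chain, not the retico_core API: `ToyIU`, `ToyModule`, `guess_intent`, and the `"add"` update label are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# NOTE: every name in this block is hypothetical and only illustrates the idea;
# none of it is taken from the retico_core API.


@dataclass
class ToyIU:
    payload: str
    grounded_in: Optional["ToyIU"] = None  # the IU this one was derived from


# An update pairs an IU with an update type; "add" is the only type used here.
Update = Tuple[ToyIU, str]


class ToyModule:
    """Transforms each incoming IU and forwards the result to its subscribers."""

    def __init__(self, name: str, transform: Callable[[str], str]):
        self.name = name
        self.transform = transform
        self.subscribers: List["ToyModule"] = []
        self.received: List[ToyIU] = []

    def subscribe(self, other: "ToyModule") -> None:
        self.subscribers.append(other)

    def process(self, updates: List[Update]) -> None:
        out: List[Update] = []
        for iu, update_type in updates:
            self.received.append(iu)
            # The new IU remembers which IU it was built from.
            new_iu = ToyIU(self.transform(iu.payload), grounded_in=iu)
            out.append((new_iu, update_type))
        for sub in self.subscribers:
            sub.process(out)


def guess_intent(word: str) -> str:
    return "greeting" if word in {"hello", "hi"} else "unknown"


asr = ToyModule("asr", str.lower)
nlu = ToyModule("nlu", guess_intent)
dm = ToyModule("dm", lambda intent: f"respond_to:{intent}")
asr.subscribe(nlu)
nlu.subscribe(dm)

# Feed the chain one word at a time instead of one full utterance.
for word in ["Hello", "robot"]:
    asr.process([(ToyIU(word), "add")])

intents = [iu.payload for iu in dm.received]
print(intents)
```

Each word reaches the last module before the next word is fed in, which is the property that distinguishes incremental processing from waiting for a complete utterance.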
Other incremental processing frameworks exist, such as InproTK, which is written in Java.
The development of Retico is partially supported by the National Science Foundation.
Some of the modules are available on PyPI. At a minimum, you need retico_core (documentation). Individual modules have more information about their respective installation requirements.
To get a quick system up and running, run the following in a new Python environment (tested with Python 3.9 to 3.13):
pip install git+https://github.com/retico-team/retico-core git+https://github.com/retico-team/retico-googleasr git+https://github.com/retico-team/retico-huggingfacelm git+https://github.com/retico-team/retico-speechbraintts
Then download the example runner.py file and run python runner.py. This sample system is a simple chatbot that uses Retico modules, so ask it a simple question and it should give you an answer!
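If the runner fails to start, a quick check of which packages Python can actually locate may help narrow things down. The sketch below uses only the standard library. `retico_core` is the name given above; the other three import names are guesses derived from the repository names and may differ from the real ones.

```python
import importlib.util

# "retico_core" is named in the text above. The remaining import names are
# assumptions based on the repository names and may not match the real packages.
CANDIDATE_PACKAGES = [
    "retico_core",
    "retico_googleasr",
    "retico_huggingfacelm",
    "retico_speechbraintts",
]


def installed_packages(names):
    """Map each top-level package name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


status = installed_packages(CANDIDATE_PACKAGES)
for name, found in status.items():
    print(f"{name}: {'found' if found else 'missing'}")
```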
For more details about installation and development, click below to watch a tutorial on installing Retico, building your first system, and understanding how modules and incremental units work.
(Hold Ctrl and click the link to open it in a new tab.)
- VQA github (live VQA with multiple camera streams)
- Emotion Tracking github
- Language Practice github
- Argumentation Tracking github
- Retico Agent-VQA github (a Retico agent that integrates Huggingface Agents tools)
- Simple Retico Agent documentation
- Game of NIM with Misty II Robot with RL github
- Robot "Tutor" with Misty II Robot github
- Compare two Object Recognition Models github
- Cozmo on CoppeliaSim github
from retico import *
# imports for other modules
def callback(update_msg):
    for x, ut in update_msg:
        print(f"{ut}: {x.text} ({x.stability}) - {x.final}")
m1 = MicrophoneModule()
m2 = Wav2VecModule()
m3 = TextDispatcherModule()
m4 = GoogleTTSModule("en-US", "en-US-Wavenet-A")
m5 = SpeakerModule()
m6 = CallbackModule(callback)
m1.subscribe(m2)
m2.subscribe(m3)
m3.subscribe(m4)
m4.subscribe(m5)
m2.subscribe(m6)
run(m1)
input()
stop(m1)

Some individual modules have visualization tools. We are currently working on a system-level visualization tool.
There are three options for loggers:
- The Platform for Situated Intelligence (psi) is a powerful framework for building complex AI systems, written in C#. Logging to psi is straightforward using the retico-zeromq module.
- Full logging of all incremental units can be done directly with the Articulab fork of retico.
- Simple logging into a JSON file
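As a rough idea of what simple JSON logging could look like, the sketch below builds a callback that appends one JSON line per incremental unit. It is not Retico's own logger: `make_json_logger` and the record fields are hypothetical, and the `SimpleNamespace` objects stand in for real IUs. It assumes the update message is an iterable of (IU, update type) pairs, as in the callback of the example system.

```python
import json
import os
import tempfile
import time
from types import SimpleNamespace


def make_json_logger(path):
    """Return a callback that appends one JSON line per (IU, update type) pair.

    Hypothetical helper for illustration; not part of Retico.
    """

    def log_update(update_msg):
        with open(path, "a", encoding="utf-8") as f:
            for iu, update_type in update_msg:
                record = {
                    "time": time.time(),
                    "update_type": str(update_type),
                    "iu_type": type(iu).__name__,
                    # Not every IU carries text, so fall back to None.
                    "text": getattr(iu, "text", None),
                }
                f.write(json.dumps(record) + "\n")

    return log_update


# Stand-in update messages; a real system would deliver actual IUs.
log_path = os.path.join(tempfile.mkdtemp(), "ius.jsonl")
log = make_json_logger(log_path)
log([(SimpleNamespace(text="hello"), "add")])
log([(SimpleNamespace(text="hello world"), "add")])

with open(log_path, encoding="utf-8") as f:
    records = [json.loads(line) for line in f]
print([r["text"] for r in records])
```

Because the returned function has the same shape as the `callback` in the example system, it could presumably be handed to a `CallbackModule` in the same way. Writing one JSON object per line keeps the file appendable while the system is still running.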
Research that uses Retico
Michael, T. (2023). Simulating Conversations for the Prediction of Speech Quality. Springer International Publishing AG.
Thilo Michael. 2020. Retico: An incremental framework for spoken dialogue systems. In Proceedings of the 21th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 49–52, 1st virtual meeting. Association for Computational Linguistics.
Thilo Michael and Sebastian Möller. 2020. Simulating Turn-Taking in Conversations with Delayed Transmission. In Proceedings of the 21th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 157–161, 1st virtual meeting. Association for Computational Linguistics.
The SLIM Lab maintains many of the module repositories in retico-team.
Research that uses Retico
Henry, C., & Kennington, C. (2024). Unsupervised, Bottom-up Category Discovery for Symbol Grounding with a Curious Robot. arXiv preprint arXiv:2404.03092.
Whetten, R., Levandovsky, E., Imtiaz, M. T., & Kennington, C. (2023, August). Evaluating Automatic Speech Recognition and Natural Language Understanding in an Incremental Setting. In Proceedings of the 27th Workshop on the Semantics and Pragmatics of Dialogue (SemDial). Maribor, Slovenia.
Josue Torres-Fonseca, Catherine Henry, and Casey Kennington. 2022. Symbol and Communicative Grounding through Object Permanence with a Mobile Robot. In Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 124–134, Edinburgh, UK. Association for Computational Linguistics.
Imtiaz, M.T., Kennington, C. (2022). Incremental Unit Networks for Distributed, Symbolic Multimodal Processing and Representation. In: Duffy, V.G. (eds) Digital Human Modeling and Applications in Health, Safety, Ergonomics and Risk Management. Health, Operations Management, and Design. HCII 2022. Lecture Notes in Computer Science, vol 13320. Springer, Cham.
Casey Kennington, Daniele Moro, Lucas Marchand, Jake Carns, and David McNeill. 2020. rrSDS: Towards a Robot-ready Spoken Dialogue System. In Proceedings of the 21th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 132–135, 1st virtual meeting. Association for Computational Linguistics.
@inproceedings{michael-2020-retico,
title = "Retico: An incremental framework for spoken dialogue systems",
author = "Michael, Thilo",
booktitle = "Proceedings of the 21th Annual Meeting of the Special Interest Group on Discourse and Dialogue",
month = jul,
year = "2020",
address = "1st virtual meeting",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2020.sigdial-1.6",
doi = "10.18653/v1/2020.sigdial-1.6",
pages = "49--52"
}
If you use any of the multimodal modules (vision, robots, etc.), please also cite this paper:
@inproceedings{manaseryan-etal-2025-rrsds,
title = "rr{SDS} 2.0: Incremental, Modular, Distributed, Multimodal Spoken Dialogue with Robotic Platforms",
author = "Manaseryan, Anna and
Rigby, Porter and
Matthews, Brooke and
Henry, Catherine and
Torres-Fonseca, Josue and
Whetten, Ryan and
Levandovsky, Enoch and
Kennington, Casey",
editor = "B{\'e}chet, Fr{\'e}d{\'e}ric and
Lef{\`e}vre, Fabrice and
Asher, Nicholas and
Kim, Seokhwan and
Merlin, Teva",
booktitle = "Proceedings of the 26th Annual Meeting of the Special Interest Group on Discourse and Dialogue",
month = aug,
year = "2025",
address = "Avignon, France",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.sigdial-1.51/",
pages = "637--640",
abstract = "This demo will showcase updates made to the `robot-ready spoken dialogue system' built on the Retico framework. Updates include new modules, logging and real-time monitoring tools, integrations with the Coppelia Sim virtual robot platform, integrations with a benchmark, improved documentation, and pypi environment usage."
}