2 changes: 1 addition & 1 deletion doc/source/examples/ai_podcast.rst
@@ -76,4 +76,4 @@ Chinese (AI_Podcast_ZH.py)

* `AI_Podcast <https://github.com/xorbitsai/inference/blob/main/examples/AI_podcast.py>`_ (English Version)

* AI_Podcast_ZH (Chinese Version)
* `AI_Podcast_ZH <https://github.com/xorbitsai/inference/blob/main/examples/AI_podcast_ZH.py>`_ (Chinese Version)
8 changes: 4 additions & 4 deletions doc/source/examples/chatbot.rst
@@ -1,12 +1,12 @@
.. _examples_chatbot:

====================
Example: chatbot 🤖️
====================
=======================
Example: CLI chatbot 🤖️
=======================

**Description**:

Demonstrate how to interact with Xinference to play with LLM chat functionality with an AI agent 💻
Demonstrate how to interact with Xinference to use LLM chat functionality with an AI agent on the command line 💻
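
For illustration, a minimal sketch of such a command-line chat loop, assuming a Xinference server at the default local endpoint; the model name and launch defaults are illustrative, not the example's exact code:

.. code-block:: python

   # Minimal sketch, assuming a running Xinference server; the model name
   # and launch defaults here are illustrative assumptions.
   from xinference.client import Client

   client = Client("http://localhost:9997")
   model = client.get_model(client.launch_model(model_name="llama-2-chat"))

   history = []
   while True:
       prompt = input("you: ")
       completion = model.chat(prompt, chat_history=history)
       reply = completion["choices"][0]["message"]["content"]
       # Keep the conversation so the model has context for the next turn.
       history.append({"role": "user", "content": prompt})
       history.append({"role": "assistant", "content": reply})
       print("assistant:", reply)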

**Used Technology**:

30 changes: 30 additions & 0 deletions doc/source/examples/gradio_chatinterface.rst
@@ -0,0 +1,30 @@
.. _examples_gradio_chatinterface:

================================
Example: Gradio ChatInterface 🤗
================================

**Description**:

This example showcases how to build a chatbot in 120 lines of code using Gradio ChatInterface and a local LLM served by Xinference.

**Used Technology**:

@ `Xinference <https://github.com/xorbitsai/inference>`_ as an LLM hosting service

@ `Gradio <https://github.com/gradio-app/gradio>`_ as a web interface for the chatbot

**Detailed Explanation of the Demo Functionality** (see the sketch after this list):

* Parse user-provided command line arguments to capture essential model parameters such as model name, size, format, and quantization.

* Establish a connection to the Xinference framework and deploy the specified model, ensuring it's ready for real-time interactions.

* Implement helper functions (``flatten`` and ``to_chat``) to efficiently handle and store chat interactions, ensuring the model has context for generating relevant responses.

* Set up an interactive chat interface using Gradio, allowing users to communicate with the model in a user-friendly environment.

* Activate the Gradio web interface, enabling users to start their chat sessions and receive model-generated responses based on their queries.
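
Putting these steps together, a minimal sketch of the flow; the endpoint, model name and launch parameters are illustrative assumptions (the real example parses them from the command line):

.. code-block:: python

   # Minimal sketch of the flow described above; endpoint, model name and
   # launch parameters are illustrative assumptions.
   import gradio as gr
   from xinference.client import Client

   client = Client("http://localhost:9997")
   model_uid = client.launch_model(
       model_name="llama-2-chat",
       model_size_in_billions=7,
       model_format="ggmlv3",
       quantization="q4_0",
   )
   model = client.get_model(model_uid)

   def to_chat(history):
       # Flatten Gradio's [user, assistant] pairs into the role/content
       # messages the model expects as chat history.
       messages = []
       for user_msg, assistant_msg in history:
           messages.append({"role": "user", "content": user_msg})
           messages.append({"role": "assistant", "content": assistant_msg})
       return messages

   def predict(message, history):
       response = model.chat(prompt=message, chat_history=to_chat(history))
       return response["choices"][0]["message"]["content"]

   gr.ChatInterface(predict).launch()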

**Source Code**:
* `Gradio ChatInterface <https://github.com/xorbitsai/inference/blob/main/examples/gradio_chatinterface.py>`_
48 changes: 47 additions & 1 deletion doc/source/examples/index.rst
@@ -9,4 +9,50 @@ Examples
:hidden:

ai_podcast
chatbot
chatbot
gradio_chatinterface
pdf_chatbot

Here you can find examples and resources to learn how to use Xinference.

Examples
========

End-to-end examples of using Xinference for various tasks:

* `Voice Conversations with AI Agents on M2 Max <ai_podcast.html>`_

* `Interacting with LLMs: A Command-Line Example <chatbot.html>`_

* `Interacting with LLMs: A Gradio ChatInterface Example <gradio_chatinterface.html>`_

* `PDF Chatbot with Local LLM and Embeddings <pdf_chatbot.html>`_

If you come across other examples in your own workflows, we encourage you to contribute a `PR <https://github.com/xorbitsai/inference/pulls>`_!


Tutorials
=========

The following tutorials cover the basics of using Xinference in different scenarios:

* `Build a QA Application with Xinference and LangChain <https://github.com/RayJi01/Xprobe_inference/blob/main/examples/LangChain_QA.ipynb>`_

* `Using Xinference local LLMs within LlamaIndex <https://gpt-index.readthedocs.io/en/stable/examples/llm/XinferenceLocalDeployment.html>`_

* `[Chinese] How to connect Chatbox to open-source LLMs for free chat <https://twitter.com/benn_huang/status/1701420060240490785>`_

* `[Chinese] Break free from the OpenAI dependency: build a full-stack AI application with the open-source ecosystem in 8 minutes <https://mp.weixin.qq.com/s/cXBC0dikldNiGwOwPuJfUQ>`_

* `[Chinese] Building an LLM application in practice with a full open-source toolchain <https://mp.weixin.qq.com/s/regqYkF0cNDQIdOkOeyeXQ>`_


Third-Party Library Integrations
================================

Xinference is designed to seamlessly integrate and deploy open-source AI models, so it aims to support mainstream toolkits
in the AI landscape. Xinference can be used with the following third-party libraries (a usage sketch follows the list):

* LangChain `Text Embedding Models <https://python.langchain.com/docs/integrations/text_embedding/xinference>`_ and `LLMs <https://python.langchain.com/docs/integrations/llms/xinference>`_

* `LlamaIndex Xinference LLM <https://docs.llamaindex.ai/en/stable/api_reference/llms/xinference.html>`_
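
As an illustration of the LangChain LLM integration, a minimal sketch; ``<model-uid>`` is a hypothetical placeholder for the UID returned when a model is launched:

.. code-block:: python

   # Minimal sketch of the LangChain integration; "<model-uid>" is a
   # hypothetical placeholder for the UID returned by launch_model.
   from langchain.llms import Xinference

   llm = Xinference(
       server_url="http://localhost:9997",
       model_uid="<model-uid>",
   )
   print(llm("What is the capital of France?"))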
32 changes: 32 additions & 0 deletions doc/source/examples/pdf_chatbot.rst
@@ -0,0 +1,32 @@
.. _examples_pdf_chatbot:

=======================
Example: PDF Chatbot 📚
=======================

**Description**:

This example showcases how to build a PDF chatbot with local LLM and embedding models.

**Used Technology**:

@ `Xinference <https://github.com/xorbitsai/inference>`_ as an LLM hosting service

@ `LlamaIndex <https://github.com/run-llama/llama_index>`_ for orchestrating the entire RAG pipeline

@ `Streamlit <https://streamlit.io/>`_ for interactive UI

**Detailed Explanation of the Demo Functionality** (see the sketch after this list):

* Craft a Dockerfile to simplify setup and ensure easy reproducibility.

* Set up models with Xinference and expose two ports for accessing them.

* Leverage Streamlit for seamless file uploads and interactive communication with the chat engine.

* Document embedding runs 5x faster than OpenAI's API.

* Leverage GGML to offload models to the GPU for fast inference and shorter response times.
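
A minimal sketch of the Streamlit flow described above; the import paths, endpoint and model UID are assumptions based on 2023-era LlamaIndex and Streamlit APIs, and the real example also serves the embedding model from Xinference:

.. code-block:: python

   # Minimal sketch, not the example's exact code: endpoint, model UID and
   # import paths are assumptions; a local embedder stands in for the
   # Xinference-served embedding model used by the real example.
   import streamlit as st
   from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex
   from llama_index.llms import Xinference

   llm = Xinference(endpoint="http://localhost:9997", model_uid="<llm-uid>")
   service_context = ServiceContext.from_defaults(llm=llm, embed_model="local")

   uploaded = st.file_uploader("Upload a PDF", type="pdf")
   if uploaded is not None:
       # Persist the upload so the reader can index it.
       with open("uploaded.pdf", "wb") as f:
           f.write(uploaded.getbuffer())
       docs = SimpleDirectoryReader(input_files=["uploaded.pdf"]).load_data()
       index = VectorStoreIndex.from_documents(docs, service_context=service_context)
       chat_engine = index.as_chat_engine()
       if question := st.chat_input("Ask something about the PDF"):
           st.write(chat_engine.chat(question).response)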

**Source Code**:
* `PDF Chatbot <https://github.com/onesuper/PDF-Chatbot-Local-LLM-Embeddings>`_