Build, customize, and experiment with LangChain-powered agents running local LLMs through Ollama!
LangChain is a powerful Python framework for building agentic systems with LLMs. It abstracts the common building blocks, making it easier to construct agents, chains, memory-aware conversations, and more. These abstractions extend across the LangChain ecosystem, including:
- LangGraph: Graph-based reasoning framework.
- LangSmith: LLM observability and debugging tool.
- LangFlow: Visual editor for LangChain apps.
- LangServe: Server-ready LangChain deployments.
In this project, we'll walk through setting up a LangChain project using the Ollama Python library to run LLMs locally and customize agent behavior.
We'll use uv, a fast Python package and environment manager written in Rust.
- Initialize the project:

  `uv init langchain-agents`

- Create a virtual environment:

  `uv venv --python 3.12.7`

  👉 To see all Python versions on your system: `uv python list`

- Activate the virtual environment:

  `source .venv/bin/activate`

- Install dependencies:

  ```bash
  uv add \
    langchain-core langchain-ollama langsmith docarray \
    langchain-community ipykernel langchain \
    fastapi uvicorn google-search-results ollama
  ```

- Install Ollama:
  - Follow the official installation guide.
  - If using WSL:

    `curl -fsSL https://ollama.com/install.sh | sh`

- Pull a model (e.g., Qwen3 0.6B):

  `ollama pull qwen3:0.6b`

- Test the setup:

  `uv run ollama_basics.py`

  🧰 You can find more advanced model options and configuration settings in the Ollama REST API docs.
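The contents of `ollama_basics.py` aren't shown here; a minimal sketch using the `ollama` Python library might look like this (the prompt is illustrative):

```python
# ollama_basics.py -- a minimal sketch; the actual file contents may differ
import ollama

response = ollama.chat(
    model="qwen3:0.6b",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(response["message"]["content"])
```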
✅ Interpreter selection:
- `Ctrl + Shift + P` → "Python: Select Interpreter"
- Enter the path `.venv/bin/python`, or use `python3` or `python3.12`

🔄 Reload Window:
- `Ctrl + Shift + P` → "Reload Window"
We're using non-instruction-tuned models like DeepSeek-R1 and Qwen3, which can lead to unpredictable or incoherent outputs.
👉 It’s highly recommended to use instruction-tuned models like Llama3 for more reliable and contextual responses.
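For example, after pulling an instruction-tuned model (`ollama pull llama3`), switching is just a matter of changing the model name. A minimal sketch with `langchain-ollama`:

```python
from langchain_ollama import ChatOllama

# Assumes `ollama pull llama3` has been run first.
llm = ChatOllama(model="llama3", temperature=0)
print(llm.invoke("Summarize LangChain in one sentence.").content)
```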
langchain-core is the foundation layer of the ecosystem, providing abstractions for:
- Prompts
- Memory
- Chains
- Tools
- Agents
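As a rough orientation, each abstraction maps to an importable module in `langchain-core` (a non-exhaustive sketch; treat the exact picks as illustrative):

```python
from langchain_core.prompts import ChatPromptTemplate              # Prompts
from langchain_core.chat_history import InMemoryChatMessageHistory # Memory
from langchain_core.runnables import RunnableSequence              # Chains
from langchain_core.tools import tool                              # Tools
from langchain_core.agents import AgentAction                      # Agents
```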
LangChain recognizes three prompt types:
- System Prompt 🧠: Provides the role and behavior for the LLM (e.g., “You are a helpful assistant…”).
- User Prompt 👤: Carries the task or question from the user.
- AI Prompt 🤖: The LLM’s generated response to the user.
LangChain offers templates to compose and manage these prompt types effectively.
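For instance, `ChatPromptTemplate` lets you declare all three message types in one template (the conversation content and placeholder name are illustrative):

```python
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),   # system prompt
    ("human", "Hi!"),                             # user prompt
    ("ai", "Hello! How can I help you today?"),   # AI prompt (prior model turn)
    ("human", "{question}"),                      # user prompt with a placeholder
])
print(prompt.invoke({"question": "What is LangChain?"}))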
LangSmith is a debugging, testing, and observability tool for LLM applications.
- Monitors how your agents behave
- Tracks prompts, outputs, and traces
- Requires an API key (free tier available)
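Setup is typically a matter of environment variables (the key and project name below are placeholders):

```python
import os

os.environ["LANGCHAIN_TRACING_V2"] = "true"          # enable tracing
os.environ["LANGCHAIN_API_KEY"] = "<your-api-key>"   # from the LangSmith dashboard
os.environ["LANGCHAIN_PROJECT"] = "langchain-agents" # optional: group runs by project
# Any chain or agent run after this point is traced automatically.
```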
An effective prompt contains:
- Rules: Behavior definition for the model.
- Context: External information supplied to the model (the mechanism behind Retrieval-Augmented Generation).
- Question: What the user wants.
- Answer: What the LLM returns.
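A sketch of how those four parts might be laid out in a single template (the wording is illustrative):

```python
from langchain_core.prompts import PromptTemplate

template = """Rules: You are a precise assistant. Answer only from the given context.

Context: {context}

Question: {question}

Answer:"""

prompt = PromptTemplate.from_template(template)
print(prompt.format(context="LangChain is a Python framework for LLM apps.",
                    question="What is LangChain?"))
```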
LLMs don’t retain previous inputs by default. LangChain provides memory wrappers to retain conversation history.
| Type | Description |
|---|---|
| `ConversationBufferMemory` | Stores all past messages in memory. |
| `ConversationBufferWindowMemory` | Retains only the last N messages. |
| `ConversationSummaryMemory` | Summarizes the entire conversation history. |
| `ConversationSummaryBufferMemory` | Mix of buffer and summary, using token limits. |
Note: in recent LangChain releases, these memory classes are deprecated in favor of `RunnableWithMessageHistory`.
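A minimal sketch of the newer pattern (the model choice and session bookkeeping are assumptions):

```python
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.messages import HumanMessage
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_ollama import ChatOllama

store = {}  # session_id -> chat history

def get_history(session_id: str) -> InMemoryChatMessageHistory:
    if session_id not in store:
        store[session_id] = InMemoryChatMessageHistory()
    return store[session_id]

chat = RunnableWithMessageHistory(ChatOllama(model="qwen3:0.6b"), get_history)
config = {"configurable": {"session_id": "demo"}}

chat.invoke([HumanMessage("Hi, I'm Ada.")], config=config)
reply = chat.invoke([HumanMessage("What's my name?")], config=config)
print(reply.content)  # the stored history lets the model recall "Ada"
```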
Agents enhance LLMs by allowing tool usage (e.g., search, calculators, databases).
🔧 Key points when defining tools:
- Use docstrings to explain when and why the tool should be used.
- Provide self-explanatory argument names.
- Add clear type annotations for both input and output.
Example:

```python
def get_weather(city: str) -> str:
    """Gets the current weather for the provided city."""
    ...
```
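To make such a function callable by a model, one approach (a sketch; the weather lookup is a stub and the model is assumed to support tool calling) is the `@tool` decorator plus `bind_tools`:

```python
from langchain_core.tools import tool
from langchain_ollama import ChatOllama

@tool
def get_weather(city: str) -> str:
    """Gets the current weather for the provided city."""
    return f"It is sunny in {city}."  # placeholder: a real tool would call a weather API

llm = ChatOllama(model="qwen3:0.6b").bind_tools([get_weather])
response = llm.invoke("What's the weather in Paris?")
print(response.tool_calls)  # the model's requested tool invocations, if any
```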
LCEL allows composing LLM chains declaratively and functionally, similar to RxJS or functional streams.

✅ Ideal for:
- Chaining prompts
- Handling memory
- Creating reusable LLM components
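A minimal LCEL pipeline, chaining a prompt into the local model and a string parser (the model name is assumed from the setup above):

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama import ChatOllama

# The `|` operator pipes each component's output into the next.
chain = (
    ChatPromptTemplate.from_template("Explain {topic} in one sentence.")
    | ChatOllama(model="qwen3:0.6b")
    | StrOutputParser()
)
print(chain.invoke({"topic": "LangChain Expression Language"}))
```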