Context Engineering
Wiki

37 articles from arXiv, OpenAI, Anthropic, Google AI, and built-in terms. Auto-fetched and searchable.

OpenAI Cookbookprompt engineering

How to work with large language models

[Large language models][Large language models Blog Post] are functions that map text to text. Given an input string of text, a large language model predicts the text that should come next.

tokensprompts
OpenAI Cookbookprompt engineering

Techniques to improve reliability

When GPT-3 fails on a task, what should you do?

tokensprompts
OpenAI Cookbookprompt engineering

Related resources from around the web

People are writing great tools and papers for improving outputs from GPT. Here are some cool ones we've seen:

prompts
OpenAI Cookbooktoken optimization

How_to_count_tokens_with_tiktoken

{ "cells": { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": " How to count tokens with tiktoken\n", "\n", " tiktoken ...

tokenspromptsembeddings+2
OpenAI Cookbookprompt engineering

How_to_stream_completions

{ "cells": { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": " How to stream completions\n", "\n", "By default, when you request a completion...

tokenspromptsstreaming
Anthropiccaching

Prompt Caching

Claude API Documentation

cachingoptimizationtokens
Anthropicprompt engineering

Prompt Engineering Overview

Claude API Documentation

promptsengineeringbest-practices
Anthropicprompt engineering

Chain of Thought Prompting

Comprehensive guide to prompt engineering techniques for Claude's latest models, covering clarity, examples, XML structuring, thinking, and agentic systems.

promptsreasoningchain-of-thought
Anthropiccontext management

Context Windows

Claude API Documentation

contextwindowstokens
Anthropiccontext management

Long Context Window Tips

Comprehensive guide to prompt engineering techniques for Claude's latest models, covering clarity, examples, XML structuring, thinking, and agentic systems.

contextlong-contextoptimization
Anthropictoken optimization

Token Counting

Claude API Documentation

tokenscountingusage
Anthropicprompt engineering

Use XML Tags in Prompts

Comprehensive guide to prompt engineering techniques for Claude's latest models, covering clarity, examples, XML structuring, thinking, and agentic systems.

promptsxmlstructure
Anthropicprompt engineering

Extended Thinking

Claude API Documentation

reasoningthinkingchain-of-thought
Google AIcaching

Context Caching

Saiba como usar o armazenamento em cache de contexto na API Gemini

cachingcontextoptimization
Google AIcontext management

Long Context

Learn about how to get started building with long context (1 million context window) on Gemini.

contextlong-contexttokens
Google AItoken optimization

Tokens

/ Styles inlined from /site assets/css/style.css / body theme="googledevai theme" { devsite background 0: var devsite background 1 ; devsite button border: 1px solid 747775; devsite...

tokenscountingusage
Google AIprompt engineering

Prompting Strategies

/ Styles inlined from /site assets/css/style.css / body theme="googledevai theme" { devsite background 0: var devsite background 1 ; devsite button border: 1px solid 747775; devsite...

promptsstrategiesengineering
Google AIprompt engineering

System Instructions

Gemini API ile sohbet ve metin oluşturma uygulamaları geliştirmeye başlayın

system-promptsinstructionsengineering
Google AItool use

Code Execution

Learn how to use the Gemini API code execution feature.

codeexecutiontools
Built-incontext management

Progressive Disclosure

Instead of loading an entire codebase—which would immediately overwhelm the attention budget—modern agents use JIT context. The assistant dynamically loads only the necessary data at runtime.

contextjitoptimization
Built-incontext management

Lightweight Identifiers

The assistant maintains references (file paths, stored queries) and dynamically loads only the necessary data at runtime using tools like grep, head, or tail.

contextreferencesefficiency
Built-incontext management

Compaction

When a session nears its token limit, the assistant summarizes critical details—such as architectural decisions and unresolved bugs—while discarding redundant tool outputs.

contextcompressionlong-horizon
Built-incontext management

Tool Result Clearing

A light touch form of compaction where the raw results of previous tool calls (like long terminal outputs) are cleared to save space.

contexttoolsoptimization
Built-incontext management

Structured Note-taking

The agent may maintain an external NOTES.md or a to-do list to track dependencies and progress across thousands of steps, which it can read back into its context after a reset.

contextpersistencenotes
Built-incontext management

Distractors

Files or code snippets that are topically related to the query but do not contain the answer can cause the model to lose focus or hallucinate.

contextpollutionrelevance
Built-incontext management

Context Rot

As more tokens are added, the model's ability to accurately retrieve needles of information from the haystack of the codebase decreases.

contextdegradationtokens
Built-inprompt engineering

XML Tagging

Use tags like <background_information>, <tool_guidance>, <constraints> to clearly separate different types of instructions in system prompts.

promptsxmlstructure
Built-intoken optimization

High-Signal Tokens

The objective is to provide the smallest possible set of high-signal tokens that maximize the likelihood of the correct code generation.

tokensoptimizationquality
Built-incontext management

Structural Patterns

Research suggests that models often perform better on shuffled or unstructured context than on logically structured haystacks, impacting how they process long files.

contextstructureresearch
Built-inarchitecture

Agent Skills

Reusable packages of domain expertise defined in SKILL.md files that provide specialized AI agent capabilities. Introduced as GA in VS Code 1.109, skills can be invoked as slash commands or loaded...

skillsagentsvscode+1
Built-inarchitecture

Agent Hooks

Deterministic shell commands that execute at key lifecycle points during agent sessions. Unlike instructions, hooks run code with guaranteed outcomes for security policies, quality checks, or audit...

hooksagentslifecycle+1
Built-inarchitecture

Agent Orchestration

A multi-agent pattern where specialized subagents collaborate on complex tasks, each operating in its own dedicated context window. Provides context efficiency, specialization with different models,...

orchestrationmulti-agentsubagent+1
Built-incontext management

Message Steering

An agent interaction pattern where follow-up messages redirect a running agent request. The agent yields after the active tool execution and processes the new message. Alternatives include request...

agentssteeringqueueing+1
Built-inarchitecture

Terminal Sandboxing

A security mechanism restricting file system and network access for agent-executed terminal commands. Sandboxed commands have read/write access only to the workspace directory, and network access can...

securitysandboxterminal+1
Built-intoken economics

Thinking Tokens

Tokens generated during a model's internal reasoning process before producing a visible response. Thinking tokens consume context budget but improve quality on complex tasks. Anthropic models support...

thinkingreasoningtokens+1
Built-inarchitecture

MCP Server (Model Context Protocol)

A local stdio process that exposes tools to Claude Code and other MCP-capable agents. Tokalator's MCP server (tokalator-mcp) provides four tools: count_tokens, estimate_budget, preview_turn, and...

mcpclaude-codetools+2
Built-intoken optimization

CLI Token Counter

A standalone command-line tool for counting tokens and checking context budgets outside of VS Code. Tokalator ships a CLI binary (tokalator count, budget, preview, models) for SSH sessions, CI...

cliterminaltokens+2