🧠

AI Agent Foundations

Zero to production · No prior knowledge required

Key takeaway: What data your agent accesses, and when it accesses it, is the difference between a chatbot and a grounded, accurate assistant.

Agent Knowledge

Everything an AI agent knows or can look up · 8 min read

Agent knowledge is all the information an AI agent can access to answer questions and make decisions. Think of it as the agent's "brain" combined with its "library". Some knowledge is baked in during training: the model learned it from its training data, largely text from the internet. Other knowledge is retrieved on demand from databases, documents, or live APIs. The art of building great agents is knowing which kind of knowledge to use when.

💡 ANALOGY — THE EXPERT CONSULTANT

Imagine hiring a brilliant consultant. They bring years of expertise in their head (training knowledge). But for your specific company's internal reports, they need to read your documents (retrieval knowledge). And for today's stock prices, they check live data (real-time knowledge). Your AI agent works in much the same way.

WHY IT MATTERS

Without the right knowledge, even a powerful LLM can hallucinate: it confidently makes things up. Giving an agent the correct knowledge at the right moment is what separates useful agents from expensive toy chatbots.

How it works

Parametric knowledge (built-in)

During training the model read billions of web pages. That knowledge is encoded into the model's weights, which are billions of numbers. It is always available with no lookup step, but it is frozen at the training cutoff date and cannot be updated without retraining or fine-tuning the model.

Retrieval knowledge (RAG)

You split your documents into chunks, embed each chunk, and store the resulting vectors in a vector database. When the user asks a question, you embed the question too, find the most similar document chunks, and inject them into the context window. The LLM reads them like notes before answering.
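
The query-time half of that flow can be sketched in a few lines. This is a toy, in-memory illustration, not a production setup: the three-dimensional embeddings are hand-written stand-ins for what an embedding model would return, and the names Chunk, topK, and buildPrompt are hypothetical.

```typescript
// Toy in-memory retrieval. In a real system the vectors would come from an
// embedding model and live in a vector database; here they are hand-written.
type Chunk = { text: string; source: string; embedding: number[] };

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks most similar to the query embedding.
function topK(queryEmbedding: number[], chunks: Chunk[], k: number): Chunk[] {
  return [...chunks]
    .sort(
      (x, y) =>
        cosineSimilarity(queryEmbedding, y.embedding) -
        cosineSimilarity(queryEmbedding, x.embedding),
    )
    .slice(0, k);
}

// Inject the retrieved chunks into the prompt ahead of the question.
function buildPrompt(question: string, retrieved: Chunk[]): string {
  const notes = retrieved
    .map((c, i) => `[${i + 1}] ${c.text} (source: ${c.source})`)
    .join("\n");
  return `Answer using only the notes below.\n\nNotes:\n${notes}\n\nQuestion: ${question}`;
}

// Hypothetical knowledge base with made-up embeddings, for illustration only.
const knowledgeBase: Chunk[] = [
  { text: "Refunds are issued within 14 days.", source: "docs/refunds", embedding: [0.9, 0.1, 0.0] },
  { text: "The API rate limit is 100 requests per minute.", source: "docs/api", embedding: [0.0, 0.2, 0.9] },
  { text: "Refund requests need an order number.", source: "docs/refunds", embedding: [0.8, 0.3, 0.1] },
];

// Stands in for the embedding of the user's question.
const questionEmbedding = [1.0, 0.2, 0.0];
const retrieved = topK(questionEmbedding, knowledgeBase, 2);
const finalPrompt = buildPrompt("How do refunds work?", retrieved);
```

Here the two refund chunks score highest, so only they reach the prompt; the rate-limit chunk stays in the "library" and costs no context space.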

Episodic knowledge (memory)

Previous turns in the conversation are stored and retrieved when needed. The agent remembers what you said earlier in the session (short-term memory) or in previous sessions (long-term memory, typically backed by a database). This makes it feel like it truly knows you.
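
As a rough sketch of the two kinds of memory, the snippet below keeps a sliding window of recent turns for the session and a per-user fact store for later sessions. The class and function names are hypothetical, and the Map stands in for a real database table.

```typescript
type Turn = { role: "user" | "assistant"; content: string };

// Short-term memory: keep only the most recent turns so the context stays small.
class ConversationMemory {
  private turns: Turn[] = [];
  private maxTurns: number;

  constructor(maxTurns: number) {
    this.maxTurns = maxTurns;
  }

  add(turn: Turn): void {
    this.turns.push(turn);
    // Drop the oldest turns once the window is full.
    if (this.turns.length > this.maxTurns) {
      this.turns = this.turns.slice(-this.maxTurns);
    }
  }

  recent(): Turn[] {
    return [...this.turns];
  }
}

// Long-term memory: facts keyed by user ID. A Map stands in for a database.
const longTermStore = new Map<string, string[]>();

function rememberFact(userId: string, fact: string): void {
  const facts = longTermStore.get(userId) ?? [];
  facts.push(fact);
  longTermStore.set(userId, facts);
}

function recallFacts(userId: string): string[] {
  return longTermStore.get(userId) ?? [];
}

// Usage: a window of three turns, plus one fact saved for future sessions.
const sessionMemory = new ConversationMemory(3);
sessionMemory.add({ role: "user", content: "Hi, I'm planning a trip." });
sessionMemory.add({ role: "assistant", content: "Great, where to?" });
sessionMemory.add({ role: "user", content: "Lisbon, and I prefer metric units." });
sessionMemory.add({ role: "assistant", content: "Noted. Lisbon is about 25 °C in June." });
rememberFact("user-1", "Prefers metric units");
```

The fixed window is the simplest possible policy; summarising older turns instead of dropping them is a common alternative when early details matter.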

Procedural knowledge (tools)

The agent knows HOW to do things via tools: calling an API, running a SQL query, executing code. This is active knowledge that changes the world rather than just describing it.
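
One common way to wire this up is a registry of named tools that the agent dispatches to when the model asks for one. The sketch below assumes the model emits a tool call as a name plus string arguments; the tool names and the ToolCall shape are hypothetical, not any specific provider's API.

```typescript
// A tool is a function plus a description the model can read.
type Tool = {
  description: string;
  run: (args: Record<string, string>) => string;
};

// Hypothetical tools; real ones would call an API or a database.
const tools: Record<string, Tool> = {
  get_current_date: {
    description: "Returns today's date in ISO format (YYYY-MM-DD).",
    run: () => new Date().toISOString().slice(0, 10),
  },
  add_numbers: {
    description: "Adds two numbers given as 'a' and 'b'.",
    run: (args) => String(Number(args["a"]) + Number(args["b"])),
  },
};

// Assumed shape of a tool call emitted by the model.
type ToolCall = { name: string; args: Record<string, string> };

// Look up the requested tool and run it, or report that it does not exist.
function executeToolCall(call: ToolCall): string {
  const tool = tools[call.name];
  if (!tool) {
    return `Error: unknown tool "${call.name}"`;
  }
  return tool.run(call.args);
}

const sum = executeToolCall({ name: "add_numbers", args: { a: "2", b: "3" } });
const today = executeToolCall({ name: "get_current_date", args: {} });
const missing = executeToolCall({ name: "send_email", args: {} });
```

Returning an error string for unknown tools, rather than throwing, lets the agent feed the failure back to the model so it can try something else.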

When to use

IF: Data changes after the model's training cutoff

→ RAG — retrieve from a live database

IF: Your data is private / proprietary

→ RAG — never put it in a model's training set

IF: General world knowledge (history, science, coding)

→ Parametric — the LLM already knows it

IF: Multi-session personalisation

→ Long-term memory stored in a database

IF: Need to know the current date / live prices

→ Tool call to an API
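
The rules above can be written down as a small routing function. This is only a sketch: the trait names are hypothetical, and the order in which the rules are checked (live data first, then user history, then RAG) is an assumption for cases where several rules apply at once.

```typescript
type KnowledgeSource = "parametric" | "rag" | "long_term_memory" | "tool_call";

type QuestionTraits = {
  needsLiveData: boolean; // current date, live prices
  needsUserHistory: boolean; // preferences from earlier sessions
  isPrivateData: boolean; // private or proprietary documents
  changesAfterCutoff: boolean; // facts newer than the training cutoff
};

// Map the traits of a question to the knowledge source the rules suggest.
function chooseSource(traits: QuestionTraits): KnowledgeSource {
  if (traits.needsLiveData) return "tool_call";
  if (traits.needsUserHistory) return "long_term_memory";
  if (traits.isPrivateData || traits.changesAfterCutoff) return "rag";
  return "parametric"; // general world knowledge
}

const generalQuestion: QuestionTraits = {
  needsLiveData: false,
  needsUserHistory: false,
  isPrivateData: false,
  changesAfterCutoff: false,
};

const historyAnswer = chooseSource(generalQuestion);
const priceAnswer = chooseSource({ ...generalQuestion, needsLiveData: true });
const policyAnswer = chooseSource({ ...generalQuestion, isPrivateData: true });
```

In practice an agent often combines several sources in one answer, so treat the single return value as a simplification.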

Where it's used

Customer support bot

RAG over your product docs so it knows your specific features

Legal assistant

RAG over contract PDFs + parametric knowledge of law

Personal assistant

Long-term memory of your preferences + RAG over your notes

Code assistant

Parametric (language syntax) + RAG over your codebase

How it differs from related concepts

vs Skills

Knowledge is what the agent KNOWS. Skills are what the agent CAN DO. Knowing how to drive is knowledge. Actually pressing the gas pedal is a skill.

vs Context

Knowledge is the full information available. Context is the subset you choose to put in the LLM's window right now. Context is a curated slice of knowledge.

vs Prompt

Knowledge is the raw information. A prompt is the instruction that tells the agent how to USE that knowledge.
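
To make the knowledge-versus-context distinction concrete, here is a minimal sketch of curating context from a larger pool of knowledge under a token budget. The function names are hypothetical, and the four-characters-per-token estimate is a rough assumption, not a real tokenizer.

```typescript
type ScoredChunk = { text: string; score: number };

// Very rough token estimate (about 4 characters per token); an assumption.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Pick the highest-scoring chunks that still fit inside the token budget.
function selectContext(chunks: ScoredChunk[], tokenBudget: number): ScoredChunk[] {
  const selected: ScoredChunk[] = [];
  let used = 0;
  for (const chunk of [...chunks].sort((a, b) => b.score - a.score)) {
    const cost = estimateTokens(chunk.text);
    if (used + cost > tokenBudget) continue; // skip chunks that do not fit
    selected.push(chunk);
    used += cost;
  }
  return selected;
}

// Knowledge: everything available. Context: what fits in a 16-token budget.
const allKnowledge: ScoredChunk[] = [
  { text: "a".repeat(40), score: 0.9 }, // about 10 tokens
  { text: "b".repeat(80), score: 0.8 }, // about 20 tokens
  { text: "c".repeat(20), score: 0.5 }, // about 5 tokens
];
const context = selectContext(allKnowledge, 16);
```

With this budget the first and third chunks are kept and the second is skipped because it would overflow the window, even though it scores higher than the third.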

Common mistakes to avoid

Relying on parametric knowledge for current events — the model's training data has a cutoff date.
Storing ALL documents in the context window — this is expensive and drowns out the relevant parts. Use retrieval instead.
Skipping metadata — chunk your documents WITH their source URL and title so the agent can cite sources.
Ignoring embedding model mismatch — always use the same embedding model for ingestion AND retrieval.
Forgetting conversation history — agents that can't remember what the user just said feel frustrating to use.
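
Two of these mistakes are easy to guard against in code. The sketch below stores the source URL, title, and embedding model name alongside each chunk, then checks for a model mismatch before retrieval and formats a citation afterwards. Vectors from different embedding models are generally not comparable, which is why the check matters. The field names and the model names "embed-v1" and "embed-v2" are hypothetical.

```typescript
type StoredChunk = {
  text: string;
  sourceUrl: string;
  title: string;
  embeddingModel: string; // model used at ingestion time
};

// Throw if any stored chunk was embedded with a different model than the query.
function assertSameModel(chunks: StoredChunk[], queryModel: string): void {
  const mismatched = chunks.filter((c) => c.embeddingModel !== queryModel);
  if (mismatched.length > 0) {
    throw new Error(
      `${mismatched.length} chunk(s) were embedded with a different model than "${queryModel}"`,
    );
  }
}

// Build a citation the agent can attach to its answer.
function formatCitation(chunk: StoredChunk): string {
  return `${chunk.title} (${chunk.sourceUrl})`;
}

const storedChunks: StoredChunk[] = [
  {
    text: "Refunds are issued within 14 days.",
    sourceUrl: "https://example.com/docs/refunds",
    title: "Refund policy",
    embeddingModel: "embed-v1",
  },
];

assertSameModel(storedChunks, "embed-v1"); // passes: same model on both sides
const citation = formatCitation(storedChunks[0]);
```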

Copy-ready examples

💬 Run this once when a document is added to the knowledge base.

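// Assumes chunkText, embedBatch, toVector, and the sql tagged-template client
// are provided by your own codebase; they are not defined here.

// 1. Chunk your document (800 and 120 are presumably chunk size and overlap)
const chunks = chunkText(documentContent, 800, 120);

// 2. Embed each chunk
const embeddings = await embedBatch(chunks.map(c => c.text));

// 3. Store each chunk with its embedding and source in pgvector
for (let i = 0; i < chunks.length; i++) {
  await sql`
    INSERT INTO chunks (content, embedding, source)
    VALUES (${chunks[i].text}, ${toVector(embeddings[i])}, ${sourceUrl})
  `;
}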
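
The snippet above calls chunkText without defining it. One plausible reading, and it is only an assumption, is that the two numbers are a chunk size and an overlap measured in characters. Under that assumption, a minimal sliding-window chunker could look like this:

```typescript
type TextChunk = { text: string; start: number };

// Split content into fixed-size chunks where each chunk repeats the last
// `overlap` characters of the previous one. Sizes are in characters.
function chunkText(content: string, chunkSize: number, overlap: number): TextChunk[] {
  if (overlap >= chunkSize) {
    throw new Error("overlap must be smaller than chunkSize");
  }
  const chunks: TextChunk[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < content.length; start += step) {
    chunks.push({ text: content.slice(start, start + chunkSize), start });
    if (start + chunkSize >= content.length) break; // this chunk reached the end
  }
  return chunks;
}

// Usage: a 2000-character document, 800-character chunks, 120-character overlap.
const sampleDocument = Array.from({ length: 2000 }, (_, i) =>
  String.fromCharCode(97 + (i % 26)),
).join("");
const sampleChunks = chunkText(sampleDocument, 800, 120);
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one chunk. Splitting on paragraph or sentence boundaries instead of raw character counts usually gives cleaner chunks, at the cost of a little more code.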