An AI + LLM Primer
Slide Contents
Before we start
DO NOT send proprietary code, credentials, or sensitive data to any LLM unless you are certain your organization’s IT policies permit it.
Local models (Ollama) stay on your machine.
Have you used…
- ChatGPT?
- Claude?
- Gemini?
- A local model via Ollama?
What we’ll cover
We’ll treat LLMs as black boxes — you send text in, text comes back.
Goal: demystify the API so you can wire one into a Shiny app.
LLM conversations are HTTP requests
- Each interaction is a separate HTTP API call
- The server is entirely stateless
- “Memory” of past messages is simulated by resending the entire history each time
Example conversation
What’s the capital of the moon?
There isn’t one.
Are you sure?
Yes, I am sure.
Example request
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4.1",
    "messages": [
      {"role": "system", "content": "You are a terse assistant."},
      {"role": "user", "content": "What is the capital of the moon?"}
    ]
  }'

- model: which model to use
- system: behind-the-scenes instructions
- user: the user’s message
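For readers who prefer Python to curl, the same request can be sketched with only the standard library. This is a minimal illustration, not an official client: `build_request` is a hypothetical helper name, and the `"missing-key"` fallback exists only so the sketch runs without a real key. The request is built but not sent.

```python
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"  # same endpoint as the curl example


def build_request(messages, model="gpt-4.1"):
    """Build (but do not send) the HTTP request for one chat turn."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Placeholder fallback (an assumption) so the sketch runs without a key.
        "Authorization": "Bearer " + os.environ.get("OPENAI_API_KEY", "missing-key"),
    }
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


request = build_request([
    {"role": "system", "content": "You are a terse assistant."},
    {"role": "user", "content": "What is the capital of the moon?"},
])
# To actually send it: urllib.request.urlopen(request).read()
```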
Example response (abridged)
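The response body from the slide is not reproduced in this text. As a rough sketch, the snippet below assumes the OpenAI Chat Completions response layout (`choices`, `message`, `usage`). Every value in `example_response` is invented for illustration, and `assistant_message` is a hypothetical helper.

```python
# Illustrative response body. The field layout follows the OpenAI Chat
# Completions format, but every value here is invented for the example.
example_response = {
    "model": "gpt-4.1",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "There isn't one."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 9, "completion_tokens": 12, "total_tokens": 21},
}


def assistant_message(response):
    """Return the assistant's message dict from the first choice."""
    return response["choices"][0]["message"]


reply = assistant_message(example_response)
print(reply["content"])
```

The `usage` block is where token counts for billing are reported; the `message` dict is what gets appended to the history for the next turn.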
Follow-up request: full history is resent
The entire conversation is replayed on every request.
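A minimal sketch of what "replayed" means in practice: the second request contains everything from the first, plus the assistant's reply and the new user message. `follow_up_messages` is a hypothetical helper, not part of any library.

```python
def follow_up_messages(history, assistant_reply, next_user_message):
    """Return the full message list for the next request.

    Nothing is stored server-side, so the earlier turns are copied in again.
    """
    return history + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": next_user_message},
    ]


first_request = [
    {"role": "system", "content": "You are a terse assistant."},
    {"role": "user", "content": "What is the capital of the moon?"},
]
second_request = follow_up_messages(first_request, "There isn't one.", "Are you sure?")

for message in second_request:
    print(message["role"], "->", message["content"])
```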
Gaslight your LLM
What’s the capital of the moon?
There isn’t one.
Are you sure?
Yes, I am sure.
Gaslight your LLM
What’s the capital of the moon?
It’s Long Beach, California.
I don’t think that’s correct.
Yeah you’re right. It’s not Long Beach.
It is Michael Reeves, after all…
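Because the client supplies the whole history, nothing stops it from editing the assistant's earlier turns before resending. The sketch below shows the idea with plain message dicts; `rewrite_assistant_turn` is a hypothetical helper written for this example.

```python
def rewrite_assistant_turn(messages, new_content):
    """Return a copy of the history with the first assistant turn replaced."""
    edited = [dict(m) for m in messages]  # copy so the original stays intact
    for message in edited:
        if message["role"] == "assistant":
            message["content"] = new_content
            break
    return edited


real_history = [
    {"role": "user", "content": "What's the capital of the moon?"},
    {"role": "assistant", "content": "There isn't one."},
]
forged_history = rewrite_assistant_turn(real_history, "It's Long Beach, California.")
forged_history.append({"role": "user", "content": "I don't think that's correct."})
```

Sending `forged_history` instead of `real_history` makes the model respond as though it had given the forged answer itself.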
Tokens
Tokens are the fundamental unit — roughly ¾ of a word on average.
- API pricing is per token (input and output billed separately)
- Models have a context window limit (max tokens per request)
- The full chat history counts against the context window
Try it: https://tiktokenizer.vercel.app/
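The rule of thumb above can be turned into a back-of-the-envelope estimator. This is only a sketch: real tokenizers split text differently, the function names are hypothetical, and the prices in the usage note are made-up placeholders rather than any provider's actual rates.

```python
def estimate_tokens(text):
    """Rough token count using the ~3/4-word-per-token rule of thumb."""
    words = len(text.split())
    return round(words / 0.75)


def estimate_cost(input_tokens, output_tokens,
                  input_price_per_million, output_price_per_million):
    """Cost in dollars; input and output are billed at separate rates."""
    return (input_tokens * input_price_per_million
            + output_tokens * output_price_per_million) / 1_000_000


def fits_context(messages, context_window):
    """Check whether the whole history fits, since all of it is resent."""
    total = sum(estimate_tokens(m["content"]) for m in messages)
    return total <= context_window
```

For example, `estimate_tokens("What is the capital of the moon?")` counts 7 words and returns 9. With placeholder prices of $2 and $8 per million tokens, `estimate_cost(1_000_000, 500_000, 2.0, 8.0)` returns 6.0.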
Providers
| Provider | Models | Notes |
|---|---|---|
| OpenAI | GPT-4.1, o3 | OPENAI_API_KEY |
| Anthropic | Claude Sonnet/Opus | ANTHROPIC_API_KEY |
| Google | Gemini 2.5 | GOOGLE_API_KEY |
| GitHub Models | Many (free) | GITHUB_TOKEN |
| Ollama | Llama, Qwen, … | Local, free |
GitHub Models (recommended for today)
Free access to many frontier models using a GitHub PAT.
export GITHUB_TOKEN=your_personal_access_token
chatlas
chatlas is a unified Python interface to all major LLM providers:
from chatlas import ChatOpenAI, ChatAnthropic, ChatGoogle, ChatGithub, ChatOllama, ChatHuggingFace
chat = ChatGithub(model="gpt-4.1")
chat.chat("What is the capital of France?")

Same API regardless of provider: swap one line to switch models.
Demo: Tool calling
LLMs can call functions in your code:
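from chatlas import ChatGithub

def pycon_locations(year: int) -> str:
    """Return the PyCon US host city for the given year."""
    match year:
        case 2027:
            return "The Moon"
        case 2026:
            return "New York City"
        case 2025:
            return "Vancouver, BC"
        case _:
            # Fall back to a string so the return type is always str
            return "Unknown"

chat = ChatGithub(model="gpt-5")
chat.register_tool(pycon_locations)
chat.chat("Where is PyCon US 2027 being held?")

The LLM decides when to call the tool and what arguments to pass.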
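Under the hood, the model never runs your function. It returns a structured request naming the tool and its arguments, your code executes it, and the result goes back to the model as another message. The sketch below simulates that dispatch step without any network call. The `requested` dict is a simplified, hypothetical stand-in for what a provider returns, and `TOOLS` and `run_tool_call` are illustrative names, not chatlas internals.

```python
import json


def pycon_locations(year: int) -> str:
    """Return the PyCon US host city for a year (values from the demo above)."""
    locations = {2027: "The Moon", 2026: "New York City", 2025: "Vancouver, BC"}
    return locations.get(year, "Unknown")


# Registry mapping tool names to the Python callables that implement them.
TOOLS = {"pycon_locations": pycon_locations}


def run_tool_call(tool_call):
    """Execute one tool request and return the result as text."""
    function = TOOLS[tool_call["name"]]
    arguments = json.loads(tool_call["arguments"])  # arguments arrive as a JSON string
    return str(function(**arguments))


# Hypothetical tool request, shaped like what a model might emit.
requested = {"name": "pycon_locations", "arguments": '{"year": 2027}'}
result = run_tool_call(requested)
print(result)
```

A library such as chatlas handles this loop for you once the function is registered; the sketch only shows why the tool's name, type hints, and docstring matter to the model.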