An AI + LLM Primer


Before we start

Warning

DO NOT send proprietary code, credentials, or sensitive data to any LLM unless you are certain your organization’s IT policies permit it.

Local models (Ollama) stay on your machine.

Have you used…

ChatGPT?
Claude?
Gemini?
A local model via Ollama?

What we’ll cover

We’ll treat LLMs as black boxes — you send text in, text comes back.

Goal: demystify the API so you can wire one into a Shiny app.

LLM conversations are HTTP requests

Each interaction is a separate HTTP API call
The server keeps no memory of the conversation between calls
“Memory” of past messages is simulated by resending the entire history each time

Example conversation

What’s the capital of the moon?

There isn’t one.

Are you sure?

Yes, I am sure.

Example request

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4.1",
    "messages": [
      {"role": "system", "content": "You are a terse assistant."},
      {"role": "user",   "content": "What is the capital of the moon?"}
    ]
  }'

model: which model to use
messages: the conversation so far, as a list of role/content pairs
system: the role for behind-the-scenes instructions
user: the role for the user’s message
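
For readers who will call the API from Python rather than the shell, here is a minimal sketch of the same request built with the standard library. The helper name `build_request` is an assumption made for illustration; the URL, headers, and payload mirror the curl example. The network call is left commented out so the sketch runs offline.

```python
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"


def build_request(messages, model="gpt-4.1"):
    """Build (but do not send) the POST request from the curl example."""
    # Falls back to an empty key so the sketch still runs without credentials.
    api_key = os.environ.get("OPENAI_API_KEY", "")
    payload = {"model": model, "messages": messages}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_request([
    {"role": "system", "content": "You are a terse assistant."},
    {"role": "user", "content": "What is the capital of the moon?"},
])

# Sending it is one more call (requires a valid key and network access):
# with urllib.request.urlopen(request) as response:
#     body = json.loads(response.read())
```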

Example response (abridged)

{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "The moon does not have a capital."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 22,
    "completion_tokens": 9,
    "total_tokens": 31
  }
}
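
Pulling the reply out of that JSON takes only a few lines. The sketch below parses the abridged response shown above; the helper name `extract_reply` is hypothetical, and a real response carries more fields than these.

```python
import json

# The abridged response from the slide, as the raw text a client would receive.
raw_response = """
{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "The moon does not have a capital."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 22,
    "completion_tokens": 9,
    "total_tokens": 31
  }
}
"""


def extract_reply(body: str) -> tuple[str, int]:
    """Return the assistant's text and the total token count."""
    data = json.loads(body)
    first_choice = data["choices"][0]
    return first_choice["message"]["content"], data["usage"]["total_tokens"]


reply, total_tokens = extract_reply(raw_response)
print(reply)
print(total_tokens)
```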

Follow-up request: full history is resent

  "messages": [
    {"role": "system",    "content": "You are a terse assistant."},
    {"role": "user",      "content": "What is the capital of the moon?"},
    {"role": "assistant", "content": "The moon does not have a capital."},
    {"role": "user",      "content": "Are you sure?"}
  ]

The entire conversation is replayed on every request.
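
A minimal sketch of how a client might keep that history, assuming a hypothetical `send` callable that stands in for the HTTP call. Here it is replaced by a fake that records what it was given, so the growth of the payload is visible without network access.

```python
def ask(history, user_text, send):
    """Append the user turn, send the whole history, then store the reply."""
    history.append({"role": "user", "content": user_text})
    reply = send(list(history))  # pass a copy: the full conversation so far
    history.append({"role": "assistant", "content": reply})
    return reply


sent_payloads = []


def fake_send(messages):
    """Stand-in for the real API call; records each payload it receives."""
    sent_payloads.append(messages)
    return f"canned reply {len(sent_payloads)}"


history = [{"role": "system", "content": "You are a terse assistant."}]
ask(history, "What is the capital of the moon?", fake_send)
ask(history, "Are you sure?", fake_send)

# The second request carries every earlier turn plus the new question.
print([len(payload) for payload in sent_payloads])
```

The first payload holds two messages (system and user) and the second holds four, which is why long conversations cost more per request.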

Gaslight your LLM

What’s the capital of the moon?

There isn’t one.

Are you sure?

Yes, I am sure.

Gaslight your LLM: rewrite the history

What’s the capital of the moon?

It’s Long Beach, California.

I don’t think that’s correct.

Yeah you’re right. It’s not Long Beach.
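
Because the client owns the history, nothing stops it from rewriting an assistant turn before resending. The sketch below shows the idea on plain message dictionaries; `forge_assistant_turn` is a hypothetical helper, not part of any library.

```python
def forge_assistant_turn(history, index, new_content):
    """Return a copy of the history with one assistant message replaced."""
    forged = [dict(message) for message in history]  # leave the original intact
    if forged[index]["role"] != "assistant":
        raise ValueError("only assistant turns are rewritten in this sketch")
    forged[index]["content"] = new_content
    return forged


history = [
    {"role": "user", "content": "What's the capital of the moon?"},
    {"role": "assistant", "content": "There isn't one."},
]

forged = forge_assistant_turn(history, 1, "It's Long Beach, California.")
forged.append({"role": "user", "content": "I don't think that's correct."})

# The model only ever sees `forged`, so it responds as if it had said this.
for message in forged:
    print(f'{message["role"]}: {message["content"]}')
```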

Language warning

It is Michael Reeves, after all…

Tokens

Tokens are the fundamental unit of text an LLM reads and writes; one token is roughly ¾ of an English word on average.

API pricing is per token (input and output billed separately)
Models have a context window limit (the maximum number of tokens, input plus output, in a single request)
The full chat history counts against the context window

Try it: https://tiktokenizer.vercel.app/
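
As a rough worked example of the arithmetic, the sketch below estimates a token count from the ¾-word rule and prices one request. The per-million-token prices are placeholders chosen for illustration, not any provider's actual rates, and real counts come from the model's tokenizer or the `usage` field of a response.

```python
def estimate_tokens(text: str) -> int:
    """Rule-of-thumb estimate: one token is about 3/4 of a word."""
    word_count = len(text.split())
    return round(word_count / 0.75)


def request_cost(prompt_tokens, completion_tokens,
                 input_price_per_million, output_price_per_million):
    """Cost of one request when input and output are billed separately."""
    return (prompt_tokens * input_price_per_million
            + completion_tokens * output_price_per_million) / 1_000_000


estimated = estimate_tokens("What is the capital of the moon?")

# Token counts from the example response; the prices are hypothetical.
cost = request_cost(prompt_tokens=22, completion_tokens=9,
                    input_price_per_million=2.00,
                    output_price_per_million=8.00)
print(estimated, cost)
```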

Providers

Provider	Models	API key variable / notes
OpenAI	GPT-4.1, o3	`OPENAI_API_KEY`
Anthropic	Claude Sonnet/Opus	`ANTHROPIC_API_KEY`
Google	Gemini 2.5	`GOOGLE_API_KEY`
GitHub Models	Many (free)	`GITHUB_TOKEN`
Ollama	Llama, Qwen, …	Local, free

GitHub Models (recommended for today)

Free access to many frontier models using a GitHub personal access token (PAT).

export GITHUB_TOKEN=your_personal_access_token

Docs: https://docs.github.com/en/github-models
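
A missing token usually surfaces later as a confusing authentication error, so it can help to check for it up front. This is a small sketch with a hypothetical helper name, `require_token`; it only reads the environment and makes no network calls.

```python
import os


def require_token(name: str = "GITHUB_TOKEN") -> str:
    """Return the token from the environment or fail with a clear message."""
    token = os.environ.get(name, "").strip()
    if not token:
        raise RuntimeError(f"{name} is not set; export it before starting Python")
    return token
```

Calling `require_token()` at the top of a script or Shiny app fails fast if the `export` step was skipped.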

chatlas

chatlas is a Python package that provides one interface to many LLM providers:

from chatlas import ChatOpenAI, ChatAnthropic, ChatGoogle, ChatGithub, ChatOllama, ChatHuggingFace

chat = ChatGithub(model="gpt-4.1")
chat.chat("What is the capital of France?")

Same API regardless of provider — swap one line to switch models.
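
One way to make the swap a configuration choice is a small factory keyed by provider name. This is a sketch, not part of chatlas: `PROVIDER_CLASSES` and `make_chat` are hypothetical names, the class names are taken from the import line above, and it assumes each constructor accepts a `model` keyword as in the `ChatGithub` example. chatlas is imported lazily, so the lookup logic can be read and tested without it installed.

```python
import importlib

# Provider name -> chatlas class name (from the import line in the slide).
PROVIDER_CLASSES = {
    "openai": "ChatOpenAI",
    "anthropic": "ChatAnthropic",
    "google": "ChatGoogle",
    "github": "ChatGithub",
    "ollama": "ChatOllama",
}


def make_chat(provider: str, model: str):
    """Create a chat object for the named provider."""
    if provider not in PROVIDER_CLASSES:
        raise ValueError(f"unknown provider: {provider!r}")
    chatlas = importlib.import_module("chatlas")  # imported only when needed
    chat_class = getattr(chatlas, PROVIDER_CLASSES[provider])
    return chat_class(model=model)


# Usage (requires chatlas and the matching API key):
# chat = make_chat("github", "gpt-4.1")
# chat.chat("What is the capital of France?")
```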

Demo: Tool calling

LLMs can call functions in your code:

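from chatlas import ChatGithub

def pycon_locations(year: int) -> str:
    """Return the host city of PyCon US for the given year."""
    # Demo values, not the real venues.
    match year:
        case 2027:
            return "The Moon"
        case 2026:
            return "New York City"
        case 2025:
            return "Vancouver, BC"
        case _:
            # Always return a string so the model gets a usable answer.
            return f"No location on record for {year}"

chat = ChatGithub(model="gpt-5")

# The function's type hints and docstring describe the tool to the model.
chat.register_tool(pycon_locations)

chat.chat("Where is PyCon US 2027 being held?")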

The LLM decides when to call the tool and what arguments to pass.
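
Under the hood, the model does not run anything itself: it replies with a structured request naming a function and its arguments, the client runs the function, and the result goes back as another message. The sketch below imitates that client-side step, assuming the OpenAI-style chat format in which arguments arrive as a JSON string and results return under a `tool` role. `TOOLS`, `run_tool_call`, and the call id are hypothetical, and chatlas handles all of this for you.

```python
import json


def pycon_locations(year: int) -> str:
    """Demo tool with made-up values, mirroring the slide."""
    locations = {2027: "The Moon", 2026: "New York City", 2025: "Vancouver, BC"}
    return locations.get(year, f"No location on record for {year}")


# Registry of functions the model is allowed to request.
TOOLS = {"pycon_locations": pycon_locations}


def run_tool_call(tool_call: dict) -> dict:
    """Execute one requested call and wrap the result as a tool message."""
    function = tool_call["function"]
    arguments = json.loads(function["arguments"])  # arguments are a JSON string
    result = TOOLS[function["name"]](**arguments)
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": str(result),
    }


# What the model's request might look like (the id is a placeholder).
example_call = {
    "id": "call_123",
    "type": "function",
    "function": {"name": "pycon_locations", "arguments": '{"year": 2027}'},
}

tool_message = run_tool_call(example_call)
print(tool_message["content"])
```

The tool message is appended to the history and the whole conversation is sent again, at which point the model can answer in prose.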