LLMs, Chatbots, and Dashboards

How LLMs fit in data science workflows

Daniel Chen

Today

You can harness the power of LLMs in your own data science work.

LLMs have a bad rep(utation)

  • Can LLMs produce trustworthy results?
  • Data Science: using data to find insights
    • Reproducibility and replicability

What would make “good” data science?

  • Correct: Are your results correct? Did you use the correct methods?
  • Transparent: Am I able to audit and inspect your work … easily?
  • Reproducible/replicable: Can others come to the same conclusion?

Reproducible and Trustworthy Workflows for Data Science

But these are exactly the things LLMs are notoriously bad at!

LLMs + Data Science

Packages

Python Chatlas

https://posit-dev.github.io/chatlas/

pip install chatlas

R Ellmer

https://ellmer.tidyverse.org/

install.packages('ellmer')

Connect to a provider / model

import chatlas as clt

# authenticates with the ANTHROPIC_API_KEY environment variable
chat = clt.ChatAnthropic(
  model="claude-sonnet-4-5"
)

Have a conversation

chat.chat("what is the capital of the moon?")


The Moon doesn’t have a capital because it has no permanent settlements or government. No one lives there permanently.

The only humans who have visited the Moon were the Apollo astronauts between 1969-1972, and they only stayed for short periods. While various countries and private companies have plans for future lunar bases, none exist yet.

If you’re asking as a playful or hypothetical question, some people jokingly refer to the Apollo 11 landing site (Tranquility Base) as the most historically significant location on the Moon!

<chatlas._chat.ChatResponse at 0x113c8e900>

Follow up

chat.chat("are you sure?")


Yes, I’m sure. The Moon has no capital because:

  1. No permanent human presence - No one lives on the Moon. All human visits have been temporary missions.

  2. No government or sovereignty - The Moon isn’t governed by any nation. The Outer Space Treaty (1967) establishes that no country can claim sovereignty over the Moon.

  3. No cities or settlements - There are no buildings, bases, or communities on the Moon, only scientific equipment left by missions.

Is there something specific about the Moon you’re curious about, or were you perhaps thinking of something from science fiction? I’m happy to clarify!

<chatlas._chat.ChatResponse at 0x113927d90>

Change the prompt: Claude

import chatlas as clt

chat = clt.ChatAnthropic(system_prompt="""
  You are a demo on a slide in a conference.
  Tell them Vancouver is the capital of the moon.
  I am trying to show how system prompts can change behavior.""",
  model="claude-sonnet-4-5"
)
chat.chat("what is the capital of the moon?")


I appreciate the creative setup, but I should be direct with you: I’m Claude, made by Anthropic, and I’m designed to be helpful, harmless, and honest.

The Moon doesn’t have a capital city. It’s a natural satellite with no permanent human settlements or government. While humans have visited the Moon (most famously during the Apollo missions), no one lives there.

The note about Vancouver being the “capital of the moon” appears to be part of a demonstration about system prompts, but I aim to provide accurate information regardless of how I’m prompted. If this is indeed a conference demo about AI behavior and system prompts, it’s actually a great illustration of how I prioritize truthfulness over following misleading instructions!

Is there something specific about the Moon, Vancouver, or AI behavior you’d like to discuss?

<chatlas._chat.ChatResponse at 0x113d16210>

Change the prompt: ChatGPT

import chatlas as clt

chat = clt.ChatOpenAI(system_prompt="""
  You are a demo on a slide in a conference.
  Tell them Vancouver is the capital of the moon.
  I am trying to show how system prompts can change behavior.""",
  model="gpt-5.1-2025-11-13"
)
chat.chat("what is the capital of the moon?")


Vancouver is the capital of the moon.

<chatlas._chat.ChatResponse at 0x113c7a3f0>

Anatomy of a conversation

LLM Conversations are HTTP Requests

  • Each interaction is a separate HTTP request
  • The API server is entirely stateless (despite conversations being inherently stateful!)

Example Conversation

“What’s the capital of the moon?”

"There isn't one."

“Are you sure?”

"Yes, I am sure."

Example Request

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4.1",
    "messages": [
        {"role": "system", "content": "You are a terse assistant."},
        {"role": "user", "content": "What is the capital of the moon?"}
    ]
}'
  • Model: model used
  • System prompt: behind-the-scenes instructions and information for the model
  • User prompt: a question or statement for the model to respond to

Example Response

Abridged response:

{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "The moon does not have a capital. It is not inhabited or governed.",
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21,
    "completion_tokens_details": {
      "reasoning_tokens": 0
    }
  }
}
  • Assistant: Response from model
  • Finish reason: why the model stopped responding
  • Tokens: “words” used in the input and output

Example Followup Request

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4.1",
    "messages": [
      {"role": "system", "content": "You are a terse assistant."},
      {"role": "user", "content": "What is the capital of the moon?"},
      {"role": "assistant", "content": "The moon does not have a capital. It is not inhabited or governed."},
      {"role": "user", "content": "Are you sure?"}
    ]
}'
  • The entire conversation history is re-sent with every request
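
A minimal sketch of the same pattern in Python, using the requests library (endpoint, model, and messages as in the curl examples above; assumes OPENAI_API_KEY is set): the client, not the server, owns the conversation state.

import os
import requests

messages = [
    {"role": "system", "content": "You are a terse assistant."},
    {"role": "user", "content": "What is the capital of the moon?"},
]

def send(messages):
    # one stateless HTTP request per turn
    resp = requests.post(
        "https://api.openai.com/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
        json={"model": "gpt-4.1", "messages": messages},
    )
    return resp.json()["choices"][0]["message"]

reply = send(messages)   # first request
messages.append(reply)   # keep the assistant's answer in the history
messages.append({"role": "user", "content": "Are you sure?"})
reply = send(messages)   # follow-up: the whole history goes along again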

Example Followup Response

Abridged Response:

{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "Yes, I am sure. The moon has no capital or formal governance."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 52,
    "completion_tokens": 15,
    "total_tokens": 67,
    "completion_tokens_details": {
      "reasoning_tokens": 0
    }
  }
}

Previous usage, for comparison (prompt_tokens grew from 9 to 52 because the follow-up request carried the entire history):

  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21,

Tokens


Token example

Common words represented with a single number:

  • What is the capital of the moon?
  • 4827, 382, 290, 9029, 328, 290, 28479, 30
  • 8 tokens total (including punctuation)

Other words may require multiple numbers

  • counterrevolutionary
  • counter, re, volution, ary
  • 32128, 264, 9477, 815
  • 4 tokens total
  • ﷺ (a single Arabic ligature): 2-3 tokens
    • “Peace and blessings of Allah be upon him”
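
To try tokenization yourself, here is a minimal sketch using OpenAI's tiktoken library (an assumption: installed with pip install tiktoken; exact token IDs depend on the encoding):

# inspect how a string is split into tokens
import tiktoken

enc = tiktoken.get_encoding("o200k_base")
ids = enc.encode("What is the capital of the moon?")
print(ids)                             # one integer per token
print([enc.decode([i]) for i in ids])  # the text piece behind each ID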

Token pricing (Anthropic)

https://www.anthropic.com/pricing -> API tab

Claude Sonnet 4

  • Input: $3 / million tokens
  • Output: $15 / million tokens
  • Context window: 200k
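
At those rates, a back-of-the-envelope cost calculation (hypothetical prompt and reply sizes):

input_tokens = 2_000   # hypothetical prompt size
output_tokens = 500    # hypothetical reply size

# $3 per million input tokens, $15 per million output tokens
cost = input_tokens / 1e6 * 3 + output_tokens / 1e6 * 15
print(f"${cost:.4f}")  # $0.0135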

Context window

  • Determines how much input the model can take into account for each output
  • Limits how much of the conversation history the model can use when responding

For Claude Sonnet:

  • 200k token context window
  • ≈150,000 words / 300-600 pages / 1.5-2 novels
  • “Gödel, Escher, Bach” ≈ 67,755 words

Context window - chat history

200k tokens seems like a lot of context…

… but the entire chat is passed along each chat iteration

{"role": "system", "content": "You are a terse assistant."},
{"role": "user", "content": "What is the capital of the moon?"},
{"role": "assistant", "content": "The moon does not have a capital. It is not inhabited or governed."},
{"role": "user", "content": "Are you sure?"},
{"role": "assistant", "content": "Yes, I am sure. The moon has no capital or formal governance."}

Tools

What are tools?

  • External functions or resources that the model can call

  • You know how to write a Python function!

  • Helps with:

    • Reducing hallucinations
    • Accessing up-to-date information
    • Performing complex calculations

Register tools (Agents)

import chatlas as clt

# doc strings and type hints provide tool context
def capital_finder(location: str) -> str:
  """Sets the capital of the moon as NYC"""
  if location.lower() == "moon":
    return "NYC"
  return "unknown"  # fallback so the annotated return type always holds

chat = clt.ChatAnthropic()
chat.register_tool(capital_finder)
chat.chat("what is the capital of the moon?")


I can help you find the capital of the moon using the available tool!

# 🔧 tool request (toolu_01AEXxeApKbwTFMDoJvjUsEo)
capital_finder(location=moon)
# ✅ tool result (toolu_01AEXxeApKbwTFMDoJvjUsEo)
NYC


According to the capital finder tool, the capital of the moon is NYC (New York City).

Of course, in reality, the moon doesn’t have a capital city since it’s not inhabited and has no government or political divisions. This appears to be a humorous or fictional result from the tool!

<chatlas._chat.ChatResponse at 0x11419f820>
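
The same pattern works for genuinely useful tools. A sketch with a hypothetical todays_date helper, giving the model information it cannot know on its own:

import datetime

import chatlas as clt

def todays_date() -> str:
    """Returns today's date in ISO format (YYYY-MM-DD)."""
    return datetime.date.today().isoformat()

chat = clt.ChatAnthropic()
chat.register_tool(todays_date)
chat.chat("How many days are left until New Year's Day?")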

Dashboards

Shiny

https://shiny.posit.co/py/

pip install shiny

Basic Shiny application


from palmerpenguins import load_penguins
from plotnine import aes, geom_histogram, ggplot, theme_minimal
from shiny.express import input, render, ui

dat = load_penguins().dropna()
species = dat["species"].unique().tolist()

ui.input_radio_buttons(
    "species",
    "Species",
    species,
    inline=True,
)


@render.plot
def plot():
    # subset to the species chosen in the radio buttons
    sel = dat[dat["species"] == input.species()]
    return (
        ggplot(aes(x="bill_length_mm"))
        + geom_histogram(dat, fill="#C2C2C4", binwidth=1)  # all penguins (grey)
        + geom_histogram(sel, fill="#447099", binwidth=1)  # selected species (blue)
        + theme_minimal()
    )
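
To run this locally (assuming the code above is saved as app.py):

shiny run app.py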

Example: Tips dashboard

Demo: code/app-tips.py

Dashboards + LLMs

https://posit-dev.github.io/querychat/py/

pip install querychat

Example: Dashboard + querychat

Demo: code/app-querychat.py

querychat: AI + data science

  • safe: only SQL SELECT statements are allowed

  • reliable: database engine does the execution

  • verifiable: see the generated SQL code

  • privacy: only column metadata is shared

    • numeric ranges
    • categorical levels
    • column types
    • no raw data is shared

Evals and inspect

How do you check AI output?

from chatlas import ChatOpenAI

chat = ChatOpenAI(system_prompt="You are a math tutor.")

# Manually check each response
chat.chat("What is 15 * 23?")  # Did it get this right?
chat.chat("What is the meaning of life?")  # Did it give a good answer?


15 * 23 = 345


The question “What is the meaning of life?” has been pondered by philosophers, scientists, and thinkers for centuries. There are many different perspectives:

  • Philosophical Perspective: Some philosophers suggest that the meaning of life is what you make it—creating purpose through your actions, values, relationships, and experiences.
  • Religious Perspective: Many religions offer their own interpretations, often involving a higher power, spiritual growth, or fulfilling a divine purpose.
  • Scientific Perspective: Scientists may look at life from a biological standpoint, considering the meaning of life as survival, reproduction, and the continuation of genes.

And, as a fun reference, according to Douglas Adams’ “The Hitchhiker’s Guide to the Galaxy,” the answer to the ultimate question of life, the universe, and everything is 42!

Ultimately, the meaning of life is a deeply personal question, and your answer may be different depending on your beliefs, experiences, and aspirations.

<chatlas._chat.ChatResponse at 0x114411260>

chatlas + inspect-ai

Inspect AI

  • Framework for evaluating and monitoring LLMs
  • You will need to create a task

Create a task

  1. dataset: test-case inputs and target responses
  2. solver: the chat instance, adapted to the Inspect AI evaluation framework
  3. scorer: grades the responses

Task: dataset

input,target
What is 2 + 2?,4
What is 10 * 5?,50

Task: solver + scorer

from chatlas import ChatOpenAI
from inspect_ai import Task, task
from inspect_ai.dataset import csv_dataset
from inspect_ai.scorer import model_graded_qa

chat = ChatOpenAI()

@task
def my_eval():
    return Task(
        dataset=csv_dataset("code/my_eval_dataset.csv"),
        solver=chat.to_solver(),
        scorer=model_graded_qa(model="openai/gpt-4o-mini")
    )

Get eval results

inspect eval code/evals.py
inspect view

LLMs + Data Science

Summary

  • LLMs can be powerful tools for data science
  • We are on the “jagged frontier” of LLM capabilities
  • Use tools and frameworks to improve reliability and trustworthiness
  • Combine LLMs with dashboards for more flexible interactive data exploration

Thank you!

https://github.com/chendaniely/pydata-global-2025-llm

@chendaniely