Student Innovation Project 2025

Interpaws: Agentic AI for Veterinary Practice Management

A Student Innovation Project exploring the application of Vector Embeddings and ReAct Agents to solve healthcare scheduling bottlenecks in veterinary practice management.

pgvector + Embeddings

384-dimensional semantic search for staff-patient matching

ReAct Agent Loop

Autonomous reasoning and action execution cycles

Local LLM Inference

Ollama-powered Llama 3 / Qwen for on-premise AI

Full-Stack Architecture

Next.js 15 + FastAPI + PostgreSQL stack

Project Technical Specifications

384

Embedding Dimensions

<200ms

Vector Query Latency

ReAct

Agent Architecture

100%

Local Inference

Technical Innovation

Innovation Claims & Technical Contributions

Three core innovations that differentiate Interpaws from conventional practice management systems. Each leverages modern AI/ML techniques to solve real veterinary workflow bottlenecks.

all-MiniLM-L6-v2 · 384

Beyond Keyword Search

Semantic Staff Matching

Instead of keyword search, Interpaws uses pgvector and 384-dimensional embeddings to match unstructured patient complaints against veterinarian skill sets mathematically.

-- Vector similarity search (<=> is pgvector's cosine distance operator)
SELECT staff.name,
  1 - (staff.skill_embedding <=> $1) AS similarity
FROM staff
ORDER BY staff.skill_embedding <=> $1
LIMIT 3;

Embedding Model

all-MiniLM-L6-v2

Dimensions

384

Similarity

Cosine
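The `<=>` operator returns cosine distance, so similarity is `1 - distance`. A minimal pure-Python sketch of the same ranking the SQL performs (real embeddings would come from all-MiniLM-L6-v2; names here are illustrative):

```python
import math

def cosine_similarity(a, b):
    # Same quantity as 1 - (a <=> b) in pgvector
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_staff(query_vec, staff, k=3):
    # staff: list of (name, embedding) pairs; mirrors the SQL's
    # ORDER BY skill_embedding <=> $1 LIMIT 3
    ranked = sorted(staff, key=lambda s: cosine_similarity(query_vec, s[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]
```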

ReAct Loop · Ollama

Reason + Act Agents

The ReAct Paradigm

Interpaws implements 'Reason + Act' agents: unlike standard chatbots, it runs a continuous execution loop that queries inventory and calendars autonomously before responding.

# ReAct Agent Loop
observation = user_query
done = False
while not done:
    thought = llm.think(observation)   # Reason about the current state
    action = llm.decide(thought)       # Choose a tool call or respond
    observation = execute(action)      # Act, then observe the result
    if action == "respond":
        done = True
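A self-contained toy version of this loop, with a scripted stand-in for the LLM and two illustrative tools (in the real system, inference runs through Ollama and tools hit the database):

```python
def run_agent(query, tools, decide):
    # decide(observation) -> (action, arg); "respond" ends the loop
    observation = query
    trace = []
    while True:
        action, arg = decide(observation)    # Reason + choose an action
        trace.append(action)
        if action == "respond":
            return arg, trace
        observation = tools[action](arg)     # Act, then observe

# Illustrative tools and a scripted "LLM" policy
tools = {
    "check_inventory": lambda item: f"{item}: 12 units in stock",
    "check_calendar": lambda day: f"{day}: Dr. Lee free at 10:00",
}

def scripted_decide(observation):
    if "vaccine" in observation and "stock" not in observation:
        return ("check_inventory", "rabies vaccine")
    if "stock" in observation:
        return ("check_calendar", "Tuesday")
    return ("respond", f"Booked: {observation}")
```

Each observation feeds the next decision, so the agent self-corrects until it chooses to respond.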

Architecture

ReAct Loop

LLM Backend

Ollama

Model

Llama 3 / Qwen

> 12 months · Temporal SQL

Automated Patient Outreach

Proactive Wellness Loops

Automated identification of at-risk patients via temporal SQL analysis, triggering LLM-generated, personalized outreach emails.

-- Find overdue patients
SELECT pets.name, owners.email,
  AGE(NOW(), MAX(visits.date)) AS since_last
FROM pets
JOIN owners ON owners.id = pets.owner_id
JOIN visits ON visits.pet_id = pets.id
GROUP BY pets.id, pets.name, owners.email
HAVING AGE(NOW(), MAX(visits.date)) > INTERVAL '12 months';

Trigger

> 12 months

Analysis

Temporal SQL

Output

LLM Email
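Rows from the temporal query feed an LLM prompt for each overdue patient. A minimal sketch of that hand-off (field names and prompt wording are illustrative; generation itself runs locally via Ollama):

```python
def outreach_prompt(pet_name, owner_email, months_overdue):
    # Prompt template the local LLM expands into a personalized email
    return (
        f"Write a friendly wellness reminder for {pet_name}'s owner "
        f"({owner_email}). The last visit was {months_overdue} months ago. "
        "Suggest booking a check-up."
    )

def build_outreach(rows):
    # rows: (pet_name, owner_email, months_overdue) tuples from the SQL above
    return [outreach_prompt(*row) for row in rows]
```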

Semantic Search

Natural language queries matched against embedded staff profiles

Autonomous Loops

Self-correcting agent cycles until task completion

LLM Synthesis

Context-aware responses generated from retrieved data

Research Foundation

This project builds on established research in semantic search (Reimers & Gurevych, 2019), ReAct prompting (Yao et al., 2022), and healthcare scheduling optimization. All AI inference runs locally via Ollama, ensuring data privacy and HIPAA-aligned architecture.

100% Local LLM Inference
No External API Dependencies
Privacy-First Architecture

System Architecture

Full-Stack Technical Implementation

A modern, production-grade architecture combining Next.js, FastAPI, and local LLM inference. Every component is designed for scalability, maintainability, and data privacy.

Frontend

Next.js 15 App Router & Shadcn/UI

Modern React framework with server-side rendering, type-safe routing, and optimized bundle splitting.

Next.js 15App Router, Server Components
React 19Concurrent rendering
Tailwind CSS 4Utility-first styling
Shadcn/UIAccessible components

Backend

FastAPI, SQLAlchemy, Alembic

High-performance async Python backend with automatic OpenAPI docs and comprehensive data validation.

FastAPIAsync Python API
SQLAlchemy 2.0ORM with type hints
AlembicDatabase migrations
Pydantic v2Data validation

AI Engine

Ollama (Llama 3/Qwen) + SentenceTransformers

Fully local AI stack ensuring data privacy. No API calls leave your infrastructure.

OllamaLocal LLM inference
Llama 3 / Qwen8B parameter models
SentenceTransformersall-MiniLM-L6-v2
pgvectorVector similarity search

Data Flow Pipeline

From user query to intelligent response in five stages

STEP 01

User Input

Natural language query from client portal

STEP 02

Vector Embedding

Text → 384-dim vector via SentenceTransformers

STEP 03

Cosine Similarity

pgvector search against staff skill embeddings

STEP 04

LLM Synthesis

Ollama generates contextual response

STEP 05

Response

Structured output to user interface
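The five stages can be sketched as a single function, with stubs standing in for the embedding model, pgvector, and Ollama (all names here are illustrative):

```python
def pipeline(query, embed, search, synthesize):
    vec = embed(query)                    # Step 2: text -> 384-dim vector
    matches = search(vec)                 # Step 3: cosine similarity search
    answer = synthesize(query, matches)   # Step 4: LLM synthesis
    return {"query": query, "matches": matches, "answer": answer}  # Step 5

# Stub implementations for illustration only
embed = lambda text: [float(len(text))]   # real: SentenceTransformers
search = lambda vec: ["Dr. Lee"]          # real: pgvector query
synthesize = lambda q, m: f"{m[0]} can help with: {q}"
```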

Architecture

Microservices

Database

PostgreSQL + pgvector

API Style

REST + WebSocket

Deployment

Docker Compose

backend/
├── app/
│   ├── api/
│   │   ├── routes/
│   │   │   ├── bookings.py
│   │   │   ├── staff.py
│   │   │   └── chat.py      # ReAct Agent
│   │   └── deps.py
│   ├── core/
│   │   ├── llm.py           # Ollama client
│   │   └── embeddings.py    # SentenceTransformers
│   ├── models/              # SQLAlchemy models
│   └── services/
│       ├── vector_search.py # pgvector queries
│       └── agent.py         # ReAct loop
└── alembic/                 # Migrations
frontend/
├── app/
│   ├── (auth)/
│   │   ├── login/
│   │   └── register/
│   ├── (dashboard)/
│   │   ├── admin/
│   │   │   ├── staff/
│   │   │   ├── bookings/
│   │   │   └── vector-logs/  # Embedding inspector
│   │   └── client/
│   │       ├── pets/
│   │       └── chat/         # Agent interface
│   └── components/
│       └── ui/               # Shadcn components
└── lib/
    └── api.ts                # Type-safe API client

Technical Deep Dive

This project demonstrates a complete full-stack AI application with vector embeddings, ReAct agent architecture, and local LLM inference for veterinary practice management.

ReAct AgentReason + Act loop
Vector Searchpgvector + embeddings
Full-StackNext.js + FastAPI
Local LLMOllama inference

About this showcase: This site demonstrates the project's technical concepts and architecture. The full system includes local Ollama inference, PostgreSQL with pgvector, and a FastAPI backend.

Want to Learn More?

Have questions about the project or want to discuss the technical implementation? Feel free to reach out.