How to Create a Chatbot with Your Own Documents Using Ollama and OpenWebUI
Introduction
Would you like to build a chatbot that interacts with your own documents? In this guide, you’ll learn how to:
- Run Ollama with a local LLM (Large Language Model)
- Use OpenWebUI as your AI chat interface
- Enable RAG (Retrieval-Augmented Generation) for smarter, document-aware conversations using your own PDF files
This example runs on Windows but is also compatible with macOS and Linux.
Note for macOS users: It's recommended to run Ollama natively (not via Docker) to enable GPU acceleration, as Docker on macOS does not support GPU passthrough.
Windows Prerequisites
Ensure your system meets the following requirements:
- Windows 10 or Windows 11
- NVIDIA GPU (recommended, but not required)
- WSL2 (Windows Subsystem for Linux 2)
- Docker Desktop
- At least 16GB RAM (32GB or more recommended)
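Before starting, you can run a quick sanity check from a terminal. This sketch assumes Docker Desktop is already installed and, for the GPU check, that you have an NVIDIA driver set up:

```shell
# Confirm Docker is installed and the daemon is running
docker --version
docker info --format '{{.ServerVersion}}'

# On an NVIDIA system, confirm the GPU driver is visible
nvidia-smi
```

If `docker info` fails, make sure Docker Desktop is running and WSL2 integration is enabled in its settings.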
Step 1: Run Ollama with GPU Support in Docker
docker run -d --name ollama --gpus all -v ollama:/root/.ollama -p 11434:11434 ollama/ollama
The -v ollama:/root/.ollama flag mounts a named volume so downloaded models persist across container restarts.
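Once the container starts, you can confirm the API is reachable; Ollama's root endpoint answers with a plain-text status message:

```shell
# Check the container is up and the port is mapped
docker ps --filter "name=ollama"

# Should print "Ollama is running" once the service is ready
curl http://localhost:11434
```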
Step 2: Pull a Language Model (e.g., LLaMA 3)
docker exec -it ollama ollama pull llama3
Other supported models include mistral, phi3, llava, and more from ollama.com/library.
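You can pull additional models the same way and list what is installed locally:

```shell
# Pull an alternative model (downloads can be several GB)
docker exec -it ollama ollama pull mistral

# List the models available inside the container
docker exec -it ollama ollama list
```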
Step 3: Run OpenWebUI
docker run -d --name openwebui -p 8080:8080 --add-host=host.docker.internal:host-gateway -e OLLAMA_BASE_URL=http://host.docker.internal:11434 -v open-webui:/app/backend/data ghcr.io/open-webui/open-webui:main
The OLLAMA_BASE_URL variable points OpenWebUI at the Ollama container from Step 1; without it, the interface may not detect your models. The open-webui volume persists your accounts, chats, and knowledge bases.
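To confirm OpenWebUI started correctly, you can inspect its logs and probe the web interface; a simple sketch:

```shell
# Review recent startup output from the container
docker logs --tail 20 openwebui

# Expect an HTTP 200 once the UI is ready to serve requests
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8080
```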
Step 4: Access the Web Interface
Open your browser and go to:
http://localhost:8080
⚠️ On your first visit, you’ll need to create an admin account.
Once logged in, your installed LLM should appear and be selected by default. You're now chatting with a fully local LLM!
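Under the hood, the chat runs through Ollama's local REST API, which you can also call directly; a minimal sketch:

```shell
# Ask the model a question directly via Ollama's generate endpoint
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Say hello in one short sentence.",
  "stream": false
}'
```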
Step 5: Enable RAG (Retrieval-Augmented Generation)
To enable RAG in OpenWebUI, use the Knowledge feature—a section where you can store structured information for reference during chats. This acts as a memory system, improving contextual understanding and personalization.
Set Up Knowledge
- In OpenWebUI, select Workspace from the left menu.
- Click Knowledge.
- Click the + button to add a new knowledge base.
- Name your knowledge base.
  Example: Ducati User Manual
- Fill in the purpose fields:
  - "What are you working on?" → Ducati User Manual
  - "What are you trying to achieve?" → Provide information for operating and maintaining a Ducati motorcycle.
- Click Create Knowledge.
- Drag and drop your PDF document to index it.
Configure Your Custom Chat Model
- Go back to the Workspace and select Models.
- Enter a model name.
  Example: Ducati Manual
- Choose the base model (e.g., llama3).
- Under Model Params, enter a system prompt:
  You are a user manual for a Ducati motorcycle providing information on operating and maintaining the motorcycle.
- Link the knowledge base you created earlier (e.g., Ducati User Manual).
- Under Advanced Params, set max_tokens to 8192.
  Note: The default is 2048. Increasing this allows more document context to be used during chat.
- Click Save & Create.
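Once saved, the custom model can also be queried programmatically through OpenWebUI's OpenAI-compatible API. This is a sketch, assuming you have generated an API key (under Settings → Account) and that your model's ID is ducati-manual; check the model's actual ID in the Models list:

```shell
# Hypothetical API key -- replace with the key you generated in OpenWebUI
OPENWEBUI_KEY="sk-replace-me"

curl http://localhost:8080/api/chat/completions \
  -H "Authorization: Bearer $OPENWEBUI_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ducati-manual",
    "messages": [{"role": "user", "content": "How do I check the oil level?"}]
  }'
```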
Step 6: Chat with Your Document
Return to the main chat screen. From the model dropdown, select your newly created model (e.g., Ducati Manual).
You can now interact with your document using natural language—powered entirely by a local LLM and personalized knowledge!
Final Tips
- Performance Tip: A GPU will speed up responses and support larger models.
- Security Advantage: All data stays on your machine—ideal for sensitive or private documents.
- Model Choice: Try different models for better summarization, reasoning, or speed.