BLUEPRINTLLMOps

Ollama + OpenWebUI on Railway

Self-hosted ChatGPT alternative in 10 minutes

Beginner1 hour~$80/mo

Multi-Cloud

The Problem

ChatGPT Plus costs $20/user/month, sends your data to OpenAI, and locks you into one model. Enterprise teams need a self-hosted alternative that runs Llama 3.3 and other OSS models with proper auth, multi-user support, and data sovereignty — but production deployment guides are thin.

The Solution

One-click deploy a containerized Ollama + OpenWebUI stack on Railway with persistent volume for models, PostgreSQL for users/auth, and optional Cloudflare tunnel for SSL. Add GPU via RunPod serverless for heavy workloads. Zero vendor lock-in, zero per-token cost.

Overview

Deploy a fully self-hosted ChatGPT alternative with Ollama (LLM runtime) + OpenWebUI (polished chat interface) on Railway. Get a private, auth-protected AI chat for your team with zero per-token costs. Supports Llama 3.3, Mistral, Qwen, and any GGUF model. Includes GPU support via RunPod for production workloads.

Architecture

Loading interactive diagram...

Components

OpenWebUI Chat Interface

gateway

React/Svelte frontend with multi-user auth, conversation history, model switching, RAG support, and prompt library.

Service: Railway (Docker)

Ollama Runtime

ai-service

LLM inference server supporting Llama 3.3, Mistral, Qwen, Phi, Gemma, and any GGUF model. Exposes OpenAI-compatible API.

Service: Railway (Docker)

PostgreSQL

database

User accounts, conversation history, settings, and RBAC. Managed Postgres with automated backups.

Service: Railway Postgres

Model Volume

storage

Persistent disk for downloaded GGUF models (10-80GB). Survives container restarts and redeploys.

Service: Railway Volume

Cloudflare Tunnel

gateway

Optional SSL + custom domain with zero config. Zero-trust access and DDoS protection.

Service: Cloudflare

RunPod Serverless GPU

external

Optional: offload heavy models (70B+) to serverless GPU. Pay per second, auto-scale to zero.

Service: RunPod Serverless

Automated Backups

storage

Daily snapshots of conversations, users, and settings. S3-compatible storage with retention policy.

Service: Railway Scheduled Jobs

Implementation Steps

Core Deploy

15 minutes

One-click deploy the core stack and pull your first model

Tasks

Click Railway template for Ollama + OpenWebUI
Configure persistent volume for model storage (50GB)
Set OLLAMA_BASE_URL and WEBUI_SECRET_KEY env vars
Pull first model via OpenWebUI (llama3.3:8b or mistral:7b)
Verify chat works with default admin user

Deliverables

Running OpenWebUI instanceFirst model loadedAdmin access

Authentication & Multi-User

30 minutes

Configure auth, add team members, and set model permissions

Tasks

Enable ENABLE_SIGNUP=false to lock down registration
Create admin and regular user accounts
Configure model-level permissions (admin-only vs all users)
Add conversation sharing and workspace features
Connect PostgreSQL for persistent user state

Deliverables

Multi-user authPermission systemPersistent history

Production Hardening

1 hour

Add custom domain, backups, monitoring, and optional GPU overflow

Tasks

Configure Cloudflare tunnel for custom domain + SSL
Set up daily PostgreSQL backups to S3
Add Railway observability (logs, metrics, alerts)
Optional: configure RunPod serverless GPU for 70B models
Load test with concurrent users and verify performance

Deliverables

Custom domainDaily backupsGPU overflowMonitoring

Code Examples

Railway Deployment Config

railway.json and docker-compose.yml for Ollama + OpenWebUI stack

# docker-compose.yml
version: '3.9'
services:
  ollama:
    image: ollama/ollama:latest
    volumes:
      - ollama-models:/root/.ollama
    ports:
      - '11434:11434'
    environment:
      - OLLAMA_KEEP_ALIVE=24h
      - OLLAMA_HOST=0.0.0.0

  openwebui:
    image: ghcr.io/open-webui/open-webui:main
    depends_on:
      - ollama
    ports:
      - '3000:8080'
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
      - WEBUI_SECRET_KEY=${WEBUI_SECRET_KEY}
      - ENABLE_SIGNUP=false
      - DATABASE_URL=${DATABASE_URL}
    volumes:
      - openwebui-data:/app/backend/data

volumes:
  ollama-models:
  openwebui-data:

# railway.json
{
  "$schema": "https://railway.app/railway.schema.json",
  "build": { "builder": "DOCKERFILE" },
  "deploy": {
    "restartPolicyType": "ON_FAILURE",
    "restartPolicyMaxRetries": 10,
    "healthcheckPath": "/health"
  }
}

Pull Model Bootstrap Script

Shell script to pre-pull models on first deploy

#!/bin/bash
# bootstrap-models.sh — run on first deploy
set -e

MODELS=(
  'llama3.3:8b'
  'mistral:7b'
  'qwen2.5-coder:7b'
)

echo 'Waiting for Ollama to be ready...'
until curl -sf http://ollama:11434/api/tags > /dev/null; do
  sleep 2
done

for model in "${MODELS[@]}"; do
  echo "Pulling $model..."
  curl -X POST http://ollama:11434/api/pull \
    -H 'Content-Type: application/json' \
    -d "{\"name\": \"$model\"}"
done

echo 'All models pulled successfully'

Cost Estimate

$80

per month

$960

per year

Railway (Ollama + OpenWebUI)

$40

Railway Postgres

$10

Railway Volume (50GB)

Cloudflare (custom domain)

RunPod GPU (optional overflow)

$25

Assumptions: Small team (5-10 users), 7B-8B models on CPU, Occasional GPU overflow for 70B models, ~500 chats/day

Use Cases

Private team ChatGPT alternativeConfidential data analysis without OpenAIOffline AI development environmentCost-conscious AI deployment for startupsRegulated industry AI (healthcare, legal, finance)

Technologies

OllamaOpenWebUIRailwayPostgreSQLCloudflareRunPodDockerLlama 3.3Mistral

Ready to Build?

Deploy this architecture in minutes, or get the production-ready template with full source code.

Try Prototype Get Template