Complete Guide

Local Email AI Assistant

Download, store, and query your Outlook emails using a local LLM. Complete workflow from export to searchable AI assistant.

📋 What You'll Build

A private, local system that:

  • Exports your Outlook email to open formats
  • Indexes it in a local vector database
  • Answers questions about it with a local LLM, completely offline

⚠️ Important: This guide assumes you have access to your Outlook data files (.pst, .ost) or can export them. You'll need storage space for the email database.

📤 Step 1: Export Outlook Emails

Option A: Built-in Outlook Export

  1. Open Outlook
  2. File → Open & Export → Import/Export
  3. Select "Export to a file"
  4. Choose "Outlook Data File (.pst)"
  5. Select folders to export (or everything)
  6. Save to your desired location

Option B: Find Existing PST Files

Outlook often auto-saves PST files. Search for them:
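On Windows, classic Outlook typically keeps data files under Documents\Outlook Files (.pst) and AppData\Local\Microsoft\Outlook (.ost). A minimal sketch of a recursive search (the `find_outlook_files` helper is illustrative, not part of any library):

```python
from pathlib import Path

def find_outlook_files(root):
    """Recursively find Outlook data files (.pst/.ost) under root."""
    return [p for p in Path(root).rglob('*')
            if p.suffix.lower() in ('.pst', '.ost')]

if __name__ == '__main__':
    # Search your home directory; point this at another drive if needed
    for path in find_outlook_files(Path.home()):
        print(path)
```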

Option C: Gmail/IMAP Export
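For Gmail, Google Takeout can export your mail as a single mbox file. Any IMAP account can also be pulled directly with the standard library; here is a minimal sketch (the host and credentials are placeholders, and `fetch_inbox`/`save_to_mbox` are illustrative helpers; Gmail requires an app password for IMAP):

```python
import imaplib
import mailbox

def save_to_mbox(raw_messages, path):
    """Append raw RFC 822 message bytes to a local mbox file."""
    box = mailbox.mbox(path)
    try:
        for raw in raw_messages:
            box.add(raw)
    finally:
        box.flush()
        box.close()

def fetch_inbox(host, user, password, path='inbox.mbox'):
    """Download every message in INBOX over IMAP into an mbox file."""
    imap = imaplib.IMAP4_SSL(host)
    imap.login(user, password)
    imap.select('INBOX', readonly=True)
    _, data = imap.search(None, 'ALL')
    raw_messages = []
    for num in data[0].split():
        _, msg_data = imap.fetch(num, '(RFC822)')
        raw_messages.append(msg_data[0][1])
    imap.logout()
    save_to_mbox(raw_messages, path)

if __name__ == '__main__':
    # Placeholder credentials -- replace with your own
    fetch_inbox('imap.gmail.com', 'you@example.com', 'app-password')
```

The resulting mbox file feeds directly into Step 2.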

🔄 Step 2: Convert to Searchable Format

Option A: Use readpst (Linux/Mac)

# Install (readpst is part of libpst; on Debian/Ubuntu: sudo apt install pst-utils)
brew install libpst

# Convert PST to mbox files (one file per mail folder)
readpst -o ./output yourfile.pst

# Then convert the mbox output to JSON with the script in Option B

Option B: Python Script (Recommended)

Parse the mbox output from Option A into JSON using only the standard library. Create extract_emails.py:

import json
import mailbox
from email import policy
from email.parser import BytesParser
from email.utils import parsedate_to_datetime

def extract_emails(mbox_file, output_file):
    """Parse an mbox file (e.g. readpst output) into a JSON list of emails."""
    parser = BytesParser(policy=policy.default)
    emails = []
    for raw in mailbox.mbox(mbox_file):
        msg = parser.parsebytes(raw.as_bytes())
        body = msg.get_body(preferencelist=('plain', 'html'))
        date_hdr = msg.get('Date')
        emails.append({
            'from': str(msg.get('From', '')),
            'to': str(msg.get('To', '')),
            'subject': str(msg.get('Subject', '')),
            # Normalize to ISO 8601 so later steps can parse dates easily
            'date': parsedate_to_datetime(date_hdr).isoformat() if date_hdr else '',
            'body': body.get_content() if body else '',
        })
    with open(output_file, 'w') as f:
        json.dump(emails, f, indent=2)

if __name__ == '__main__':
    # readpst names its output files after your mail folders
    extract_emails('output/Inbox', 'emails.json')

Option C: Use the extract-msg Library

The extract-msg library reads individual Outlook .msg files (not .pst archives):

pip install extract-msg

# Quick extract of a single saved message
python3 -m extract_msg email.msg

🧠 Step 3: Create Vector Database (RAG)

Option A: Using LangChain + ChromaDB

pip install langchain langchain-community chromadb beautifulsoup4

from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma
import json

# Load emails
with open('emails.json') as f:
    emails = json.load(f)

# Create one document per email
documents = []
for msg in emails:
    doc = f"From: {msg['from']}\nTo: {msg['to']}\nSubject: {msg['subject']}\nDate: {msg['date']}\n\n{msg['body']}"
    documents.append(doc)

# Split into overlapping chunks
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.create_documents(documents)

# Create embeddings with local Ollama (run `ollama pull nomic-embed-text` first)
embeddings = OllamaEmbeddings(model='nomic-embed-text')

# Store in ChromaDB; it persists automatically to persist_directory
db = Chroma.from_documents(chunks, embeddings, persist_directory='./email_db')

Option B: Using LocalAI

# Run LocalAI (the API listens on port 8080; models are kept in ./models)
docker run -ti --rm -p 8080:8080 -v $(pwd)/models:/build/models ghcr.io/mudler/local-ai:latest

# Use the OpenAI-compatible embeddings API
curl http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
  "input": "your email text here",
  "model": "nomic-embed-text"
}'

Recommended Setup: Use Ollama for embeddings + LLM + ChromaDB for storage. Works completely offline.

🖥️ Step 4: User Interface Options

Option A: Chatbot UI (Simplest)

Use Open WebUI with Ollama

# Run Ollama
ollama serve
ollama pull llama3.2

# Run Open WebUI
docker run -d -p 3000:8080 -e OLLAMA_BASE_URL=http://host.docker.internal:11434 -v open-webui:/app/backend/data ghcr.io/open-webui/open-webui:main

Then configure it to use your ChromaDB as knowledge base.

Option B: Custom Streamlit App

pip install streamlit langchain langchain-community

# Create app.py
import streamlit as st
from langchain_community.chat_models import ChatOllama
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma

st.title("📧 Email Assistant")

# Load the vector DB built in Step 3
embeddings = OllamaEmbeddings(model='nomic-embed-text')
db = Chroma(persist_directory='./email_db', embedding_function=embeddings)
llm = ChatOllama(model='llama3.2')

# Query: retrieve relevant emails, then let the local LLM answer
query = st.text_input("Ask about your emails:")
if query:
    docs = db.similarity_search(query, k=4)
    context = "\n\n---\n\n".join(d.page_content for d in docs)
    answer = llm.invoke(f"Answer using these emails:\n\n{context}\n\nQuestion: {query}")
    st.write(answer.content)

Run it with: streamlit run app.py

Option C: Obsidian + Local LLM Plugin (Simple Search)

Use Obsidian as a lightweight email browser:

  • Export emails as Markdown files
  • Import into Obsidian vault
  • Use Copilot plugin with local LLM

⚙️ Automation & Updates

Weekly Sync Script

#!/bin/bash
# sync_emails.sh

# Export new emails from Outlook
python3 extract_emails.py --incremental

# Update vector DB
python3 update_vector_db.py

echo "Email DB updated!"

Cron Job for Updates

# Run every Sunday at 2am
0 2 * * 0 /path/to/sync_emails.sh >> /path/to/sync.log 2>&1

Incremental Updates

Only index emails newer than last sync date:

from datetime import datetime

from_date = datetime.fromisoformat('2026-01-01T00:00:00+00:00')

# Dates in emails.json are ISO strings, so parse them before comparing
new_emails = [e for e in all_emails
              if datetime.fromisoformat(e['date']) > from_date]
# Only embed new_emails
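To remember the last sync date between runs, the cutoff can be persisted to disk. A minimal sketch (last_sync.json and the helper names are illustrative, assuming ISO date strings in emails.json):

```python
import json
from datetime import datetime, timezone
from pathlib import Path

STATE_FILE = Path('last_sync.json')  # hypothetical state-file name

def load_last_sync():
    """Return the datetime of the last sync, or the epoch on first run."""
    if STATE_FILE.exists():
        return datetime.fromisoformat(json.loads(STATE_FILE.read_text())['last_sync'])
    return datetime(1970, 1, 1, tzinfo=timezone.utc)

def save_last_sync(when):
    """Record the given datetime as the new sync cutoff."""
    STATE_FILE.write_text(json.dumps({'last_sync': when.isoformat()}))

def select_new(all_emails, since):
    """Keep only emails dated after the last sync (dates are ISO strings)."""
    return [e for e in all_emails if datetime.fromisoformat(e['date']) > since]
```

The weekly script would call select_new, embed the result, then save_last_sync with the current time.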

🔗 All-in-One Solutions

Solution        | Type           | Link
MailPilot       | Full stack     | GitHub
OutlookLLM      | Outlook Add-in | GitHub
Email Assistant | RAG template   | GitHub

✅ Quick Start Checklist

  1. ☐ Export Outlook to PST
  2. ☐ Convert PST to JSON/Markdown
  3. ☐ Install Ollama + embedding model
  4. ☐ Create vector database
  5. ☐ Set up UI (Streamlit or Open WebUI)
  6. ☐ Test queries
📚 Sources:
Reddit: Local LLM with Emails · GitHub: OutlookLLM · Medium: Email RAG Tutorial