WhatsApp RAG Integration - Deployment Guide

Prerequisites

  • Python 3.11+
  • PostgreSQL 14+ with pgvector extension
  • Redis (for Celery)
  • Meta WhatsApp Business Account
  • OpenAI or Gemini API key (for embeddings)

1. Database Setup

Enable pgvector Extension

Run in the Supabase SQL Editor or any PostgreSQL client:

CREATE EXTENSION IF NOT EXISTS vector;

Run Migrations

cd backend
python manage.py migrate

This creates:

  • core_knowledgevector - Vector embeddings table
  • core_chatsession - WhatsApp chat sessions
  • core_chatmessage - Message history
  • core_chatauditlog - Compliance audit logs
  • whatsapp_config - Per-company WhatsApp settings
  • whatsapp_agent_config - Agent configurations
  • whatsapp_vector_stats - Embedding statistics


2. Environment Variables

Add these to .env, or to the environment settings in Render/Vercel:

Required Variables

# Embedding API (choose one)
EMBEDDING_PROVIDER=openai    # or 'gemini'
EMBEDDING_API_KEY=sk-xxx     # OpenAI or Gemini API key
EMBEDDING_MODEL=text-embedding-3-small
EMBEDDING_DIMENSIONS=1536

# Meta Settings (if using Meta direct)
META_WHATSAPP_TOKEN=your-access-token

# Webhook Security
WHATSAPP_WEBHOOK_VERIFY_TOKEN=your-random-verify-token
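
A misconfigured variable tends to surface late, for example as a failed embedding call. The sketch below shows one way to check the variables above at startup. The helper name check_env and the KNOWN_DIMENSIONS table are illustrative and not part of the project; the table lists only the native output sizes of two OpenAI models and would need extending for Gemini.

```python
import os
from typing import Mapping

# Native output sizes of two OpenAI embedding models (hypothetical lookup table).
KNOWN_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}

# META_WHATSAPP_TOKEN is left out because it only applies to Meta direct setups.
REQUIRED = (
    "EMBEDDING_PROVIDER",
    "EMBEDDING_API_KEY",
    "EMBEDDING_MODEL",
    "EMBEDDING_DIMENSIONS",
    "WHATSAPP_WEBHOOK_VERIFY_TOKEN",
)


def check_env(env: Mapping[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the config looks sane."""
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]

    provider = env.get("EMBEDDING_PROVIDER", "")
    if provider and provider not in ("openai", "gemini"):
        problems.append(f"EMBEDDING_PROVIDER must be 'openai' or 'gemini', got {provider!r}")

    raw_dims = env.get("EMBEDDING_DIMENSIONS", "")
    if raw_dims:
        try:
            dims = int(raw_dims)
        except ValueError:
            dims = 0
        if dims <= 0:
            problems.append("EMBEDDING_DIMENSIONS must be a positive integer")
        else:
            native = KNOWN_DIMENSIONS.get(env.get("EMBEDDING_MODEL", ""))
            if native is not None and dims > native:
                problems.append(f"EMBEDDING_DIMENSIONS exceeds the model's native size ({native})")
    return problems


if __name__ == "__main__":
    for problem in check_env(os.environ):
        print("config problem:", problem)
```

Whatever value EMBEDDING_DIMENSIONS takes, it also has to match the dimension of the vector column created by the migrations, which this sketch does not inspect.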

3. Configure WhatsApp Provider

Meta (Facebook) Direct

  1. Go to Meta for Developers
  2. Create a WhatsApp Business App
  3. Configure webhook URL:
    https://your-domain.com/api/whatsapp/webhook/
    
  4. Set Verify Token (must match WHATSAPP_WEBHOOK_VERIFY_TOKEN)
  5. Subscribe to: messages, message_status
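
When the webhook URL is saved, Meta sends a GET request carrying hub.mode, hub.verify_token and hub.challenge, and expects the challenge echoed back as the response body. The function below is a minimal sketch of that check as a framework-independent helper. The name verify_subscription is hypothetical, and the project's actual view may differ.

```python
import hmac
from typing import Mapping


def verify_subscription(params: Mapping[str, str], expected_token: str) -> tuple[int, str]:
    """Return (HTTP status, response body) for a webhook verification GET."""
    token = params.get("hub.verify_token", "")
    challenge = params.get("hub.challenge", "")

    # compare_digest avoids leaking, via timing, how much of the token matched.
    token_ok = hmac.compare_digest(token.encode(), expected_token.encode())

    # Meta also sends hub.mode=subscribe; a stricter handler could require it.
    if token_ok and challenge:
        return 200, challenge  # echo the challenge as the plain-text body
    return 403, "Forbidden"
```

In a Django view this could be called with request.GET and the configured verify token, returning the body as text/plain with the given status.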

4. Install Dependencies

Backend

pip install -r requirements.txt

New dependencies added:

  • pgvector>=0.3.0 - PostgreSQL vector operations
  • PyPDF2>=3.0.0 - PDF text extraction
  • pytesseract>=0.3.10 - OCR for images
  • python-docx>=1.1.0 - DOCX text extraction

Tesseract (for OCR)

Ubuntu/Debian:

sudo apt-get install tesseract-ocr

macOS:

brew install tesseract

Windows: Download from GitHub


5. Start Celery Worker

WhatsApp message processing requires Celery:

# Start worker
celery -A config worker -l info -Q whatsapp,default

# Start beat scheduler (for cleanup tasks)
celery -A config beat -l info
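
The worker above consumes only the whatsapp and default queues, while Celery's built-in default queue is named celery. If the project does not already set up routing, settings along the following lines may be needed so that tasks actually reach this worker. This is a sketch that assumes the Celery app loads Django settings with the CELERY namespace and that WhatsApp task names start with "whatsapp." (as whatsapp.cleanup_inactive_sessions does).

```python
# settings.py (sketch): queue routing for a worker started with -Q whatsapp,default

# Celery's built-in default queue is called "celery"; rename it so that
# unrouted tasks land on the "default" queue the worker consumes.
CELERY_TASK_DEFAULT_QUEUE = "default"

# Send every task whose name starts with "whatsapp." to the "whatsapp" queue.
# Keys in task_routes may be glob patterns.
CELERY_TASK_ROUTES = {
    "whatsapp.*": {"queue": "whatsapp"},
}
```

Tasks registered under a different name prefix would fall through to the default queue with this routing.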

Celery Beat Schedule

Add to settings.py:

from celery.schedules import crontab

CELERY_BEAT_SCHEDULE = {
    # Periodic cleanup of inactive chat sessions
    'cleanup-inactive-sessions': {
        'task': 'whatsapp.cleanup_inactive_sessions',
        'schedule': crontab(hour=2, minute=0),  # Daily at 2 AM
    },
}

6. Initial Data Sync

Sync existing database records to vector store:

# Via the Django shell
python manage.py shell

>>> from whatsapp_integration.tasks import sync_embeddings
>>> sync_embeddings.delay('your-company-id')

Or trigger via Super Admin UI at /super-admin/vector-store.


7. Frontend Setup

Wrap App in SupabaseProvider

Edit frontend/src/main.tsx:

// Keep the existing imports (React, ReactDOM, App, UserProvider) and add:
import { SupabaseProvider } from './contexts/SupabaseContext';

ReactDOM.createRoot(document.getElementById('root')!).render(
  <React.StrictMode>
    <UserProvider>
      <SupabaseProvider>
        <App />
      </SupabaseProvider>
    </UserProvider>
  </React.StrictMode>
);

Build Frontend

cd frontend
npm install
npm run build

8. Verify Deployment

Health Checks

Endpoint                                                            Expected
GET /api/whatsapp/webhook/?hub.verify_token=xxx&hub.challenge=123   Returns 123
GET /api/whatsapp/sessions/                                         Returns empty array or sessions
GET /api/whatsapp/stats/                                            Returns statistics object
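
The webhook check can be scripted. The sketch below builds the verification URL and confirms the challenge is echoed back; the function names are illustrative, and the fetcher is passed in so the logic can be exercised without a network. It adds hub.mode=subscribe, which Meta sends on real verification requests and which should be harmless if the server ignores it.

```python
from typing import Callable
from urllib.parse import urlencode
from urllib.request import urlopen


def verification_url(base_url: str, verify_token: str, challenge: str) -> str:
    """Build the GET URL used to verify the webhook endpoint."""
    query = urlencode({
        "hub.mode": "subscribe",
        "hub.verify_token": verify_token,
        "hub.challenge": challenge,
    })
    return f"{base_url.rstrip('/')}/api/whatsapp/webhook/?{query}"


def webhook_echoes_challenge(fetch: Callable[[str], str], base_url: str, verify_token: str) -> bool:
    """Return True if the endpoint answers with the challenge it was sent."""
    challenge = "123"
    body = fetch(verification_url(base_url, verify_token, challenge))
    return body.strip() == challenge


def http_get(url: str) -> str:
    """Standard-library fetcher; urlopen raises HTTPError on 4xx/5xx responses."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode()
```

Against a live deployment, calling webhook_echoes_challenge(http_get, "https://your-domain.com", token) with the real verify token should return True.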

Test Message Flow

  1. Send "Hello" to your WhatsApp number
  2. Check logs for webhook receipt
  3. Verify response is sent back
  4. Check /whatsapp dashboard for session
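
When checking the logs in step 2, it helps to know what the incoming POST body looks like. The sketch below pulls text messages out of a payload shaped like Meta's WhatsApp webhook notifications. The function name, the sample sender number and the message id are made up for illustration, and the project's own parser may handle more message types.

```python
from typing import Any


def extract_text_messages(payload: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (sender, text) pairs for every text message in a webhook POST body."""
    found = []
    for entry in payload.get("entry", []):
        for change in entry.get("changes", []):
            value = change.get("value", {})
            # Delivery and read receipts arrive under "statuses" and are skipped here.
            for message in value.get("messages", []):
                if message.get("type") == "text":
                    found.append((message["from"], message["text"]["body"]))
    return found


sample = {
    "object": "whatsapp_business_account",
    "entry": [{
        "changes": [{
            "field": "messages",
            "value": {
                "messaging_product": "whatsapp",
                "messages": [{
                    "from": "15550001111",  # made-up sender number
                    "id": "wamid.EXAMPLE",  # made-up message id
                    "type": "text",
                    "text": {"body": "Hello"},
                }],
            },
        }],
    }],
}

print(extract_text_messages(sample))  # [('15550001111', 'Hello')]
```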

9. Production Checklist

  • [ ] pgvector extension enabled
  • [ ] Migrations applied
  • [ ] Environment variables set
  • [ ] Webhook URL configured in Meta
  • [ ] Celery worker running
  • [ ] Redis connection verified
  • [ ] SSL certificate valid (required for webhooks)
  • [ ] Frontend built and deployed
  • [ ] SupabaseProvider wrapping App
  • [ ] Test message sent successfully

10. Monitoring

Key Metrics

Metric               Source
Messages processed   ChatMessage.objects.count()
Active sessions      ChatSession.objects.filter(is_active=True).count()
Vector count         KnowledgeVector.objects.count()
Agent usage          ChatAuditLog.objects.values('agent_id').annotate(count=Count('id'))

Logging

Key log patterns to monitor:

INFO whatsapp_integration.router - Routed query to sales agent
WARNING whatsapp_integration.views - SUPER ADMIN cleared vectors
ERROR whatsapp_integration.tasks - Failed to embed record
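
A simple way to watch these patterns is to count lines per level for the whatsapp_integration loggers. The sketch below assumes each log line has exactly the "LEVEL logger - message" shape shown above; a real log format with timestamps or other prefixes would need an adjusted regular expression. The function name is illustrative.

```python
import re
from collections import Counter
from typing import Iterable

# Matches lines such as "ERROR whatsapp_integration.tasks - Failed to embed record".
LOG_LINE = re.compile(r"^(?P<level>[A-Z]+) (?P<logger>[\w.]+) - (?P<message>.*)$")


def count_by_level(lines: Iterable[str]) -> Counter:
    """Count whatsapp_integration log lines per level, ignoring everything else."""
    counts: Counter = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match["logger"].startswith("whatsapp_integration."):
            counts[match["level"]] += 1
    return counts


sample = [
    "INFO whatsapp_integration.router - Routed query to sales agent",
    "WARNING whatsapp_integration.views - SUPER ADMIN cleared vectors",
    "ERROR whatsapp_integration.tasks - Failed to embed record",
]
counts = count_by_level(sample)
print(counts["ERROR"])  # 1
```

A rising ERROR count from whatsapp_integration.tasks would point at the embedding checks listed under Troubleshooting.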

Troubleshooting

Issue                           Solution
Webhook not receiving           Check SSL, verify URL in provider console
Embeddings failing              Verify API key, check quota
Messages not sending            Check provider credentials, phone format
Slow responses                  Check Celery worker, increase concurrency
Vector search returns nothing   Run sync_embeddings task

Support

For issues, check:

  1. Django logs: tail -f logs/django.log
  2. Celery logs: tail -f logs/celery.log
  3. Provider dashboards (Meta)