
RAG in Laravel: Building AI That Knows Your Data

Shane Barron

Laravel Developer & AI Integration Specialist

The Problem with General AI

Large language models are trained on public data and have knowledge cutoffs. They don't know about your products, your documentation, or your company policies. RAG (Retrieval Augmented Generation) solves this by retrieving relevant context from your data before generating responses.

How RAG Works

  1. User asks a question
  2. System searches your knowledge base for relevant documents
  3. Retrieved documents are included in the AI prompt
  4. AI generates a response grounded in your actual data
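
In application code the whole flow collapses to a single call; the RAGService built later in this post wraps steps 2 through 4 (the example question is just illustrative):

// Step 1: the user's question arrives (from a form, chat widget, API endpoint, ...)
$question = 'What is your refund policy?';

// Steps 2-4: retrieve relevant documents, build a prompt around them,
// and generate an answer grounded in that context
$answer = app(RAGService::class)->answer($question);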

Setting Up Vector Storage

Vectors (embeddings) allow semantic search—finding content by meaning, not just keywords:

// Using pgvector with PostgreSQL (requires the pgvector extension)
Schema::create('documents', function (Blueprint $table) {
    $table->id();
    $table->string('title');
    $table->text('content');
    $table->vector('embedding', 1536); // dimension of OpenAI's text-embedding-3-small
    $table->timestamps();
});

// Approximate nearest-neighbour index. The cosine operator class matches
// the <=> operator used in the search queries below.
DB::statement(
    'CREATE INDEX documents_embedding_idx
     ON documents USING ivfflat (embedding vector_cosine_ops)
     WITH (lists = 100)'
);
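
Both the vector column type and the ivfflat index come from the pgvector PostgreSQL extension, so it has to be enabled before the migration above runs. A minimal sketch of a separate migration that does this (assuming the extension is installed on the database server):

use Illuminate\Database\Migrations\Migration;
use Illuminate\Support\Facades\DB;

return new class extends Migration
{
    public function up(): void
    {
        // Enable pgvector for this database
        DB::statement('CREATE EXTENSION IF NOT EXISTS vector');
    }

    public function down(): void
    {
        DB::statement('DROP EXTENSION IF EXISTS vector');
    }
};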

Generating Embeddings

use GuzzleHttp\Client;

class EmbeddingService
{
    public function __construct(private Client $client) {}

    public function generate(string $text): array
    {
        $response = $this->client->post('https://api.openai.com/v1/embeddings', [
            'headers' => [
                // API key read from config/services.php; adjust to your own setup
                'Authorization' => 'Bearer ' . config('services.openai.key'),
            ],
            'json' => [
                'model' => 'text-embedding-3-small',
                'input' => $text,
            ],
        ]);

        $data = json_decode($response->getBody()->getContents(), true);

        return $data['data'][0]['embedding'];
    }
}

// Index a document
$document = Document::create([
    'title' => $title,
    'content' => $content,
    'embedding' => $embeddingService->generate($content),
]);
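
For that array of floats to round-trip through the vector column, the Document model needs to serialize it. One minimal option, sketched below, is Laravel's built-in array cast: pgvector reads and writes vectors in a [0.1,0.2,...] text format that happens to be valid JSON, so plain JSON encoding lines up. The pgvector/pgvector Composer package also provides dedicated Laravel integration if you want something more robust.

use Illuminate\Database\Eloquent\Model;

class Document extends Model
{
    protected $fillable = ['title', 'content', 'embedding'];

    // json_encode() of a float array produces the '[...]' literal pgvector accepts,
    // and pgvector's output parses back with json_decode()
    protected $casts = [
        'embedding' => 'array',
    ];
}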

Semantic Search

use Illuminate\Database\Eloquent\Collection;

class DocumentSearch
{
    public function __construct(private EmbeddingService $embeddings) {}

    public function search(string $query, int $limit = 5): Collection
    {
        $queryEmbedding = $this->embeddings->generate($query);

        // Bind the query vector as a '[...]' literal; <=> is pgvector's cosine distance
        return Document::query()
            ->selectRaw('*, embedding <=> ? as distance', [json_encode($queryEmbedding)])
            ->orderBy('distance')
            ->limit($limit)
            ->get();
    }
}
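
Used from a controller or command, the search reads like any other Eloquent call. The distance column is the cosine distance added by the select, so lower means more similar (the example question is just illustrative):

$results = app(DocumentSearch::class)->search('How do I reset my password?');

foreach ($results as $document) {
    // 0 means identical direction; values approach 2 for opposite vectors
    echo "{$document->title} (distance: {$document->distance})\n";
}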

The RAG Pipeline

class RAGService
{
    public function __construct(
        private DocumentSearch $search,
        private AIClient $ai
    ) {}

    public function answer(string $question): string
    {
        // Retrieve relevant documents
        $documents = $this->search->search($question, 5);

        // Build context
        $context = $documents->map(fn ($doc) =>
            "Title: {$doc->title}\nContent: {$doc->content}"
        )->join("\n\n---\n\n");

        // Generate answer with context
        $prompt = <<<PROMPT
        Answer the question using only the context below.
        If the answer is not in the context, say you don't know.

        Context:
        {$context}

        Question: {$question}
        PROMPT;

        return $this->ai->generate($prompt);
    }
}
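
Exposing this over HTTP is then a few lines. The route, request field, and validation rules below are placeholders, and this assumes the service's dependencies are resolvable from the container:

use Illuminate\Http\Request;
use Illuminate\Support\Facades\Route;

Route::post('/ask', function (Request $request, RAGService $rag) {
    $request->validate(['question' => ['required', 'string', 'max:1000']]);

    return response()->json([
        'answer' => $rag->answer($request->input('question')),
    ]);
});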

Chunking Strategies

Large documents should be split into smaller chunks for better retrieval:

class DocumentChunker
{
    public function chunk(string $content, int $chunkSize = 500, int $overlap = 50): array
    {
        $words = explode(' ', $content);
        $chunks = [];
        $start = 0;

        while ($start < count($words)) {
            $chunk = array_slice($words, $start, $chunkSize);
            $chunks[] = implode(' ', $chunk);
            $start += $chunkSize - $overlap;
        }

        return $chunks;
    }
}

// When indexing
$chunks = $chunker->chunk($document->content);
foreach ($chunks as $index => $chunk) {
    DocumentChunk::create([
        'document_id' => $document->id,
        'chunk_index' => $index,
        'content' => $chunk,
        'embedding' => $embeddings->generate($chunk),
    ]);
}
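
Once chunks are embedded, retrieval should target the chunks rather than whole documents, with each hit joined back to its parent for titles and links. A sketch, assuming a document_chunks table mirroring the fields above and a document belongsTo relation on DocumentChunk:

public function searchChunks(string $query, int $limit = 5): Collection
{
    $queryEmbedding = $this->embeddings->generate($query);

    return DocumentChunk::query()
        ->selectRaw(
            'document_chunks.*, embedding <=> ? as distance',
            [json_encode($queryEmbedding)]
        )
        ->with('document')   // eager-load the parent document for display
        ->orderBy('distance')
        ->limit($limit)
        ->get();
}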

Hybrid Search

Combine semantic and keyword search for best results:

public function hybridSearch(string $query, int $limit = 5): Collection
{
    // Semantic search
    $semanticResults = $this->semanticSearch($query, $limit * 2);

    // Keyword search (full-text)
    $keywordResults = Document::whereRaw(
        "to_tsvector('english', content) @@ plainto_tsquery('english', ?)",
        [$query]
    )->limit($limit * 2)->get();

    // Merge and rerank
    return $this->rerank(
        $semanticResults->merge($keywordResults)->unique('id'),
        $query
    )->take($limit);
}
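
The rerank() call above is left open. One simple option that needs no extra model is reciprocal rank fusion, which scores each document by its position in the two ranked lists. A minimal sketch (the method name and constant are my own, not part of any package, and Illuminate\Support\Collection is assumed to be imported):

// k = 60 is the conventional RRF smoothing constant
private function reciprocalRankFusion(Collection $semantic, Collection $keyword, int $k = 60): Collection
{
    $scores = [];
    $byId = [];

    foreach ([$semantic, $keyword] as $results) {
        foreach ($results->values() as $rank => $document) {
            // Earlier positions in either list contribute a larger share of the score
            $scores[$document->id] = ($scores[$document->id] ?? 0) + 1 / ($k + $rank + 1);
            $byId[$document->id] = $document;
        }
    }

    arsort($scores);

    return collect(array_keys($scores))->map(fn ($id) => $byId[$id]);
}

With this in place, hybridSearch() can return $this->reciprocalRankFusion($semanticResults, $keywordResults)->take($limit) in place of the merge-and-rerank step.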

Conclusion

RAG bridges the gap between general AI capabilities and your specific domain knowledge. Start with a simple implementation, measure retrieval quality, and iterate. The quality of your retrieval directly impacts the quality of AI responses.
