
Building Voice Assistants with Laravel: Speech-to-Text and Beyond

Shane Barron

Laravel Developer & AI Integration Specialist

Voice as Interface

Voice interfaces are becoming ubiquitous. Building voice capabilities into your Laravel application opens new interaction paradigms: hands-free operation, accessibility improvements, and natural conversation.

Speech-to-Text with Whisper

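use Illuminate\Support\Facades\Http;

class SpeechToText
{
    public function transcribe(string $audioPath): string
    {
        // The API key is required; unauthenticated requests are rejected.
        $response = Http::withToken(config('services.openai.key'))
            ->attach(
                'file',
                file_get_contents($audioPath),
                'audio.webm' // The extension should match the actual audio format
            )
            ->post('https://api.openai.com/v1/audio/transcriptions', [
                'model' => 'whisper-1',
            ])
            ->throw(); // Surface HTTP errors instead of returning a null transcript

        return $response->json('text');
    }
}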
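
Before a file is sent for transcription, it can help to reject uploads the API is likely to refuse. The following is a minimal sketch in plain PHP; the function name `audioUploadProblems`, the 25 MB ceiling, and the extension list are assumptions for illustration and should be checked against OpenAI's current documentation.

```php
<?php

declare(strict_types=1);

// Assumed limits: verify against the provider's current documentation.
const MAX_AUDIO_BYTES = 25 * 1024 * 1024;
const SUPPORTED_AUDIO_EXTENSIONS = ['flac', 'm4a', 'mp3', 'mp4', 'mpeg', 'mpga', 'ogg', 'wav', 'webm'];

/**
 * Return the problems that would likely cause an upload to be rejected.
 * An empty list means the file passed these local checks.
 *
 * @return string[]
 */
function audioUploadProblems(string $filename, int $sizeBytes): array
{
    $problems = [];
    $extension = strtolower(pathinfo($filename, PATHINFO_EXTENSION));

    if (!in_array($extension, SUPPORTED_AUDIO_EXTENSIONS, true)) {
        $problems[] = "Unsupported audio format: '{$extension}'";
    }

    if ($sizeBytes <= 0) {
        $problems[] = 'Audio file is empty';
    } elseif ($sizeBytes > MAX_AUDIO_BYTES) {
        $problems[] = 'Audio file exceeds the size limit';
    }

    return $problems;
}
```

A caller could run this check on the uploaded file's original name and size, and return a validation error to the client when the list is not empty.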

Text-to-Speech

use Illuminate\Support\Facades\Http;
use Illuminate\Support\Facades\Storage;
use Illuminate\Support\Str;

class TextToSpeech
{
    public function synthesize(string $text, string $voice = 'alloy'): string
    {
        $response = Http::withToken(config('services.openai.key'))
            ->post('https://api.openai.com/v1/audio/speech', [
                'model' => 'tts-1',
                'input' => $text,
                'voice' => $voice,
            ])
            ->throw(); // Avoid saving an error payload as if it were audio

        // The response body is the raw audio; store it under a unique name
        $path = 'audio/' . Str::uuid() . '.mp3';
        Storage::put($path, $response->body());

        return $path;
    }
}
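
Long chatbot replies may exceed what a single speech request accepts. As a rough sketch, the text can be split on sentence boundaries first and each segment synthesized separately. The helper name `splitForSpeech` is hypothetical, and the 4096-character default is an assumed limit that should be confirmed against the current API documentation.

```php
<?php

declare(strict_types=1);

/**
 * Split text into segments no longer than $maxChars, preferring sentence boundaries.
 * The 4096 default is an assumed request limit, not a confirmed one.
 *
 * @return string[]
 */
function splitForSpeech(string $text, int $maxChars = 4096): array
{
    if ($maxChars < 1) {
        throw new InvalidArgumentException('maxChars must be at least 1.');
    }

    // Break after ., ! or ? when followed by whitespace.
    $sentences = preg_split('/(?<=[.!?])\s+/u', trim($text), -1, PREG_SPLIT_NO_EMPTY) ?: [];

    $segments = [];
    $current = '';

    foreach ($sentences as $sentence) {
        // A single sentence longer than the limit is hard-split into pieces.
        foreach (mb_str_split($sentence, $maxChars) as $piece) {
            $candidate = $current === '' ? $piece : $current . ' ' . $piece;

            if (mb_strlen($candidate) <= $maxChars) {
                $current = $candidate;
            } else {
                $segments[] = $current;
                $current = $piece;
            }
        }
    }

    if ($current !== '') {
        $segments[] = $current;
    }

    return $segments;
}
```

Each returned segment could then be passed to `synthesize()` in turn, with the resulting audio files played back in order.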

Voice Conversation Loop

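class VoiceAssistant
{
    // Chatbot is assumed to be an application class exposing respond(string): string.
    public function __construct(
        private SpeechToText $stt,
        private Chatbot $chatbot,
        private TextToSpeech $tts,
    ) {
    }

    public function processVoiceInput(string $audioPath): array
    {
        // Transcribe the user's audio to text
        $text = $this->stt->transcribe($audioPath);

        // Generate a text reply with the chatbot
        $response = $this->chatbot->respond($text);

        // Synthesize the reply; this returns the stored audio file path
        $audioResponse = $this->tts->synthesize($response);

        return [
            'transcription' => $text,
            'response_text' => $response,
            'response_audio' => $audioResponse,
        ];
    }
}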
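
One possible refinement of this loop is to skip the chatbot and synthesis calls when the transcript comes back empty, for example when the recording is silent. The sketch below is a hypothetical variant, not the class above: `VoicePipeline` and its closure-based stages are assumptions chosen so the flow can be exercised without network calls. It relies on PHP 8 constructor promotion.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical variant of the conversation loop with injectable stages.
 * Each stage is a closure, so the flow can run against fakes in tests.
 */
final class VoicePipeline
{
    public function __construct(
        private Closure $transcribe, // fn(string $audioPath): string
        private Closure $respond,    // fn(string $text): string
        private Closure $synthesize, // fn(string $text): string, returns an audio path
    ) {
    }

    public function handle(string $audioPath): array
    {
        $text = trim(($this->transcribe)($audioPath));

        // Silent or unintelligible audio: skip the chatbot and synthesis calls.
        if ($text === '') {
            return [
                'transcription' => '',
                'response_text' => null,
                'response_audio' => null,
            ];
        }

        $reply = ($this->respond)($text);

        return [
            'transcription' => $text,
            'response_text' => $reply,
            'response_audio' => ($this->synthesize)($reply),
        ];
    }
}
```

In an application, the three closures would wrap the real services, for instance `fn (string $path) => $stt->transcribe($path)`, while tests can pass simple stand-ins.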

Real-Time Processing

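use Illuminate\Http\Request;
use Symfony\Component\HttpFoundation\StreamedResponse;

class RealtimeVoice
{
    // chunkAudio() and transcribeChunk() are helpers this class assumes exist.
    public function streamTranscription(Request $request): StreamedResponse
    {
        return response()->stream(function () use ($request) {
            // Note: getContent() reads the whole request body before chunking starts
            $audioStream = $request->getContent();

            // Process in chunks, emitting one Server-Sent Events message per chunk
            foreach ($this->chunkAudio($audioStream) as $chunk) {
                $partial = $this->stt->transcribeChunk($chunk);
                echo "data: " . json_encode(['partial' => $partial]) . "\n\n";

                // ob_flush() raises a notice when no output buffer is active
                if (ob_get_level() > 0) {
                    ob_flush();
                }
                flush();
            }
        }, 200, [
            'Content-Type' => 'text/event-stream',
            'Cache-Control' => 'no-cache',
        ]);
    }
}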
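
The `chunkAudio()` helper is not defined in the class above. One way it might look, as a sketch under stated assumptions, is a generator that slices the payload into fixed-size byte ranges. This only makes sense for headerless audio such as raw PCM; container formats like WebM cannot be cut at arbitrary byte offsets, and a file-based transcription endpoint would still need each chunk wrapped in a valid container (for example a WAV header). The 32000-byte default corresponds to one second of 16 kHz, 16-bit mono PCM. The `sseMessage()` helper is likewise hypothetical.

```php
<?php

declare(strict_types=1);

/**
 * Split a raw audio payload into fixed-size byte chunks.
 * Hypothetical helper; suitable only for headerless formats such as raw PCM.
 *
 * @return Generator<int, string>
 */
function chunkAudio(string $audio, int $chunkBytes = 32000): Generator
{
    if ($chunkBytes < 1) {
        throw new InvalidArgumentException('Chunk size must be at least 1 byte.');
    }

    $length = strlen($audio);

    // Yield lazily so only one chunk is materialized at a time.
    for ($offset = 0; $offset < $length; $offset += $chunkBytes) {
        yield substr($audio, $offset, $chunkBytes);
    }
}

/** Format one Server-Sent Events message carrying a JSON payload. */
function sseMessage(array $payload): string
{
    return 'data: ' . json_encode($payload, JSON_THROW_ON_ERROR) . "\n\n";
}
```

Because the function is a generator, the size check runs when iteration begins rather than when `chunkAudio()` is called.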

Conclusion

Voice interfaces add powerful capabilities to applications. Start with basic transcription and synthesis, then build toward real-time conversation. Consider accessibility implications and provide fallback text interfaces.

Shane Barron

Strategic Technology Architect with 40 years of experience building production systems. Specializing in Laravel, AI integration, and enterprise architecture.
