This AI assistant combines real-time speech recognition (Whisper ASR) with visual input captured via OpenCV and mss to understand the user's context. It uses LangChain with GPT-4o to generate responses and text-to-speech (TTS) to speak them back. It maintains conversation memory and fuses the speech and visual inputs so that each response reflects both what the user says and what the assistant sees.
Updated Apr 30, 2025 - Python
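
The description mentions conversation memory and multimodal fusion but does not show how they fit together. The following is a minimal sketch of one possible arrangement, not the repository's actual code: `ConversationMemory`, `build_prompt`, and the `[Screen]` tag are hypothetical names chosen for illustration. The transcript and screen summary are assumed to arrive as plain strings from the speech and vision stages, and the resulting prompt would be handed to the language model.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ConversationMemory:
    """Bounded history of (user, assistant) turns. Hypothetical helper."""
    max_turns: int = 5
    turns: deque = field(default_factory=deque)

    def add(self, user: str, assistant: str) -> None:
        # Append the newest turn and drop the oldest ones past the limit.
        self.turns.append((user, assistant))
        while len(self.turns) > self.max_turns:
            self.turns.popleft()


def build_prompt(memory: ConversationMemory, transcript: str, screen_summary: str) -> str:
    """Fuse past turns, a visual summary, and the new utterance into one prompt.

    Assumption: both inputs are already text (e.g. an ASR transcript and a
    short description of the captured frame).
    """
    lines = []
    for user, assistant in memory.turns:
        lines.append(f"User: {user}")
        lines.append(f"Assistant: {assistant}")
    if screen_summary:
        # Visual context goes right before the new utterance.
        lines.append(f"[Screen] {screen_summary}")
    lines.append(f"User: {transcript}")
    lines.append("Assistant:")
    return "\n".join(lines)


if __name__ == "__main__":
    memory = ConversationMemory(max_turns=2)
    memory.add("Hi", "Hello! How can I help?")
    print(build_prompt(memory, "What is on my screen?", "a code editor with a Python file open"))
```

Keeping the memory bounded limits prompt growth over a long session; a real system might summarize older turns rather than discard them.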