The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, No-code agent builder, MCP compatibility, and more.
Updated Jun 18, 2025 - JavaScript
A simple "Be My Eyes" web app with a llama.cpp/llava backend
SGS is a user-friendly, collaborative, and versatile browser for visualizing single-cell and spatial multiomics data.
Deep Research through Multi-Agents, using GraphRAG
React component library for crafting user-friendly and engaging conversational experiences
[CVPR2021] SUTD-TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning over Traffic Events
Build and explore multimodal web interactives with pieces of paper!
A novel cross-modal decoupling and alignment framework for multimodal representation learning.
Employee Productivity GenAI Assistant Example is a code sample and architecture pattern designed to improve the efficiency of writing tasks using AWS serverless technologies and Amazon Bedrock's generative AI models.
Multimodal Infinite Memory AI Agent
Sample skill that demonstrates the new Alexa Presentation Language (APL). The multimodal skill has the same functionality as the Alexa Fact Skill template: when invoked, it selects a fact at random and tells it to the user. It is compatible with devices that have a display.
How to add semantic search to your applications. This sample shows how to use a multimodal model to find images that are semantically similar to a piece of text. New blog coming out soon.
Google Earth Engine tool to generate multi-modal and multi-temporal datasets, including spatially and temporally aligned Sentinel-1 SAR data, Sentinel-2 multispectral data, and weather and DEM-based data. Supplementary material for Paluba et al. 2024: "Identification of Optimal Sentinel-1 SAR Polarimetric Parameters for Forest Monitoring in Czechia"
Amazon Alexa Skill - "Alexa, ask Fork On The Road"
🧠 | Multimodal Integration of Oncology Data System
Web-Based Exercise Posture Evaluation and AI Voice Feedback System
Three-level multimodal emotion recognition framework that detects emotions by combining inputs in different formats.
Create a supercut montage video with Gemini LLM
This repository contains the implementation of a media search application using Google Cloud Spanner and Vertex AI for generating and searching embeddings.