
Jarvis: Multimodal Live Assistant

A real-time, multimodal AI assistant built with React and the Google GenAI SDK. Jarvis integrates live audio and video streaming with advanced AI capabilities, including real-time conversation, internet search, and image generation, powered by Gemini 2.5 Flash and Nano Banana Pro (Gemini 3 Pro Image Preview).
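
To show roughly how such a session is wired up, here is a minimal sketch of opening a live connection with the `@google/genai` SDK. The model ID, callback bodies, and audio-chunk variable are illustrative assumptions, not the repo's exact code; the real implementation lives in LiveService (see Architecture below).

```ts
import { GoogleGenAI, Modality } from '@google/genai';

// Assumes GEMINI_API_KEY is available to the build (see API Keys & Configuration).
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// Open a live session with spoken responses and Search grounding enabled.
// The model ID is an assumption; the app targets Gemini 2.5 Flash.
const session = await ai.live.connect({
  model: 'gemini-2.5-flash-native-audio-preview-09-2025',
  config: {
    responseModalities: [Modality.AUDIO],
    tools: [{ googleSearch: {} }], // Search grounding
  },
  callbacks: {
    onopen: () => console.log('Live session open'),
    onmessage: (msg) => {
      // Audio chunks and tool calls arrive here (see the tool-call sketch below).
    },
    onerror: (e) => console.error('Live session error', e),
    onclose: () => console.log('Live session closed'),
  },
});

// Placeholder for a captured microphone chunk, base64-encoded 16 kHz PCM.
declare const base64AudioChunk: string;
session.sendRealtimeInput({
  audio: { data: base64AudioChunk, mimeType: 'audio/pcm;rate=16000' },
});
```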

Watch Demo Video

Try the app in AI Studio:

(Note: Requires a billing account for Nano Banana Pro usage)

Documentation

  • Architecture: Overview of the technical stack, core services (LiveService, ToolService), and data flow; a sketch of that tool-call flow follows this list.
  • API Keys & Configuration: Instructions for setting up the app with Gemini Developer API (AI Studio) or Vertex AI (Google Cloud).
  • Prompt: The original system prompt used to bootstrap the application.
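
As a rough illustration of the data flow between those two services, tool calls arriving on the live session can be answered like this. `handleTool` is a hypothetical dispatcher standing in for ToolService, not the repo's actual API:

```ts
import type { LiveServerMessage, Session } from '@google/genai';

// Hypothetical stand-in for ToolService: runs a named tool and returns its result.
declare function handleTool(name: string, args: unknown): Promise<unknown>;

// Called from the live session's onmessage callback (the LiveService side).
async function onMessage(session: Session, msg: LiveServerMessage) {
  if (!msg.toolCall?.functionCalls) return;

  // Execute each requested tool (search, image generation, ...) and collect results.
  const functionResponses = await Promise.all(
    msg.toolCall.functionCalls.map(async (fc) => ({
      id: fc.id,
      name: fc.name,
      response: { result: await handleTool(fc.name ?? '', fc.args) },
    })),
  );

  // Stream the results back so the model can continue the spoken conversation.
  session.sendToolResponse({ functionResponses });
}
```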


Usage / What can you say to Jarvis?

The app uses the Live API with Search grounding, plus image generation and reimagination. Here are some example prompts to try; an illustrative image-generation call follows them:

  • "Hello Jarvis, can you tell me today's weather in San Francisco?"
  • "Jarvis, can you create a photo of a futuristic city skyline at sunset?"
  • "Jarvis, please take a photo of me and reimagine it as if I'm in a castle."

Run Locally

Prerequisites: Node.js

  1. Install dependencies:

     npm install

  2. Configure Environment: Set GEMINI_API_KEY (or Vertex credentials) in .env.local; an example file follows these steps. See API Keys & Configuration for details.

  3. Run the app:

     npm run dev
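
A minimal .env.local for the Gemini Developer API path might look like this; the value is a placeholder, and the Vertex AI variables differ (see API Keys & Configuration):

```
# Gemini Developer API (AI Studio) key
GEMINI_API_KEY=your_api_key_here
```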
