Chrome extension with Llama 4 multi-modal AI via Groq's fast inference. Chat with AI and analyze webpages directly in your browser.
This Chrome extension demonstrates multi-modal AI chat capabilities using Groq API for ultra-fast inference with Llama 4 models, built as a complete template that you can fork, customize, and deploy as your own Chrome extension.
The extension opens as a chat interface in Chrome's side panel, but its true power lies in its vision capabilities - users can select any region of a webpage or capture full-page screenshots, then ask Llama 4 questions about the visual content, from analyzing UI designs to extracting data from charts and diagrams.
Key Features:
- Multi-Modal AI powered by Llama 4 Maverick and Scout models for text and image understanding
- Visual Analysis: Select regions or capture full-page screenshots for AI-powered image analysis
- Batch Processing: Capture and analyze multiple screenshots simultaneously for comprehensive workflows
- Smart Context: Automatic page metadata integration (title, URL, domain) for enhanced AI understanding
- Markdown Responses: Beautifully formatted AI responses with code blocks, tables, and rich text
- Power User Shortcuts: Comprehensive keyboard shortcuts for lightning-fast workflow
- Contextual Chat: Ask questions about visual content with full context awareness
- Sub-second response times with Groq Fast AI Inference acceleration
Want to test the extension immediately without setting up a development environment?
- Download the Latest Release
  - Go to the Releases section of this repository
  - Download the latest xipchat-extension.zip file
  - Extract the ZIP file to a folder on your computer
- Load the Unpacked Extension
  - Open chrome://extensions in your Chrome browser
  - Enable Developer Mode (toggle in the top-right corner)
  - Click Load unpacked and select the extracted folder
- Start Using xipchat
  - Click the xipchat extension icon in your Chrome toolbar
  - Enter your Groq API key in the settings
  - Begin chatting with Llama 4 and analyzing webpage content!
No coding required - just download, load, and start exploring the power of multi-modal AI in your browser.
- ⌨️ Keyboard Shortcuts: Lightning-fast workflow with comprehensive shortcuts
  - Ctrl+S - Take full page screenshot
  - Ctrl+Shift+S - Select region for screenshot
  - Ctrl+N - Start new chat
  - Ctrl+B - Toggle batch mode
  - / - Focus message input
  - Enter - Send message
  - Escape - Cancel current action
- 📸 Batch Screenshot Processing: Capture multiple screenshots and analyze them together
  - Queue multiple screenshots for comprehensive analysis
  - Visual preview of all captured images
  - Perfect for analyzing user flows, comparing designs, or processing workflows
- 🌐 Smart Page Context: Automatically includes page metadata with every screenshot
  - Page title, URL, domain, and description
  - Enhanced AI understanding of what it's analyzing
  - More relevant and accurate responses
- 📝 Markdown Formatted Responses: Professional-quality AI responses
  - Code blocks with syntax highlighting
  - Tables, lists, and structured content
  - Links, headings, and rich text formatting
  - Copy-friendly code snippets
- 🎯 Interactive Welcome Screen: Helpful shortcuts and features guide
- ⚡ Optimized Performance: Faster loading and reduced bundle size
- ♿ Accessibility: Improved keyboard navigation and screen reader support
- 🎨 Visual Feedback: Clear indicators for batch mode and processing states
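The keyboard shortcuts listed above can be pictured as a small lookup from key presses to actions. The following is a minimal sketch, not the extension's actual handler: the action names and the `KeyLike` interface are hypothetical, introduced only so the mapping can be read and tested outside a browser.

```typescript
// Action names are hypothetical labels for this sketch, not identifiers from the xipchat source.
type ShortcutAction =
  | "fullPageScreenshot"
  | "regionScreenshot"
  | "newChat"
  | "toggleBatchMode"
  | "focusInput"
  | "sendMessage"
  | "cancel";

// Structural subset of the DOM KeyboardEvent, so the resolver can run without a browser.
interface KeyLike {
  key: string;
  ctrlKey: boolean;
  shiftKey: boolean;
}

// Map a key press to one of the documented shortcuts, or null if it is not a shortcut.
function resolveShortcut(e: KeyLike): ShortcutAction | null {
  const key = e.key.toLowerCase();
  if (e.ctrlKey) {
    if (key === "s") return e.shiftKey ? "regionScreenshot" : "fullPageScreenshot";
    if (e.shiftKey) return null; // Ctrl+Shift+S is the only documented Ctrl+Shift combination
    if (key === "n") return "newChat";
    if (key === "b") return "toggleBatchMode";
    return null;
  }
  switch (key) {
    case "/":
      return "focusInput";
    case "enter":
      return "sendMessage";
    case "escape":
      return "cancel";
    default:
      return null;
  }
}
```

A real `keydown` listener would likely also call `preventDefault()` for combinations such as Ctrl+S that the browser handles itself, and skip the `/` shortcut while the message input already has focus.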
Tech Stack:
- Frontend: Svelte 5, TypeScript, TailwindCSS, DaisyUI
- Build System: Vite with Chrome Extension optimization
- Extension Framework: Chrome Manifest V3 with service workers
- AI Infrastructure: Groq API with Llama 4 Maverick and Scout models
Extension Components:
- Side Panel: Main chat interface accessible from Chrome toolbar
- Content Scripts: Web page interaction capabilities
- Background Service Worker: Extension lifecycle and API management
- Popup/Action: Quick access and settings
- Node.js (v16 or higher)
- npm or pnpm (v7 or higher), or Bun (the commands below use Bun)
- Google Chrome
- Groq API key (create a free GroqCloud account and generate an API key in the Groq Console)
- Clone the repository
  git clone https://github.com/xeven777/xipchat
  cd xipchat
- Install Dependencies
  bun install
- Build for Production
  bun run build
  The production-ready extension will be output to the dist/ directory.
- Load Extension in Chrome
  - Open chrome://extensions in your browser
  - Enable Developer Mode (toggle in the top-right corner)
  - Click Load unpacked and select the dist/ folder
- Configure API Key
  - Click the xipchat extension icon in Chrome toolbar
  - Enter your Groq API key in the settings
  - Start chatting with Llama 4 models!
- Quick Screenshots: Press Ctrl+S for instant full-page capture or Ctrl+Shift+S for region selection
- Fast Navigation: Use Ctrl+N for new chat, / to focus input, Enter to send
- Batch Mode: Toggle with Ctrl+B to capture multiple screenshots
- Enable Batch Mode: Click the "📸 Batch Mode" button or press Ctrl+B
- Capture Multiple Screenshots: Take screenshots normally - they'll be added to the batch queue
- Review Your Batch: See thumbnail previews of all captured images
- Analyze Together: Click "Analyze Batch" to send all images for comprehensive analysis
- Exit Batch Mode: Press Escape or click the batch toggle to return to single-image mode
- Automatic Context: Page title, URL, and metadata are automatically included with screenshots
- Enhanced Responses: AI receives rich context about what it's analyzing for better accuracy
- Markdown Formatting: AI responses include formatted code, tables, and structured content
- Use batch mode for analyzing user flows across multiple pages
- Combine region selection with batch processing for detailed UI analysis
- Leverage keyboard shortcuts to capture and send without reaching for the mouse
- The welcome screen shows all available shortcuts and features
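To make the batch and smart-context behavior described above more concrete, here is a minimal sketch of a capture queue that attaches page metadata to each screenshot. Every name in it (`PageContext`, `Capture`, `CaptureQueue`, `describeContexts`) is hypothetical and not taken from the xipchat source. It also assumes that, outside batch mode, a new capture simply replaces the previous one.

```typescript
// All names below are hypothetical; they illustrate the idea, not the extension's code.
interface PageContext {
  title: string;
  url: string;
  domain: string;
  description?: string;
}

interface Capture {
  imageDataUrl: string; // e.g. "data:image/png;base64,..."
  context: PageContext;
}

// Derive the domain from the URL so callers only pass what the page exposes.
function makePageContext(title: string, url: string, description?: string): PageContext {
  const ctx: PageContext = { title, url, domain: new URL(url).hostname };
  if (description) ctx.description = description;
  return ctx;
}

// Holds captures while batch mode is on; in single-image mode (an assumption of
// this sketch) each new capture replaces the previous one.
class CaptureQueue {
  private items: Capture[] = [];
  batchMode = false;

  add(capture: Capture): void {
    if (this.batchMode) this.items.push(capture);
    else this.items = [capture];
  }

  // Return everything queued and reset, ready for one combined analysis request.
  drain(): Capture[] {
    const out = this.items;
    this.items = [];
    return out;
  }

  get size(): number {
    return this.items.length;
  }
}

// Render the page contexts as a plain-text preamble for the prompt.
function describeContexts(captures: Capture[]): string {
  return captures
    .map((c, i) => {
      const lines = [
        `Screenshot ${i + 1}: ${c.context.title}`,
        `URL: ${c.context.url}`,
        `Domain: ${c.context.domain}`,
      ];
      if (c.context.description) lines.push(`Description: ${c.context.description}`);
      return lines.join("\n");
    })
    .join("\n\n");
}
```

With batch mode on, each `add` call grows the queue. "Analyze Batch" would then correspond to calling `drain()` and sending the images together with the text from `describeContexts`.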
.
├── public/ # Static assets (manifest.json, icons)
├── src/
│ ├── background/ # Background scripts for Chrome extension functionality
│ ├── content-script/ # Content scripts for injecting into web pages
│ ├── lib/ # Reusable components, services, stores and types
│ │ ├── components/ # UI components (MainContent, Settings, MarkdownRenderer)
│ │ ├── services/ # Service implementations (Groq API, Keyboard Shortcuts)
│ │ ├── stores/ # State management (theme, settings)
│ │ └── types/ # TypeScript interfaces and type definitions
│ ├── App.svelte # Main application component
│ ├── app.css # Global styles
│ └── main.ts # Application entry point
├── tailwind.config.js # TailwindCSS configuration
├── tsconfig.json # TypeScript configuration
├── vite.config.ts # Vite configuration
├── postcss.config.js # PostCSS plugins (for TailwindCSS)
└── package.json # Project dependencies and scripts
The manifest.json file is located in the public/ directory and defines the Chrome extension's permissions and entry points.
Key Settings:
- Permissions: Add only the permissions you need to maintain user privacy
- Background Service Worker: Configured using Vite for background tasks
- Content Scripts: Enable interaction with web pages
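One piece of wiring that typically lives in the background service worker is telling Chrome to open the side panel when the toolbar icon is clicked. The sketch below uses the `chrome.sidePanel.setPanelBehavior` call from the Chrome extensions API. Whether xipchat's own background script does exactly this is an assumption. The `SidePanelLike` interface is a local stand-in for `chrome.sidePanel`, so the snippet type-checks without the Chrome typings.

```typescript
// Structural subset of chrome.sidePanel, declared locally so the sketch
// compiles without the Chrome type definitions installed.
interface SidePanelLike {
  setPanelBehavior(behavior: { openPanelOnActionClick: boolean }): Promise<void>;
}

// Ask Chrome to open the side panel whenever the extension's toolbar icon is clicked.
function openSidePanelOnActionClick(sidePanel: SidePanelLike): Promise<void> {
  return sidePanel.setPanelBehavior({ openPanelOnActionClick: true });
}

// In a background service worker this would be called once at startup, e.g.:
//   openSidePanelOnActionClick(chrome.sidePanel).catch(console.error);
```

This relies on the "sidePanel" permission and a "side_panel" entry being present in the manifest.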
{
  "manifest_version": 3,
  "name": "xipchat - Llama4 + Groq",
  "version": "1.0.0",
  "description": "A side panel Chrome extension for chatting with Llama4 multi-modal, accelerated by Groq | Fast AI Inference",
  "permissions": ["sidePanel", "storage", "activeTab", "scripting", "tabs"],
  "host_permissions": ["<all_urls>"],
  "action": {
    "default_title": "Open XipChat",
    "default_icon": {
      "16": "icons/icon16.png",
      "32": "icons/icon32.png",
      "48": "icons/icon48.png",
      "64": "icons/icon64.png",
      "128": "icons/icon128.png"
    }
  },
  "side_panel": {
    "default_path": "index.html"
  },
  "icons": {
    "16": "icons/icon16.png",
    "48": "icons/icon48.png",
    "128": "icons/icon128.png"
  },
  "content_scripts": [
    {
      "matches": ["<all_urls>"],
      "js": ["content-script.js"],
      "run_at": "document_idle"
    }
  ],
  "background": {
    "service_worker": "background.js",
    "type": "module"
  }
}
- TailwindCSS: Highly customizable utility classes for rapid UI design
- DaisyUI: Prebuilt Tailwind components for a polished design
- bun run dev: Start the development server with HMR
- bun run build: Build the extension for production
This template is designed to be a foundation for you to get started with. Key areas for customization:
- Model Selection: Update Groq model configuration in the src/lib/services/ directory to use different Llama 4 variants or other Groq-supported models
- UI/Styling: Customize themes and components in src/lib/components/ and tailwind.config.js
- Extension Permissions: Modify public/manifest.json to add or remove Chrome extension permissions
- Chat Features: Extend chat functionality in src/lib/ components and services
- Content Script Integration: Customize web page interactions in src/content-script/
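As a rough illustration of what the model configuration in src/lib/services/ might wrap, the sketch below builds a text-plus-image chat request for Groq's OpenAI-compatible endpoint. The endpoint URL and the model ID are assumptions to verify against the current Groq documentation. The function and type names are hypothetical, not the extension's own.

```typescript
// Assumed values: confirm the endpoint and model IDs in the Groq documentation.
const GROQ_CHAT_URL = "https://api.groq.com/openai/v1/chat/completions";
const DEFAULT_MODEL = "meta-llama/llama-4-scout-17b-16e-instruct";

type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface ChatRequestBody {
  model: string;
  messages: Array<{ role: "user"; content: ContentPart[] }>;
}

// Build one user message holding the question followed by each screenshot.
function buildVisionRequest(
  question: string,
  imageDataUrls: string[],
  model: string = DEFAULT_MODEL,
): ChatRequestBody {
  const content: ContentPart[] = [
    { type: "text", text: question },
    ...imageDataUrls.map((url): ContentPart => ({ type: "image_url", image_url: { url } })),
  ];
  return { model, messages: [{ role: "user", content }] };
}

// Send the request with the user-supplied API key and return the reply text.
async function askGroq(apiKey: string, body: ChatRequestBody): Promise<string> {
  const res = await fetch(GROQ_CHAT_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json", Authorization: `Bearer ${apiKey}` },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`Groq request failed with status ${res.status}`);
  const data = (await res.json()) as { choices: Array<{ message: { content: string } }> };
  return data.choices[0].message.content;
}
```

Switching to a different Llama 4 variant then amounts to passing another model ID as the third argument to `buildVisionRequest`.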
- UI/UX Review & Competitor Analysis: Analyze mockups, competitor sites, accessibility issues, A/B test variations with Figma API integration, color palette extraction, brand guideline checks
- Data Analysis & Reporting: Extract insights from charts, dashboards, financial graphs, competitor research with automated report generation, analytics platform integration, price monitoring alerts
- Academic Support & Learning: Analyze research papers, diagrams, educational materials, language learning content with subject-specific prompts, plagiarism detection, multi-language support
- Business Operations & Process Optimization: Product analysis, inventory dashboards, documentation generation with CRM/analytics platform integration, automated workflow triggers, compliance checking
- Code Analysis & Documentation: Analyze code screenshots, API visualizations, system monitoring, technical writing with GitHub/GitLab integration, automated documentation, code quality analysis
- ✅ Batch Processing: Multiple screenshot analysis, automated workflows (IMPLEMENTED)
- ✅ Smart Context Integration: Automatic page metadata inclusion (IMPLEMENTED)
- ✅ Markdown Rendering: Rich text formatting for AI responses (IMPLEMENTED)
- ✅ Keyboard Shortcuts: Power user workflow optimization (IMPLEMENTED)
- Export Features: PDF reports, integration APIs (Slack, Notion, Jira)
- Custom Templates: Industry-specific prompts, analytics dashboard
- Collaboration: Share results, team management, usage tracking
- OCR Integration: Text extraction from images before AI analysis
- Create your free GroqCloud account: Access official API docs, the playground for experimentation, and more resources via Groq Console
- Build and customize: Fork this repo and start customizing to build out your own Chrome extension with AI capabilities
- Explore Chrome Extension APIs: Learn more about Chrome Extension development to add advanced features
- Get support: Connect with other developers building on Groq, chat with our team, and submit feature requests on our Groq Developer Forum
- Minimal Permissions: Only request permissions that are absolutely necessary
- Static Asset Validation: Ensure all static assets (icons, scripts) are valid and trusted
- Content Script Isolation: Use content scripts judiciously to avoid conflicts with the web page
- Dynamic API Key: By inputting the API key in the front-end, you do not have to deploy the app with a stored secret key
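The dynamic API key approach above implies the key is stored on the user's machine rather than in the build. A minimal sketch of that idea follows, assuming the key is kept in `chrome.storage.local` (the manifest requests the "storage" permission). The storage field name `groqApiKey` and the helper names are hypothetical. `StorageAreaLike` is a local stand-in for the promise-based `chrome.storage.local` methods.

```typescript
// Structural subset of chrome.storage.local (promise-based form), declared
// locally so the sketch compiles without the Chrome type definitions.
interface StorageAreaLike {
  get(key: string): Promise<Record<string, unknown>>;
  set(items: Record<string, unknown>): Promise<void>;
}

// Hypothetical field name; the real extension may store the key under another name.
const API_KEY_FIELD = "groqApiKey";

// Persist the key the user typed into the settings screen.
function saveApiKey(storage: StorageAreaLike, apiKey: string): Promise<void> {
  return storage.set({ [API_KEY_FIELD]: apiKey.trim() });
}

// Read the key back, returning null when nothing usable has been saved yet.
async function loadApiKey(storage: StorageAreaLike): Promise<string | null> {
  const items = await storage.get(API_KEY_FIELD);
  const value = items[API_KEY_FIELD];
  return typeof value === "string" && value.length > 0 ? value : null;
}
```

Inside the extension these would be called as `saveApiKey(chrome.storage.local, key)` and `loadApiKey(chrome.storage.local)`.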
- Llama 4 Documentation
- Groq Documentation
- Svelte Documentation
- Vite Documentation
- Chrome Extension Docs
- TailwindCSS Documentation
- DaisyUI Documentation
This project is licensed under the MIT License - see the LICENSE file for details.
Created by Anish Biswas