This repository contains a Python application that uses AI models to perform image-related tasks. It integrates Imagen3 and Imagen2 on Vertex AI with Ollama and Gemma to provide image generation, captioning, visual question answering, editing, and image-based dialogue.
- Image Generation:
  - Create new images using the Imagen3 model on Vertex AI.
  - Generate images based on text prompts.
- Image Captioning:
  - Obtain descriptive text for images using visual captioning techniques.
- Visual Question Answering (VQA):
  - Answer questions about images using VQA models.
- Image Editing:
  - Modify images based on text prompts using Imagen2.
- Image-Based Dialogue:
  - Upload an image and hold a conversation about it using Ollama and Gemma.
- Dockerization:
  - Dockerfiles are provided for deploying the application and models to cloud platforms like Cloud Run.
- User Interface:
  - A user-friendly interface built with Gradio for easy interaction.
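
As a rough illustration of how a Gradio front end for one of these features might be wired up, the sketch below wraps a single image question-answering function in a `gr.Interface`. It is not the repository's actual UI code: `describe_image` and `build_demo` are hypothetical names, and `describe_image` is only a placeholder standing in for a real model call.

```python
from pathlib import Path


def describe_image(image_path: str, question: str) -> str:
    """Hypothetical placeholder for a captioning or VQA model call."""
    if not image_path:
        # Gradio passes None when no image has been uploaded.
        return "Please upload an image first."
    name = Path(image_path).name
    # A real implementation would send the image and question to a model here.
    return f"Received '{name}' with question: {question or '(none)'}"


def build_demo():
    """Wrap describe_image in a simple Gradio interface."""
    import gradio as gr  # imported lazily so the helper above has no dependencies

    return gr.Interface(
        fn=describe_image,
        inputs=[
            gr.Image(type="filepath", label="Image"),  # passes a file path string
            gr.Textbox(label="Question"),
        ],
        outputs=gr.Textbox(label="Answer"),
        title="Image Q&A (sketch)",
    )
```

With Gradio installed, calling `build_demo().launch()` would start a local web server hosting the form.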
- Python: The primary programming language for the application.
- Imagen3 and Imagen2: Image generation and editing models available on Vertex AI.
- Gradio: A Python library for building web interfaces to demo machine learning models.
- Ollama: A tool for running large language models.
- Gemma: A family of lightweight, efficient models suited to a variety of natural language processing tasks.
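
To make the Imagen piece more concrete, here is a minimal sketch of text-to-image generation through the Vertex AI Python SDK's `ImageGenerationModel`. It is an illustration under assumptions, not the repository's code: the function names, the model ID `imagen-3.0-generate-001`, and the default location are all choices made for this example. The model is passed in as a parameter so the generation logic can be exercised without cloud credentials.

```python
from typing import Any, List


def generate_images(model: Any, prompt: str, count: int = 1) -> List[Any]:
    """Ask an Imagen model for `count` images matching `prompt`.

    `model` is expected to behave like the SDK's ImageGenerationModel,
    i.e. expose generate_images(prompt=..., number_of_images=...).
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    response = model.generate_images(prompt=prompt, number_of_images=count)
    return list(response.images)


def load_imagen(project: str, location: str = "us-central1") -> Any:
    """Create the model client. The model ID below is an assumption."""
    import vertexai
    from vertexai.preview.vision_models import ImageGenerationModel

    vertexai.init(project=project, location=location)
    return ImageGenerationModel.from_pretrained("imagen-3.0-generate-001")
```

A call might look like `generate_images(load_imagen("my-project"), "a watercolor fox")`, where `my-project` is a placeholder project ID; each returned image can then be written to disk with its `save(location=...)` method.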
The Gradio interface provides a user-friendly way to interact with the application.
- Build the Docker Image:

      gcloud builds submit --region=us-central1 --tag gcr.io/devhack-3f0c2/ollama-server --machine-type e2-highcpu-32
      gcloud builds submit --region=us-central1 --tag gcr.io/devhack-3f0c2/generativeimages

- Deploy to Cloud Run:

      gcloud run deploy ollama-gemma --image=gcr.io/devhack-3f0c2/ollama-server \
        --concurrency 4 --cpu 2 --set-env-vars OLLAMA_NUM_PARALLEL=4 \
        --max-instances 7 --memory 8Gi --no-allow-unauthenticated \
        --no-cpu-throttling --timeout=600 --region=us-central1 --platform managed
      gcloud run deploy generativeimages --image=gcr.io/devhack-3f0c2/generativeimages --region=us-central1 --platform managed

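
Because the `ollama-gemma` service is deployed with `--no-allow-unauthenticated`, callers have to present an identity token with each request. The sketch below shows one way a client might build such a request against Ollama's `/api/generate` endpoint using only the standard library. The function names, the service URL, and the model name `gemma` are assumptions for illustration; how the application itself calls the service is not shown in this README.

```python
import json
import urllib.request


def build_generate_request(service_url: str, id_token: str, prompt: str,
                           model: str = "gemma") -> urllib.request.Request:
    """Build a POST request for Ollama's /api/generate endpoint."""
    # stream=False asks Ollama for a single JSON reply instead of chunks.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=service_url.rstrip("/") + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {id_token}",  # Cloud Run checks this token
            "Content-Type": "application/json",
        },
        method="POST",
    )


def ask(service_url: str, id_token: str, prompt: str) -> str:
    """Send the request and return the generated text."""
    request = build_generate_request(service_url, id_token, prompt)
    # The timeout mirrors the --timeout=600 used in the deploy command.
    with urllib.request.urlopen(request, timeout=600) as reply:
        return json.loads(reply.read())["response"]
```

For local testing, a token can be obtained with `gcloud auth print-identity-token` and passed as `id_token`, together with the service URL that the deploy command prints.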
Note: Ensure you have the required API keys and configuration for the cloud platform and AI services.
Made with ❤ by jggomez.
Copyright 2024 Juan Guillermo Gómez
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.