jggomez/generativeimagevertexai

Generative Image with Imagen3 on VertexAI | AI Image Generator

Overview

This repository contains a Python application that uses AI models for several image-related tasks. It combines Imagen3 and Imagen2 on Vertex AI with Gemma served through Ollama to provide image generation, captioning, visual question answering, editing, and image-based dialogue.

Key Features

  • Image Generation:
    • Create new images using the Imagen3 model on Vertex AI.
    • Generate images based on text prompts.
  • Image Captioning:
    • Obtain descriptive text for images using visual captioning techniques.
  • Visual Question Answering (VQA):
    • Answer questions about images using the power of VQA models.
  • Image Editing:
    • Modify images based on text prompts using Imagen2.
  • Image-Based Dialogue:
    • Upload an image and hold a conversation about it using Ollama and Gemma.
  • Dockerization:
    • Dockerfiles are provided for deploying the application and models to cloud platforms such as Cloud Run.
  • User Interface:
    • A user-friendly interface built with Gradio for easy interaction.
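
As an illustration of the image generation feature, the sketch below calls an Imagen 3 model through the Vertex AI Python SDK. It is not taken from this repository: the helper names `normalize_prompt` and `generate_image`, the model ID `imagen-3.0-generate-001`, and the `us-central1` default are assumptions, and the `google-cloud-aiplatform` package must be installed and authenticated for the call to succeed.

```python
def normalize_prompt(prompt: str) -> str:
    """Collapse runs of whitespace and reject prompts that are empty."""
    cleaned = " ".join(prompt.split())
    if not cleaned:
        raise ValueError("prompt must not be empty")
    return cleaned


def generate_image(prompt: str, project_id: str, output_path: str,
                   location: str = "us-central1",
                   model_name: str = "imagen-3.0-generate-001") -> str:
    """Generate one image from a text prompt and save it to output_path.

    project_id, location, and model_name are assumptions; substitute the
    values that apply to your own Google Cloud project.
    """
    # Imported lazily so this module still loads without the SDK installed.
    import vertexai
    from vertexai.preview.vision_models import ImageGenerationModel

    vertexai.init(project=project_id, location=location)
    model = ImageGenerationModel.from_pretrained(model_name)
    response = model.generate_images(
        prompt=normalize_prompt(prompt),
        number_of_images=1,
    )
    # Write the first (and only) generated image to disk.
    response.images[0].save(location=output_path)
    return output_path
```

A call such as `generate_image("a watercolor lighthouse", "my-project", "lighthouse.png")` would write the generated image to `lighthouse.png`, where `my-project` is a placeholder project ID.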

Technologies Used

  • Python: The primary programming language for the application.
  • Imagen3 and Imagen2: Image generation and editing models, accessed through Vertex AI.
  • Gradio: A Python library for building web interfaces to demo machine learning models.
  • Ollama: A tool for getting large language models up and running.
  • Gemma: Lightweight, efficient language models suited to a variety of natural language processing tasks.

Usage

The Gradio interface exposes each feature of the application. You can:

  • Generate images based on text prompts.

  • Upload images for captioning and question answering.

  • Edit images based on text prompts.

  • Engage in conversations with images.
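
To give a sense of how one of these screens could be wired up, here is a minimal Gradio sketch for the captioning and question answering tab. It is an assumption about the layout, not the repository's actual interface: `caption_placeholder` and `build_demo` are hypothetical names, and the handler returns fixed text where the real application would call a model.

```python
def caption_placeholder(image_path, question):
    """Stand-in handler; a real app would call a captioning or VQA model here."""
    if image_path is None:
        return "Please upload an image first."
    if question:
        return f"(answer to {question!r} would appear here)"
    return "(caption would appear here)"


def build_demo():
    """Build a single-tab Gradio UI around the placeholder handler."""
    # Imported lazily so the handler can be used without gradio installed.
    import gradio as gr

    with gr.Blocks(title="AI Image Generator") as demo:
        with gr.Tab("Caption and VQA"):
            image = gr.Image(type="filepath", label="Image")
            question = gr.Textbox(label="Question (optional)")
            result = gr.Textbox(label="Result")
            # Clicking the button sends both inputs to the handler.
            gr.Button("Submit").click(
                caption_placeholder,
                inputs=[image, question],
                outputs=result,
            )
    return demo
```

With `gradio` installed, `build_demo().launch()` starts a local web server for the interface. Further tabs for generation, editing, and dialogue could follow the same pattern.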

Cloud Run Deployment

  • Build the Docker Image:

    gcloud builds submit --region=us-central1 --tag gcr.io/devhack-3f0c2/ollama-server --machine-type e2-highcpu-32
    gcloud builds submit --region=us-central1 --tag gcr.io/devhack-3f0c2/generativeimages
  • Deploy to Cloud Run:

    gcloud run deploy ollama-gemma --image=gcr.io/devhack-3f0c2/ollama-server --concurrency 4 --cpu 2 --set-env-vars OLLAMA_NUM_PARALLEL=4 --max-instances 7 --memory 8Gi --no-allow-unauthenticated --no-cpu-throttling --timeout=600 --region=us-central1 --platform managed
    gcloud run deploy generativeimages --image=gcr.io/devhack-3f0c2/generativeimages --region=us-central1 --platform managed

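Note: Before building or deploying, make sure the gcloud CLI is authenticated and that the API keys and configuration required by the cloud platform and AI services are in place. The image tags above use the project ID devhack-3f0c2; replace it with your own project ID.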
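
Because the `ollama-gemma` service is deployed with `--no-allow-unauthenticated`, requests to it must carry an identity token. The sketch below shows one way a client could call Ollama's `/api/generate` endpoint on that service using only the Python standard library. The function names, the `gemma` model tag, and the service URL are assumptions for illustration; the repository's own client code may differ.

```python
import json
import urllib.request


def build_ollama_request(base_url: str, id_token: str, prompt: str,
                         model: str = "gemma") -> urllib.request.Request:
    """Build an authenticated POST request for Ollama's /api/generate endpoint."""
    # stream=False asks Ollama for a single JSON object instead of a stream.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Cloud Run checks this identity token before forwarding the call.
            "Authorization": f"Bearer {id_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def ask_ollama(base_url: str, id_token: str, prompt: str,
               model: str = "gemma", timeout: float = 600.0) -> str:
    """Send the prompt and return the generated text from the JSON reply."""
    request = build_ollama_request(base_url, id_token, prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

For a quick manual test, an identity token can be obtained with `gcloud auth print-identity-token` and passed as `id_token`, with `base_url` set to the URL that Cloud Run prints after deployment.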

Made with ❤ by jggomez.


License

Copyright 2024 Juan Guillermo Gómez

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
