Wiro AI
Explore

Social Media & Viral

wiro/Euphoria Effects

Transform images into trending Euphoria effects with customizable styles and durations. Apply various creative filters to create engaging video content.
12
Image to Video

PixVerse/reference-to-video-v6

Generate short cinematic videos from up to 3 reference images plus a prompt. PixVerse V6 keeps subjects consistent and can add synced audio.
6
Video to Video

PixVerse/video-extend-v6

Extend an existing clip with PixVerse Video Extend V6. Upload a short MP4 or MOV, describe the continuation, and get a longer MP4 in 360p–1080p.
Point: 4.8 (5 users)
5
Text to Video

PixVerse/image-to-video-v6

PixVerse Image to Video V6 animates a still image into a cinematic MP4 clip, with optional first-to-last frame transitions, multi-shot scenes, and audio.
Point: 4.75 (4 users)
5
Social Media & Viral

wiro/World Cup 2026 Effects

Transform images into trending World Cup 2026 effects with customizable styles and durations. Apply various creative filters to create engaging video content.
Point: 4.67 (14 users)
13

Recently Added

Popular Models
Text to Image

ByteDance/seedream-v4-5-uncensored

Generate high-resolution images using Seedream v4.5 Uncensored. Supports text-to-image and image-to-image transformations with customizable settings.
Point: 4.81 (135 users)
123
Image to Video

klingai/kling-v2.6-motion-control

Generates videos from images and reference videos with motion control. Supports custom prompts and character orientation settings.
Point: 4.86 (84 users)
85
Text to Video

klingai/kling-v3

Generate high-quality videos from text prompts using Kling V3. Supports custom frames, duration, and aspect ratios.
Point: 4.87 (78 users)
76
Text to Video

ByteDance/seedance-pro-v1.5-uncensored

Seedance Pro v1.5 Uncensored by ByteDance generates short videos from text with optional native audio, strong prompt following, and accurate lip-sync for dialogue scenes.
Point: 4.78 (68 users)
70
Fast Inference

google/nano-banana-2

An image editing tool designed for quick transformations using reference images and prompts. Supports multi-image mixing and aspect ratio adjustments.
Point: 4.82 (67 users)
63
Fast Inference

google/nano-banana-pro

Google's Gemini 3 Pro Image Preview model, also known as Nano Banana, for text-to-image and image-to-image generation.
Point: 4.89 (37 users)
35
Video to Video

wiro/Video Converter

Convert a video to MP4, MOV, WebM, MKV, AVI, MPEG, or M4V with adjustable compression. Built by Wiro for clean exports and size control.
Point: 4 (36 users)
32
Image to Image

wiro/Image Converter

Convert an image to JPEG, PNG, WebP, TIFF, or AVIF. Set output quality from 0 to 100 to balance file size and visual fidelity.
Point: 4.67 (36 users)
29
Social Media & Viral

wiro/panini-card

Turn a selfie into a Panini-style player card video. Enter name, height, weight, and birth date, then choose a country card preset.
Point: 5 (35 users)
31
Image to Image

wiro/smart resize

Smart Resize by Wiro converts one image into multiple exact sizes, using AI recomposition to keep key subjects in frame for ads, social posts, and web.
Point: 4.27 (33 users)
29
Text to Image

openai/gpt-image-2-custom

Create or edit images with OpenAI GPT Image 2 using custom pixel sizes, quality tiers, and format controls. Add optional masks to target specific edits.
Point: 4.27 (33 users)
29
Google

google/lyria 3 pro

Google’s Lyria 3 Pro generates up to 3-minute songs from detailed prompts and an optional image. It returns 48 kHz stereo MP3 audio with structure.
Point: 4.25 (26 users)
20
Text to Image

openai/gpt-image-2

Generate or edit images with GPT Image 2 from OpenAI. It delivers strong instruction following, sharp text rendering, and flexible sizing up to 4K.
Point: 4.72 (25 users)
20
Text to Song

google/lyria 3

Google’s Lyria 3 generates a 30-second, 48 kHz stereo music clip from a detailed description, with an optional image to guide mood and style.
Point: 5 (23 users)
17
Text to Video

ByteDance/Seedance 2.0

Seedance 2.0 by ByteDance generates short MP4 videos from text, with optional image and audio references. It supports synced audio, stable motion, and cinematic camera direction.
Point: 4.86 (20 users)
20
Partner LLM

google/gemini-3.5-flash

Google’s Gemini 3.5 Flash is a fast multimodal reasoning model built for agents, coding, and long-context analysis. It takes text plus media and returns text.
Point: 4.5 (19 users)
11
Image to Video

alibaba/happyhorse 1.0 reference

Edit a short MP4/MOV clip or generate a new 720p/1080p video from up to 9 reference images and a detailed prompt, with optional watermark and seed control.
Point: 5 (19 users)
16
Text to Video

alibaba/happyhorse 1.0

Alibaba’s HappyHorse 1.0 generates 5–15s videos from text or a first-frame image. Choose 720p or 1080p, aspect ratio, seed, and watermark.
Point: 4.4 (19 users)
18
Fast Inference

pruna/p-video-replace

Replace a subject in a source video using 1–3 reference images. Preserves the original motion and can keep the source audio in the final MP4.
Point: 5 (16 users)
14
Social Media & Viral

wiro/Euphoria Effects

Transform images into trending Euphoria effects with customizable styles and durations. Apply various creative filters to create engaging video content.
Point: 5 (15 users)
12
Generate Videos
Social Media & Viral

wiro/Euphoria Effects

Transform images into trending Euphoria effects with customizable styles and durations. Apply various creative filters to create engaging video content.
Point: 5 (15 users)
12
Image to Video

PixVerse/reference-to-video-v6

Generate short cinematic videos from up to 3 reference images plus a prompt. PixVerse V6 keeps subjects consistent and can add synced audio.
Point: 4.83 (6 users)
6
Video to Video

PixVerse/video-extend-v6

Extend an existing clip with PixVerse Video Extend V6. Upload a short MP4 or MOV, describe the continuation, and get a longer MP4 in 360p–1080p.
Point: 4.8 (5 users)
5
Text to Video

PixVerse/image-to-video-v6

PixVerse Image to Video V6 animates a still image into a cinematic MP4 clip, with optional first-to-last frame transitions, multi-shot scenes, and audio.
Point: 4.75 (4 users)
5
Social Media & Viral

wiro/World Cup 2026 Effects

Transform images into trending World Cup 2026 effects with customizable styles and durations. Apply various creative filters to create engaging video content.
Point: 4.67 (14 users)
13
Social Media & Viral

wiro/World Cup 2026 Effects with Caption

Transform images into trending World Cup 2026 effects with customizable styles and durations. Apply various creative filters to create engaging video content.
Point: 2.5 (11 users)
9
Social Media & Viral

wiro/Scream Effects

Transform images into trending Scream effects with customizable styles and durations. Apply various creative filters to create engaging video content.
Point: 5 (12 users)
11
Video to Video

wiro/Video Converter

Convert a video to MP4, MOV, WebM, MKV, AVI, MPEG, or M4V with adjustable compression. Built by Wiro for clean exports and size control.
Point: 4 (36 users)
32
Social Media & Viral

wiro/panini-card

Turn a selfie into a Panini-style player card video. Enter name, height, weight, and birth date, then choose a country card preset.
Point: 5 (35 users)
31
Fast Inference

pruna/p-video-replace

Replace a subject in a source video using 1–3 reference images. Preserves the original motion and can keep the source audio in the final MP4.
Point: 5 (16 users)
14
Fast Inference

nvidia/Cosmos3-Super

Cosmos 3 Super turns a reference image and motion prompt into a physics-grounded MP4 clip, with controls for aspect ratio, FPS, audio, and safety.
Point: 4.5 (15 users)
13
Social Media & Viral

wiro/Sport Trend Effects

Transform images into trending Sport effects with customizable styles and durations. Apply various creative filters to create engaging video content.
Point: 2.5 (11 users)
9
Text to Video

bytedance-research/Lance-Text-to-Video

Turn a detailed scene description into a short video clip. Lance Text to Video can also animate a still image or rewrite an existing video.
Point: 0 (0 users)
0
Image to Video

xai/grok-imagine-video-1.5

Create short videos from a still image plus a motion prompt. Grok Imagine Video 1.5 by xAI outputs 480p or 720p MP4 clips with synced audio.
Point: 5 (10 users)
9
Social Media & Viral

wiro/Insta Hot Girl Effects

Transform images into Insta Hot Girl effects with customizable styles and durations. Apply various creative filters to create engaging video content.
Point: 5 (12 users)
13
Image to Video

SulphurAI/Sulphur-2-base

Sulphur 2 Base by SulphurAI is an LTX 2.3 fine-tune for text-to-video and image-to-video. Create short MP4 clips with first and last frame control.
Point: 5 (3 users)
5
Social Media & Viral

wiro/Queer Editorial Effects

Transform images into Queer Editorial effects with customizable styles and durations. Apply various creative filters to create engaging video content.
Point: 4.29 (11 users)
12
Social Media & Viral

wiro/Tiktok Trend Effects

Transform images into trending TikTok effects with customizable styles and durations. Apply various creative filters to create engaging video content.
Point: 3 (7 users)
7
Fast Inference

pruna/wan-i2v

Pruna's WAN I2V animates one still image into a short MP4 video with prompt-driven motion. Pick 480p or 720p, plus optional last-frame control.
Point: 4.5 (15 users)
13
Fast Inference

pruna/wan-t2v

Pruna’s WAN T2V generates a short MP4 video from a detailed scene prompt. Pick 480p or 720p, choose 16:9 or 9:16, and control motion and safety options.
Point: 5 (12 users)
12
Generate Images
Fast Inference

microsoft/lens

A foundational text-to-image model designed for efficient, high-resolution image generation with strong prompt following and multilingual support.
Point: 5 (2 users)
2
Text to Image

openai/gpt-image-2-custom

Create or edit images with OpenAI GPT Image 2 using custom pixel sizes, quality tiers, and format controls. Add optional masks to target specific edits.
Point: 4.27 (33 users)
29
Fast Inference

sensenova/U1-8B-Text-to-Image

SenseNova U1-8B turns text into high-detail images with strong layout control and clearer in-image text. Use it for posters, infographics, and concept art.
Point: 5 (1 user)
1
Text to Image

openai/gpt-image-2

Generate or edit images with GPT Image 2 from OpenAI. It delivers strong instruction following, sharp text rendering, and flexible sizing up to 4K.
Point: 4.72 (25 users)
20
Text to Image

xai/grok-imagine-image

xAI’s Grok Imagine Image generates images from prompts and can edit a source image using instructions. Pick an aspect ratio, 1K or 2K, and 1–10 samples.
Point: 3.33 (10 users)
7
Text to Image

openai/gpt-image-1-5

Generate or edit images using text prompts or image edits with GPT Image 1.5. Supports multiple sizes, formats, and quality settings.
Point: 5 (7 users)
7
Text to Image

ByteDance/seedream-v4-5-uncensored

Generate high-resolution images using Seedream v4.5 Uncensored. Supports text-to-image and image-to-image transformations with customizable settings.
Point: 4.81 (135 users)
123
Fast Inference

FireRedTeam/FireRed-Image-Edit-1.1

FireRed-Image-Edit-1.1 significantly enhances identity consistency, multi-image conditioning, and domain-specialized editing capabilities, bringing it closer to meeting real-world creative production demands.
Point: 5 (1 user)
1
Text to Image

ByteDance/seedream-v5-lite-uncensored

Generate high-quality images from text prompts or image inputs using the Seedream v5 Lite Uncensored model. Supports multiple resolutions and aspect ratios.
Point: 3.73 (13 users)
12
Fast Inference

google/nano-banana-2

An image editing tool designed for quick transformations using reference images and prompts. Supports multi-image mixing and aspect ratio adjustments.
Point: 4.82 (67 users)
63
Text to Image

ByteDance/seedream-v5-lite

Generate high-quality images using Seedream V5 Lite, supporting both image-to-image and text-to-image transformations with customizable settings.
Point: 5 (2 users)
2
Fast Inference

FireRedTeam/FireRed-Image-Edit

FireRed-Image-Edit is a general-purpose image editing model that delivers high-fidelity and consistent editing across a wide range of scenarios.
Point: 5 (2 users)
2
Fast Inference

zai-org/GLM-IMAGE

GLM-Image is an image generation model that adopts a hybrid autoregressive + diffusion decoder architecture. In general image generation quality, GLM‑Image aligns with mainstream latent diffusion approaches, but it shows significant advantages in text-rendering and knowledge‑intensive generation scenarios. It performs especially well in tasks requiring precise semantic understanding and complex information expression, while maintaining strong capabilities in high‑fidelity and fine‑grained detail generation. In addition to text‑to‑image generation, GLM‑Image also supports a rich set of image‑to‑image tasks including image editing, style transfer, identity‑preserving generation, and multi‑subject consistency.
Point: 5 (1 user)
1
Fast Inference

black-forest-labs/FLUX.2-klein-base-9B

FLUX.2 [klein] 9B Base is a 9-billion-parameter rectified flow transformer that generates images from text descriptions and supports multi-reference editing.
Point: 0 (0 users)
0
Fast Inference

black-forest-labs/FLUX.2-klein-base-4B

FLUX.2 [klein] 4B Base is a 4-billion-parameter rectified flow transformer that generates images from text descriptions and supports multi-reference editing.
Point: 0 (0 users)
0
Fast Inference

black-forest-labs/FLUX.2-klein-4B

FLUX.2 [klein] 4B is a 4-billion-parameter rectified flow transformer that generates images from text descriptions and supports multi-reference editing.
Point: 0 (0 users)
0
Fast Inference

black-forest-labs/FLUX.2-klein-9B

FLUX.2 [klein] 9B is a 9-billion-parameter rectified flow transformer that generates images from text descriptions and supports multi-reference editing.
Point: 0 (0 users)
0
Fast Inference

wiro/FLUX.2-dev-turbo

FLUX.2 [dev] is a 32 billion parameter rectified flow transformer capable of generating, editing and combining images based on text instructions.
Point: 0 (0 users)
1
Text to Image

meituan-longcat/LongCat-Image-Edit

LongCat-Image-Edit is the image editing version of LongCat-Image.
Point: 5 (1 user)
0
Text to Image

meituan-longcat/LongCat-Image

LongCat-Image is a 6B-parameter model built for high-quality image generation, delivering strong multilingual text rendering, realistic visuals, and efficient deployment.
Point: 0 (0 users)
0
Edit Images
Image to Image

wiro/Image Converter

Convert an image to JPEG, PNG, WebP, TIFF, or AVIF. Set output quality from 0 to 100 to balance file size and visual fidelity.
Point: 4.67 (36 users)
29
Image to Image

wiro/smart resize

Smart Resize by Wiro converts one image into multiple exact sizes, using AI recomposition to keep key subjects in frame for ads, social posts, and web.
Point: 4.27 (33 users)
29
Fast Inference

sensenova/U1-8B-Interleave

U1-8B-Interleave by SenseNova generates step-by-step text with matching images, optionally conditioned on up to 5 reference images. It suits tutorials, diaries, and infographics.
Point: 5 (3 users)
1
Text to Image

openai/gpt-image-2-custom

Create or edit images with OpenAI GPT Image 2 using custom pixel sizes, quality tiers, and format controls. Add optional masks to target specific edits.
Point: 4.27 (33 users)
29
Text to Image

openai/gpt-image-2

Generate or edit images with GPT Image 2 from OpenAI. It delivers strong instruction following, sharp text rendering, and flexible sizing up to 4K.
Point: 4.72 (25 users)
20
Text to Image

xai/grok-imagine-image

xAI’s Grok Imagine Image generates images from prompts and can edit a source image using instructions. Pick an aspect ratio, 1K or 2K, and 1–10 samples.
Point: 3.33 (10 users)
7
Text to Image

openai/gpt-image-1-5

Generate or edit images using text prompts or image edits with GPT Image 1.5. Supports multiple sizes, formats, and quality settings.
Point: 5 (7 users)
7
Social Media & Viral

wiro/Instagram Pose Multi

Generate stylish Instagram-style pose images with trendy angles, natural expressions, and a modern aesthetic. Built by Wiro for social content.
Point: 4.8 (11 users)
8
Text to Image

ByteDance/seedream-v4-5-uncensored

Generate high-resolution images using Seedream v4.5 Uncensored. Supports text-to-image and image-to-image transformations with customizable settings.
Point: 4.81 (135 users)
123
Fast Inference

FireRedTeam/FireRed-Image-Edit-1.1

FireRed-Image-Edit-1.1 significantly enhances identity consistency, multi-image conditioning, and domain-specialized editing capabilities, bringing it closer to meeting real-world creative production demands.
Point: 5 (1 user)
1
Text to Image

ByteDance/seedream-v5-lite-uncensored

Generate high-quality images from text prompts or image inputs using the Seedream v5 Lite Uncensored model. Supports multiple resolutions and aspect ratios.
Point: 3.73 (13 users)
12
Fast Inference

google/nano-banana-2

An image editing tool designed for quick transformations using reference images and prompts. Supports multi-image mixing and aspect ratio adjustments.
Point: 4.82 (67 users)
63
Text to Image

ByteDance/seedream-v5-lite

Generate high-quality images using Seedream V5 Lite, supporting both image-to-image and text-to-image transformations with customizable settings.
Point: 5 (2 users)
2
Fast Inference

FireRedTeam/FireRed-Image-Edit

FireRed-Image-Edit is a general-purpose image editing model that delivers high-fidelity and consistent editing across a wide range of scenarios.
Point: 5 (2 users)
2
Social Media & Viral

wiro/Shopify Template

Generate customizable Shopify templates from product photos with various aspect ratios and design options.
Point: 2 (4 users)
1
Fast Inference

zai-org/GLM-IMAGE

GLM-Image is an image generation model that adopts a hybrid autoregressive + diffusion decoder architecture. In general image generation quality, GLM‑Image aligns with mainstream latent diffusion approaches, but it shows significant advantages in text-rendering and knowledge‑intensive generation scenarios. It performs especially well in tasks requiring precise semantic understanding and complex information expression, while maintaining strong capabilities in high‑fidelity and fine‑grained detail generation. In addition to text‑to‑image generation, GLM‑Image also supports a rich set of image‑to‑image tasks including image editing, style transfer, identity‑preserving generation, and multi‑subject consistency.
Point: 5 (1 user)
1
Fast Inference

black-forest-labs/FLUX.2-klein-base-9B

FLUX.2 [klein] 9B Base is a 9-billion-parameter rectified flow transformer that generates images from text descriptions and supports multi-reference editing.
Point: 0 (0 users)
0
Fast Inference

black-forest-labs/FLUX.2-klein-base-4B

FLUX.2 [klein] 4B Base is a 4-billion-parameter rectified flow transformer that generates images from text descriptions and supports multi-reference editing.
Point: 0 (0 users)
0
Fast Inference

black-forest-labs/FLUX.2-klein-4B

FLUX.2 [klein] 4B is a 4-billion-parameter rectified flow transformer that generates images from text descriptions and supports multi-reference editing.
Point: 0 (0 users)
0
Fast Inference

black-forest-labs/FLUX.2-klein-9B

FLUX.2 [klein] 9B is a 9-billion-parameter rectified flow transformer that generates images from text descriptions and supports multi-reference editing.
Point: 0 (0 users)
0
Generate Text
Fast Inference

sensenova/U1-8B-Interleave

U1-8B-Interleave by SenseNova generates step-by-step text with matching images, optionally conditioned on up to 5 reference images. It suits tutorials, diaries, and infographics.
Point: 5 (3 users)
1
Fast Inference

sensenova/U1-8B-Visual-Understanding

SenseNova U1 8B Visual Understanding creates infographic-style images and prompt-based edits from text and an optional reference image. Built by SenseNova on NEO-Unify.
Point: 5 (2 users)
1
Fast Inference

nvidia/parakeet-tdt-0.6b-v3

Multilingual speech-to-text for 25 European languages with auto language detection, punctuation, capitalization, and optional timestamps.
Point: 5 (5 users)
8
Fast Inference

CohereLabs/cohere-transcribe-03-2026

CohereLabs cohere-transcribe-03-2026 is a 2B Conformer speech-to-text model for 14 languages. It creates accurate transcripts for meetings, calls, and audio archives.
Point: 5 (4 users)
7
Fast Inference

Qwen/Qwen3-ASR-1.7B

A lightweight speech-to-text model optimized for fast inference. Converts audio input into text with support for multiple languages.
Point: 0 (0 users)
0
Fast Inference

mistralai/Voxtral-Mini-4B-Realtime-2602

Voxtral Mini 4B Realtime 2602 is a multilingual, realtime speech-transcription model and among the first open-source solutions to achieve accuracy comparable to offline systems with a delay of <500ms. It supports 13 languages and outperforms existing open-source baselines across a range of tasks, making it ideal for applications like voice assistants and live subtitling.
Point: 0 (0 users)
0
Speech to Text

nvidia/nemotron

Nemotron-Speech-Streaming-En-0.6b is the first unified model in the Nemotron Speech family, engineered to deliver high-quality English transcription across both low-latency streaming and high-throughput batch workloads. The model natively supports punctuation and capitalization and offers runtime flexibility with configurable chunk sizes, including 80ms, 160ms, 560ms, and 1120ms.
Point: 5 (1 user)
0
Speech to Text

elevenlabs/speech-to-text

Speech to text model from ElevenLabs
Point: 0 (0 users)
0
Image to Text

moondream3-preview/detect

Moondream3 is a cutting-edge vision-language model that delivers advanced visual reasoning with built-in object detection, pointing, and OCR capabilities—bringing fast, cost-effective, and scalable inference to real-world applications.
Point: 0 (0 users)
0
Image to Text

moondream3-preview/point

Moondream3 is a cutting-edge vision-language model that delivers advanced visual reasoning with built-in object detection, pointing, and OCR capabilities—bringing fast, cost-effective, and scalable inference to real-world applications.
Point: 0 (0 users)
0
Image to Text

moondream3-preview/caption

Moondream3 is a cutting-edge vision-language model that delivers advanced visual reasoning with built-in object detection, pointing, and OCR capabilities—bringing fast, cost-effective, and scalable inference to real-world applications.
Point: 0 (0 users)
0
Image to Text

moondream3-preview/query

Moondream3 is a cutting-edge vision-language model that delivers advanced visual reasoning with built-in object detection, pointing, and OCR capabilities—bringing fast, cost-effective, and scalable inference to real-world applications.
Point: 0 (0 users)
0
Speech to Text

openai/whisper-large-v3-turbo-turkish

Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. Trained on 680k hours of labelled data, Whisper models demonstrate a strong ability to generalise to many datasets and domains without the need for fine-tuning.
Point: 5 (1 user)
0
Video to Text

wiro/video-nsfw-detection

NSFW video detection automatically analyzes video content to identify inappropriate or explicit material, ensuring compliance with content policies and a safe viewing environment.
Point: 5 (1 user)
0
Video to Text

DAMO-NLP-SG/VideoLLaMA3-2B

VideoLLaMA3-2B is a model designed for video understanding.
Point: 0 (0 users)
0
Image to Text

DAMO-NLP-SG/VideoLLaMA3-2B-Image

VideoLLaMA3-2B-Image is a model designed for image understanding.
Point: 0 (0 users)
0
Image to Text

wiro/VideoLLaMA3-7B-Image

VideoLLaMA3-7B-Image is a model designed for image understanding.
Point: 0 (0 users)
0
Video to Text

wiro/VideoLLaMA3-7B

VideoLLaMA3-7B is a model designed for video understanding.
Point: 0 (0 users)
0
Image to Text

Salesforce/blip2-flan-t5-xl

BLIP-2 creates captions or detailed descriptions for images. This is the BLIP-2 model, leveraging Flan T5-xl.
Point: 0 (0 users)
0
Image to Text

Salesforce/blip-image-captioning-large

BLIP performs various multimodal tasks, including visual question answering and image captioning. This is the BLIP image captioning large model.
Point: 0 (0 users)
0
Generate 3D
3D Generation

TencentARC/Pixal3D

Pixal3D converts a single image into a textured 3D GLB mesh. It uses pixel-to-3D back-projection so geometry stays aligned to the input view.
Point: 4.2 (6 users)
4
3D Generation

tencent/HY-World-2.0-World-Reconstruction

Reconstruct a 3D scene from multi-view photos or a short video. Produces 3D Gaussian splats with depth maps and cameras, plus a colored point cloud.
Point: 5 (1 user)
1
3D Generation

microsoft/Trellis-2

Convert images into detailed 3D meshes using Microsoft's Trellis-2 model. Supports various resolutions and customization options.
Point: 5 (4 users)
4
3D Generation

tencent/Hunyuan3D-2.1

Generate 3D models from images using the Hunyuan3D-2.1 AI tool. Transform 2D inputs into detailed 3D assets for design and development.
Point: 5 (2 users)
1
Generate Audio
Fast Inference

OpenMOSS/MOSS-TTS-v1.5

OpenMOSS MOSS-TTS v1.5 turns text into natural speech, with optional zero-shot voice cloning from a reference clip. It supports 31 languages plus pause and pronunciation control.
Point: 2 (1 user)
0
Fast Inference

openbmb/VoxCPM2

A real-time, multilingual text-to-speech system offering expressive voice design and high-fidelity voice cloning through low-latency streaming inference.
Point: 1 (1 user)
2
Fast Inference

k2-fsa/OmniVoice

OmniVoice by k2-fsa generates 24 kHz speech from text in 600+ languages. Clone a speaker from a short reference clip or design a new voice from attributes.
Point: 0 (0 users)
0
Fast Inference

humeai/tada-3b-ml

TADA is a unified speech-language model that synchronizes speech and text into a single, cohesive stream via 1:1 alignment. By leveraging a novel tokenizer and architectural design, TADA achieves high-fidelity synthesis and generation with a fraction of the computational overhead required by traditional models.
Point: 0 (0 users)
0
Fast Inference

fishaudio/s2-pro

Generates high-quality speech from text using advanced TTS technology with support for voice cloning and multi-speaker synthesis.
Point: 0 (0 users)
0
Fast Inference

nineninesix/kani-tts-2-en

Generates natural-sounding speech from text with support for multi-speaker voice cloning and fast inference capabilities.
Point: 0 (0 users)
0
Text to Speech

resemble-ai/chatterbox-turbo

The fastest open source TTS model without sacrificing quality.
Point: 5 (1 user)
1
Text to Speech

resemble-ai/chatterbox-multilingual

Generate expressive, natural speech in 23 languages. Features instant voice cloning from short audio, emotion control, and seamless cross-language voice transfer.
Point: 0 (0 users)
0
Fast Inference

OpenMOSS/MOSS-TTSD

MOSS-TTSD is a production long-form dialogue model for expressive multi-speaker conversational audio at scale. It supports long-duration continuity, turn-taking control, and zero-shot voice cloning from short references for podcasts, audiobooks, commentary, dubbing, and entertainment dialogue.
Point: 0 (0 users)
0
Fast Inference

OpenMOSS/MOSS-TTS-Realtime

Real-time streaming text-to-speech with zero-shot voice cloning. Supports 20 languages including English, Chinese, Japanese, Korean, and more. Audio starts playing immediately — no waiting for full generation. Clone any voice from a short reference clip.
Point: 0 (0 users)
0
Fast Inference

elevenlabs/Realtime Conversational AI

A real-time voice conversation tool using ElevenLabs' AI voice agents. Customize voices, behaviors, and languages for interactive AI experiences.
Point: 5 (1 user)
0
Fast Inference

openai/gpt-realtime-mini

GPT Mini Realtime enables low-latency, bidirectional streaming for voice and text. Build interactive, responsive AI experiences that feel natural and immediate.
Point: 5 (1 user)
0
Fast Inference

openai/gpt-realtime

GPT Realtime enables low-latency, bidirectional streaming for voice and text. Build interactive, responsive AI experiences that feel natural and immediate.
Point: 5 (1 user)
0
Fast Inference

Qwen/Qwen3-TTS-12Hz-1.7B

A fast inference text-to-speech model optimized for real-time audio generation with multi-language support.
Point: 0 (0 users)
0
Speech to Speech

nvidia/PersonaPlex-Realtime

Convert speech to speech with customizable voices using PersonaPlex. Supports various audio formats and offers control over text temperature and audio top K settings.
Point: 4 (3 users)
1
Fast Inference

microsoft/VibeVoice-Realtime

VibeVoice-Realtime is a lightweight real‑time text-to-speech model supporting streaming text input and robust long-form speech generation.
Point: 5 (1 user)
1
Text to Speech

openbmb/VoxCPM

Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
Point: 0 (0 users)
0
Text to Speech

google/gemini-2.5-tts

Google's Gemini 2.5 Flash Text To Speech Preview model
Point: 5 (2 users)
5
Text to Speech

elevenlabs/text-to-speech

Text to speech model from ElevenLabs
Point: 5 (2 users)
0
Social Media & Viral

wiro/Faceless-Video-Generator

Create professional short videos (30s) for YouTube Shorts, Instagram Reels, TikTok, and X (Twitter) – from a single prompt. Automatically generate speech, captions, and optional talking head avatars using AI. Perfect for content creators, marketers, and educators looking to grow faster with less effort.
Point: 5 (1 user)
1
Generate Music
Fast Inference

stabilityai/stable-audio-3-small-sfx

Stable Audio 3 Small SFX by Stability AI generates stereo sound effects from text prompts, up to 120 seconds, for games, video, and product UX.
Point: 0 (0 users)
1
Fast Inference

stabilityai/stable-audio-3-small-music

Create up to 2 minutes of stereo music from a text description with Stable Audio 3 Small Music by Stability AI. Control duration, steps, and guidance.
Point: 0 (0 users)
0
Fast Inference

stabilityai/stable-audio-3-medium

Stable Audio 3 Medium by Stability AI generates stereo music from text prompts for up to 380 seconds. Control duration, steps, guidance, and output format. ([fal.ai](https://fal.ai/models/fal-ai/stable-audio-3/medium/text-to-audio/api))
Point: 5 (1 users)
0
Text to Song

google/lyria 3

Google’s Lyria 3 generates a 30-second, 48 kHz stereo music clip from a detailed description, with an optional image to guide mood and style.
Point: 5 (23 users)
17
Music Generation

tencent-ailab/SongGeneration 2

Song Generation 2 generates full songs with vocals and instrumentals from lyrics — supports 14 genres, reference audio cloning, and separate track output (vocal/bgm/mix).
Point: 5 (3 users)
3
Video to Video

wiro/video-background-music-v2

It turns any video into a cinematic experience by generating AI-powered instrumental soundtracks that match its mood.
Point: 4 (1 user)
3
Music Generation

ACE-Step/text-to-song-ACE-Step1.5

ACE-Step v1.5 is a highly efficient open-source music foundation model designed to bring commercial-grade music generation to consumer hardware.
Point: 5 (4 users)
2
Social Media & Viral

wiro/Song Frame

SongFrame places you into a cinematic world, pulls the soundtrack directly from your YouTube link, and fuses everything into a polished video — effortless, emotional, and instantly shareable.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Faceless-Video-Generator

Create professional short videos (30s) for YouTube Shorts, Instagram Reels, TikTok, and X (Twitter) – from a single prompt. Automatically generate speech, captions, and optional talking head avatars using AI. Perfect for content creators, marketers, and educators looking to grow faster with less effort.
Point: 5 (1 user)
1
Video to Video

wiro/video-background-music-gen

Takes your video, creates original music to match its vibe, and adds it back as the soundtrack.
Point: 0 (0 users)
0
Music Generation

ACE-Step/image-to-song-ACE-Step-v1-3.5B

Generate high-quality songs from an image in seconds. Whether you're crafting instrumental tracks or full vocal compositions, bring your musical ideas to life with the power of AI. Ideal for artists, producers, and creative minds looking to turn inspiration into sound.
Point: 0 (0 users)
0
Music Generation

ACE-Step/text-to-song-ACE-Step-v1-3.5B

Generate high-quality songs from text prompts in seconds. Whether you're crafting instrumental tracks or full vocal compositions, bring your musical ideas to life with the power of AI. Ideal for artists, producers, and creative minds looking to turn inspiration into sound.
Point: 5 (1 user)
0
Music Generation

wiro/image-to-song-with-reference-YuE

Turn any song into your own with AI. Simply upload a reference track and provide your image — AI will recreate the song in your style, preserving the vocal style, instrumental feel, or both. Perfect for covers, parodies, remixes, or personalized creations.
Point: 0 (0 users)
0
Music Generation

wiro/image-to-song-YuE

Turn your image into a full song with vocals and instrumental music in seconds. Just upload your image and let AI compose and sing it for you. Perfect for creators, musicians, and storytellers.
Point: 0 (0 users)
0
Music Generation

wiro/text-to-song-with-reference-YuE

Turn any song into your own with AI. Simply upload a reference track and provide your custom lyrics — AI will recreate the song in your style, preserving the vocal style, instrumental feel, or both. Perfect for covers, parodies, remixes, or personalized creations.
Point: 0 (0 users)
0
Music Generation

wiro/text-to-song-YuE

Turn your lyrics into a full song with vocals and instrumental music in seconds. Just enter your lyrics and let AI compose and sing it for you. Perfect for creators, musicians, and storytellers.
Point: 0 (0 users)
0
Music Generation

wiro/music_gen

MusicGen is a text-to-music model capable of generating high-quality music samples.
Point: 0 (0 users)
0
Realtime Stream
Fast Inference

openbmb/VoxCPM2

A real-time, multilingual text-to-speech system offering expressive voice design and high-fidelity voice cloning through low-latency streaming inference.
Point: 1 (1 user)
2
Fast Inference

mistralai/Voxtral-Mini-4B-Realtime-2602

Voxtral Mini 4B Realtime 2602 is a multilingual, realtime speech-transcription model and among the first open-source solutions to achieve accuracy comparable to offline systems with a delay of <500ms. It supports 13 languages and outperforms existing open-source baselines across a range of tasks, making it ideal for applications like voice assistants and live subtitling.
Point: 0 (0 users)
0
Fast Inference

OpenMOSS/MOSS-TTS-Realtime

Real-time streaming text-to-speech with zero-shot voice cloning. Supports 20 languages including English, Chinese, Japanese, Korean, and more. Audio starts playing immediately — no waiting for full generation. Clone any voice from a short reference clip.
Point: 0 (0 users)
0
Fast Inference

elevenlabs/Realtime Conversational AI

A real-time voice conversation tool using ElevenLabs' AI voice agents. Customize voices, behaviors, and languages for interactive AI experiences.
Point: 5 (1 user)
0
Fast Inference

openai/gpt-realtime-mini

GPT Mini Realtime enables low-latency, bidirectional streaming for voice and text. Build interactive, responsive AI experiences that feel natural and immediate.
Point: 5 (1 user)
0
Fast Inference

openai/gpt-realtime

GPT Realtime enables low-latency, bidirectional streaming for voice and text. Build interactive, responsive AI experiences that feel natural and immediate.
Point: 5 (1 user)
0
Speech to Speech

nvidia/PersonaPlex-Realtime

Convert speech to speech with customizable voices using PersonaPlex. Supports various audio formats and offers control over text temperature and audio top K settings.
Point: 4 (3 users)
1
LLM & Chat
Partner LLM

google/gemini-3.5-flash

Google’s Gemini 3.5 Flash is a fast multimodal reasoning model built for agents, coding, and long-context analysis. It takes text plus media and returns text.
Point: 4.5 (19 users)
11
Chat

Qwen/Qwen3.6-27B

Qwen3.6-27B is a 27B dense vision-language model from Qwen for agentic coding and reasoning. It supports 262K-token context and optional thinking traces in outputs.
Point: 5 (5 users)
6
LLM

xai/grok-4-1-fast

Grok 4.1 Fast by xAI is a long-context chat model built for tool calling and agent workflows. It can analyze images and return grounded text answers with sources.
Point: 5 (5 users)
7
LLM

xai/grok-4-20

Grok 4.20 is xAI’s flagship text model with a 2M-token context window, tool calling, and structured JSON outputs. Attach images for vision Q&A and OCR.
Point: 5 (3 users)
5
Chat

Qwen/Qwen3.5-4B-heretic

Decensored Qwen3.5-4B checkpoint for long-context chat, coding, and analysis. Supports optional thinking traces and sampling controls for output style.
Point: 5 (4 users)
6
Chat

Qwen/Qwen3.5-9B-heretic

A large language model optimized for chat interactions and logical reasoning tasks. Designed for developers and researchers.
Point: 5 (2 users)
3
Chat

Qwen/Qwen3.5-4B

A compact yet capable LLM optimized for chat interactions and logical reasoning tasks. Designed for efficient deployment and accurate responses.
Point: 5 (7 users)
7
Chat

Qwen/Qwen3.5-9B

A dense 9-billion parameter language model optimized for chat and reasoning tasks. Designed for efficient deployment and high-quality responses.
Point: 5 (3 users)
4
Chat

Qwen/Qwen3.5-27B-heretic

A large language model optimized for chat interactions and reasoning tasks, designed for advanced users seeking powerful natural language processing capabilities.
Point: 5 (2 users)
3
LLM

bytedance/seed-v2-mini

A lightweight language model optimized for efficient inference and versatile applications in natural language processing tasks.
Point: 5 (3 users)
5
LLM

bytedance/seed-v2-lite

A lightweight text-to-image generation model optimized for visual content creation using prompts and optional images.
Point: 5 (2 users)
4
Chat

Qwen/Qwen3.5-27B

Qwen3.5 represents a significant leap forward, integrating breakthroughs in multimodal learning, architectural efficiency, reinforcement learning scale, and global accessibility to empower developers and enterprises with unprecedented capability and efficiency.
Point: 5 (1 user)
1
Partner LLM

google/gemini-3-pro

Gemini 3 Pro is Google's advanced AI model designed for complex reasoning and natural language understanding tasks.
Point: 4.83 (15 users)
10
Chat

zai-org/GLM-4.7-Flash

A chat-based language model optimized for conversational AI tasks with support for custom prompts and session management.
Point: 5 (1 user)
1
Partner LLM

openai/gpt-5-nano

A compact AI model optimized for efficient processing of complex prompts and multi-modal inputs with support for images and structured data.
Point: 4.75 (4 users)
4
Partner LLM

openai/gpt-5.2

AI model for processing text prompts and image inputs with advanced reasoning capabilities. Supports multi-modal inputs and customizable responses.
Point: 3.5 (2 users)
1
Partner LLM

openai/gpt-5-mini

A compact AI model optimized for text generation tasks, designed for efficient processing and accurate responses.
Point: 0 (0 users)
0
Partner LLM

google/gemini-3-flash

gemini-3-flash
Point: 5 (3 users)
3
Partner LLM

google/gemini-2-5-flash

gemini-2-5-flash
Point: 3 (1 user)
0
Chat

wiro/rag-chat-github

Instantly retrieve and analyze content from any GitHub repository. Select your LLM model, extract relevant information from codebases or documentation, and generate context-aware responses with ease!
Point: 5 (1 user)
1
AI Models for E-commerce
Social Media & Viral

wiro/ugc creator

Generates custom video content from product images and text for marketing campaigns.
Point: 5 (3 users)
1
Social Media & Viral

wiro/Shopify Template

Generate customizable Shopify templates from product photos with various aspect ratios and design options.
Point: 2 (4 users)
1
Social Media & Viral

wiro/Product Studio

Generate 360° product videos with AI-powered effects. Transform product photos into engaging video content for e-commerce.
Point: 5 (3 users)
0
Social Media & Viral

wiro/Product with Model

Generates dynamic product videos with models in various scenes using image inputs. Designed for fast inference and e-commerce showcases.
Point: 4.83 (6 users)
3
Social Media & Viral

wiro/Virtual Try-On-V2

Generate realistic virtual try-on images and videos for fashion products using AI. Supports multiple garment uploads and photography styles.
Point: 4.5 (2 users)
2
Social Media & Viral

wiro/Animated Logo

Transform static logos into stunning animated videos with 36+ creative presets. Choose from scenes like Times Square billboards, Parisian storefronts, coffee art, neon signs, and luxury showcases.
Point: 5 (1 user)
1
Social Media & Viral

wiro/3D Text Animations

Create stunning 3D animated text videos with 22+ creative presets. Transform any text into balloon letters, neon signs, candy typography, cloud formations, and cinematic motion effects.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Product Ads with Caption

Combine product images with custom captions into stunning animated video ads. 42 creative presets featuring sales promotions, seasonal themes, and dynamic text animations.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Product Ads with Logo

Combine product images with logos into stunning animated video ads. 12 creative presets featuring storefronts, billboards, city banners, and surreal brand presentations.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Product Ads

Transform product images into stunning animated video ads with 100+ creative presets. Choose from effects like water splashes, scene transitions, surreal staging, and seasonal themes.
Point: 5 (3 users)
0
Ecommerce

wiro/camera-angle-editor

wiro/camera-angle-editor is an advanced AI tool that instantly changes the camera perspective and angle of any existing image. Leveraging sophisticated spatial reconstruction, it eliminates the need for reshoots by synthesizing photorealistic new viewpoints, making it the fastest way for creators to maximize the versatility of their visual content.
Point: 0 (0 users)
0
Ecommerce

wiro/Product Photoshoot

Save time and production costs with AI Product Photoshoot. Generate polished product images featuring adaptive lighting, varied angles, and contextual scenes. Ideal for online stores, marketing teams, and agencies looking to accelerate content creation with consistent, high-quality visuals.
Point: 5 (2 users)
2
Social Media & Viral

wiro/Virtual Try-On

Integrate the Wiro Virtual Try-On API to deliver hyper-realistic apparel fitting directly in your web, mobile, or SaaS platform. Generate lifelike visuals of users wearing new garments with precise texture mapping, pose alignment, and fabric simulation — ideal for online retail and fashion tech solutions.
Point: 5 (3 users)
1
Ecommerce

wiro/text-removal

This AI model intelligently removes unwanted text from any image, seamlessly filling in the background.
Point: 0 (0 users)
0
Ecommerce

wiro/remove-background

AI-powered background removal tool that automatically removes image backgrounds. Perfect for e-commerce product photos and quick image editing.
Point: 4.5 (1 user)
1
AI Models for Social Media Creators
Social Media & Viral

wiro/Euphoria Effects

Transform images into trending Euphoria effects with customizable styles and durations. Apply various creative filters to create engaging video content.
Point: 5 (15 users)
12
Social Media & Viral

wiro/World Cup 2026 Effects

Transform images into trending World Cup 2026 effects with customizable styles and durations. Apply various creative filters to create engaging video content.
Point: 4.67 (14 users)
13
Social Media & Viral

wiro/World Cup 2026 Effects with Caption

Transform images into trending World Cup 2026 effects with customizable styles and durations. Apply various creative filters to create engaging video content.
Point: 2.5 (11 users)
9
Social Media & Viral

wiro/Scream Effects

Transform images into trending Scream effects with customizable styles and durations. Apply various creative filters to create engaging video content.
Point: 5 (12 users)
11
Social Media & Viral

wiro/panini-card

Turn a selfie into a Panini-style player card video. Enter name, height, weight, and birth date, then choose a country card preset.
Point: 5 (35 users)
31
Social Media & Viral

wiro/Sport Trend Effects

Transform images into trending Sport effects with customizable styles and durations. Apply various creative filters to create engaging video content.
Point: 2.5 (11 users)
9
Social Media & Viral

wiro/Insta Hot Girl Effects

Transform images into Insta Hot Girl effects with customizable styles and durations. Apply various creative filters to create engaging video content.
Point: 5 (12 users)
13
Social Media & Viral

wiro/Queer Editorial Effects

Transform images into Queer Editorial effects with customizable styles and durations. Apply various creative filters to create engaging video content.
Point: 4.29 (11 users)
12
Social Media & Viral

wiro/Tiktok Trend Effects

Transform images into trending TikTok effects with customizable styles and durations. Apply various creative filters to create engaging video content.
Point: 3 (7 users)
7
Social Media & Viral

wiro/Wildlife Documentary Effect

Cinematic wildlife documentary videos generated from a single portrait, in the style of nature programming. 13 scenarios across predator hunts, reef encounters and aerial wildlife.
Point: 5 (14 users)
12
Social Media & Viral

wiro/Transformation Effect

Henshin, outfit cycles, animal/object morphs, age progressions and era shifts generated from a single portrait. 121 scenarios.
Point: 5 (2 users)
2
Social Media & Viral

wiro/Supernatural Presence Effect

Analog horror, ghost and paranormal cinematic short videos from a single portrait. 67 scenarios across uncanny domestic, found-footage and liminal spaces.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Superhero Powers Effect

Cinematic superhero power-up, aura and transformation videos generated from a single portrait. 35 scenarios.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Sports Extreme Effect

Cinematic extreme sports videos generated from a single portrait. 27 scenarios across parkour, BMX, surf, ATV, snowboard, MotoGP and lunar skiing.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Scale Shift Effect

Cosmic-to-micro scale-shift videos generated from a single portrait. 31 scenarios across galaxy zooms, ant scale, giant stomp and ocean trench dives.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Retro Period Aesthetic Effect

Era-shift cinematic short videos from a single portrait. 33 retro and period-aesthetic scenarios across a century of cinema styles.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Reality Warp Effect

Portals, dimensional rifts and reality-warp videos generated from a single portrait. 43 scenarios across sci-fi, surreal and fantasy.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Product Effect

Cinematic product, food and fashion short videos from one photo. 28 scenarios with motion design, slow-mo physics and stylized lighting.
Point: 0 (0 users)
0
Social Media & Viral

wiro/POV FPV Effect

First-person and FPV cinematic adventure videos generated from a single portrait. 61 scenarios across dragon flight, wingsuit, animal POV, parkour, FPV drone racing and lunar handheld footage.
Point: 0 (0 users)
0
Social Media & Viral

wiro/Movie Scene Homage Effect

Iconic movie-scene homage videos generated from a single portrait. 24 scenarios across sci-fi action, war epic, noir, samurai and adventure.
Point: 0 (0 users)
0
Wiro AI makes machine learning easily accessible to all in the cloud.

2026 © Wiro.ai | Terms of Service & Privacy Policy