AI APIs for Developers – Explore AI Models, LLMs, Pipelines & Functions - Wiro AI

Explore

Social Media & Viral

wiro/Euphoria Effects

Transform images into trending Euphoria effects with customizable styles and durations. Apply various creative filters to create engaging video content.

PixVerse/reference-to-video-v6

Generate short cinematic videos from up to 3 reference images plus a prompt. PixVerse V6 keeps subjects consistent and can add synced audio.

PixVerse/video-extend-v6

Extend an existing clip with PixVerse Video Extend V6. Upload a short MP4 or MOV, describe the continuation, and get a longer MP4 in 360p–1080p.

Point: 4.8 (5 users)

5

PixVerse/image-to-video-v6

PixVerse Image to Video V6 animates a still image into a cinematic MP4 clip, with optional first-to-last frame transitions, multi-shot scenes, and audio.

Point: 4.75 (4 users)

5

Social Media & Viral

wiro/World Cup 2026 Effects

Transform images into trending World Cup 2026 effects with customizable styles and durations. Apply various creative filters to create engaging video content.

Point: 4.67 (14 users)

13

Recently Added

ByteDance/seedream-v4-5-uncensored

Generate high-resolution images using Seedream v4.5 Uncensored. Supports text-to-image and image-to-image transformations with customizable settings.

Point: 4.81 (135 users)

123

klingai/kling-v2.6-motion-control

Generates videos from images and reference videos with motion control. Supports custom prompts and character orientation settings.

Point: 4.86 (84 users)

85

klingai/kling-v3

Generate high-quality videos from text prompts using Kling V3. Supports custom frames, duration, and aspect ratios.

Point: 4.87 (78 users)

76

ByteDance/seedance-pro-v1.5-uncensored

Seedance Pro v1.5 Uncensored by ByteDance generates short videos from text with optional native audio, strong prompt following, and accurate lip-sync for dialogue scenes.

Point: 4.78 (68 users)

70

google/nano-banana-2

An image editing tool designed for quick transformations using reference images and prompts. Supports multi-image mixing and aspect ratio adjustments.

Point: 4.82 (67 users)

63

google/nano-banana-pro

Google's Gemini 3 Pro Image Preview model, also known as Nano Banana Pro, for text-to-image and image-to-image generation.

Point: 4.89 (37 users)

35

wiro/Video Converter

Convert a video to MP4, MOV, WebM, MKV, AVI, MPEG, or M4V with adjustable compression. Built by Wiro for clean exports and size control.

Point: 4 (36 users)

32

wiro/Image Converter

Convert an image to JPEG, PNG, WebP, TIFF, or AVIF. Set output quality from 0 to 100 to balance file size and visual fidelity.

Point: 4.67 (36 users)

29

Social Media & Viral

wiro/panini-card

Turn a selfie into a Panini-style player card video. Enter name, height, weight, and birth date, then choose a country card preset.

Point: 5 (35 users)

31

wiro/smart resize

Smart Resize by Wiro converts one image into multiple exact sizes, using AI recomposition to keep key subjects in frame for ads, social posts, and web.

Point: 4.27 (33 users)

29

openai/gpt-image-2-custom

Create or edit images with OpenAI GPT Image 2 using custom pixel sizes, quality tiers, and format controls. Add optional masks to target specific edits.

Point: 4.27 (33 users)

29

google/lyria 3 pro

Google’s Lyria 3 Pro generates up to 3-minute songs from detailed prompts and an optional image. It returns 48 kHz stereo MP3 audio with structure.

Point: 4.25 (26 users)

20

openai/gpt-image-2

Generate or edit images with GPT Image 2 from OpenAI. It delivers strong instruction following, sharp text rendering, and flexible sizing up to 4K.

Point: 4.72 (25 users)

20

google/lyria 3

Google’s Lyria 3 generates a 30-second, 48 kHz stereo music clip from a detailed description, with an optional image to guide mood and style.

Point: 5 (23 users)

17

ByteDance/Seedance 2.0

Seedance 2.0 by ByteDance generates short MP4 videos from text, with optional image and audio references. It supports synced audio, stable motion, and cinematic camera direction.

Point: 4.86 (20 users)

20

google/gemini-3.5-flash

Google’s Gemini 3.5 Flash is a fast multimodal reasoning model built for agents, coding, and long-context analysis. It takes text plus media and returns text.

Point: 4.5 (19 users)

11

alibaba/happyhorse 1.0 reference

Edit a short MP4/MOV clip or generate a new 720p/1080p video from up to 9 reference images and a detailed prompt, with optional watermark and seed control.

Point: 5 (19 users)

16

alibaba/happyhorse 1.0

Alibaba’s HappyHorse 1.0 generates 5–15s videos from text or a first-frame image. Choose 720p or 1080p, aspect ratio, seed, and watermark.

Point: 4.4 (19 users)

18

pruna/p-video-replace

Replace a subject in a source video using 1–3 reference images. Preserves the original motion and can keep the source audio in the final MP4.

Point: 5 (16 users)

14

Social Media & Viral

wiro/Euphoria Effects

Transform images into trending Euphoria effects with customizable styles and durations. Apply various creative filters to create engaging video content.

Point: 5 (15 users)

12

Generate Videos

Social Media & Viral

wiro/Euphoria Effects

Transform images into trending Euphoria effects with customizable styles and durations. Apply various creative filters to create engaging video content.

Point: 5 (15 users)

12

PixVerse/reference-to-video-v6

Generate short cinematic videos from up to 3 reference images plus a prompt. PixVerse V6 keeps subjects consistent and can add synced audio.

Point: 4.83 (6 users)

6

PixVerse/video-extend-v6

Extend an existing clip with PixVerse Video Extend V6. Upload a short MP4 or MOV, describe the continuation, and get a longer MP4 in 360p–1080p.

Point: 4.8 (5 users)

5

PixVerse/image-to-video-v6

PixVerse Image to Video V6 animates a still image into a cinematic MP4 clip, with optional first-to-last frame transitions, multi-shot scenes, and audio.

Point: 4.75 (4 users)

5

Social Media & Viral

wiro/World Cup 2026 Effects

Transform images into trending World Cup 2026 effects with customizable styles and durations. Apply various creative filters to create engaging video content.

Point: 4.67 (14 users)

13

Social Media & Viral

wiro/World Cup 2026 Effects with Caption

Transform images into trending World Cup 2026 effects with customizable styles and durations. Apply various creative filters to create engaging video content.

Point: 2.5 (11 users)

9

Social Media & Viral

wiro/Scream Effects

Transform images into trending Scream effects with customizable styles and durations. Apply various creative filters to create engaging video content.

Point: 5 (12 users)

11

wiro/Video Converter

Convert a video to MP4, MOV, WebM, MKV, AVI, MPEG, or M4V with adjustable compression. Built by Wiro for clean exports and size control.

Point: 4 (36 users)

32

Social Media & Viral

wiro/panini-card

Turn a selfie into a Panini-style player card video. Enter name, height, weight, and birth date, then choose a country card preset.

Point: 5 (35 users)

31

pruna/p-video-replace

Replace a subject in a source video using 1–3 reference images. Preserves the original motion and can keep the source audio in the final MP4.

Point: 5 (16 users)

14

nvidia/Cosmos3-Super

Cosmos 3 Super turns a reference image and motion prompt into a physics-grounded MP4 clip, with controls for aspect ratio, FPS, audio, and safety.

Point: 4.5 (15 users)

13

Social Media & Viral

wiro/Sport Trend Effects

Transform images into trending Sport effects with customizable styles and durations. Apply various creative filters to create engaging video content.

Point: 2.5 (11 users)

9

bytedance-research/Lance-Text-to-Video

Turn a detailed scene description into a short video clip. Lance Text to Video can also animate a still image or rewrite an existing video.

Point: 0 (0 users)

0

xai/grok-imagine-video-1.5

Create short videos from a still image plus a motion prompt. Grok Imagine Video 1.5 by xAI outputs 480p or 720p MP4 clips with synced audio.

Point: 5 (10 users)

9

Social Media & Viral

wiro/Insta Hot Girl Effects

Transform images into Insta Hot Girl effects with customizable styles and durations. Apply various creative filters to create engaging video content.

Point: 5 (12 users)

13

SulphurAI/Sulphur-2-base

Sulphur 2 Base by SulphurAI is an LTX 2.3 fine-tune for text-to-video and image-to-video. Create short MP4 clips with first and last frame control.

Point: 5 (3 users)

5

Social Media & Viral

wiro/Queer Editorial Effects

Transform images into Queer Editorial effects with customizable styles and durations. Apply various creative filters to create engaging video content.

Point: 4.29 (11 users)

12

Social Media & Viral

wiro/Tiktok Trend Effects

Transform images into trending TikTok effects with customizable styles and durations. Apply various creative filters to create engaging video content.

Point: 3 (7 users)

7

pruna/wan-i2v

Pruna's WAN I2V animates one still image into a short MP4 video with prompt-driven motion. Pick 480p or 720p, plus optional last-frame control.

Point: 4.5 (15 users)

13

pruna/wan-t2v

Pruna’s WAN T2V generates a short MP4 video from a detailed scene prompt. Pick 480p or 720p, choose 16:9 or 9:16, and control motion and safety options.

Point: 5 (12 users)

12

Generate Images

microsoft/lens

A foundational text-to-image model designed for efficient, high-resolution image generation with strong prompt following and multilingual support.

Point: 5 (2 users)

2

openai/gpt-image-2-custom

Create or edit images with OpenAI GPT Image 2 using custom pixel sizes, quality tiers, and format controls. Add optional masks to target specific edits.

Point: 4.27 (33 users)

29

sensenova/U1-8B-Text-to-Image

SenseNova U1-8B turns text into high-detail images with strong layout control and clearer in-image text. Use it for posters, infographics, and concept art.

Point: 5 (1 user)

1

openai/gpt-image-2

Generate or edit images with GPT Image 2 from OpenAI. It delivers strong instruction following, sharp text rendering, and flexible sizing up to 4K.

Point: 4.72 (25 users)

20

xai/grok-imagine-image

xAI’s Grok Imagine Image generates images from prompts and can edit a source image using instructions. Pick an aspect ratio, 1K or 2K, and 1–10 samples.

Point: 3.33 (10 users)

7

openai/gpt-image-1-5

Generate or edit images using text prompts or image edits with GPT Image 1.5. Supports multiple sizes, formats, and quality settings.

Point: 5 (7 users)

7

ByteDance/seedream-v4-5-uncensored

Generate high-resolution images using Seedream v4.5 Uncensored. Supports text-to-image and image-to-image transformations with customizable settings.

Point: 4.81 (135 users)

123

FireRedTeam/FireRed-Image-Edit-1.1

FireRed-Image-Edit-1.1 significantly enhances identity consistency, multi-image conditioning, and domain-specialized editing capabilities, bringing it closer to meeting real-world creative production demands.

Point: 5 (1 user)

1

ByteDance/seedream-v5-lite-uncensored

Generate high-quality images from text prompts or image inputs using the Seedream v5 Lite Uncensored model. Supports multiple resolutions and aspect ratios.

Point: 3.73 (13 users)

12

google/nano-banana-2

An image editing tool designed for quick transformations using reference images and prompts. Supports multi-image mixing and aspect ratio adjustments.

Point: 4.82 (67 users)

63

ByteDance/seedream-v5-lite

Generate high-quality images using Seedream V5 Lite, supporting both image-to-image and text-to-image transformations with customizable settings.

Point: 5 (2 users)

2

FireRedTeam/FireRed-Image-Edit

FireRed-Image-Edit is a general-purpose image editing model that delivers high-fidelity and consistent editing across a wide range of scenarios.

Point: 5 (2 users)

2

zai-org/GLM-IMAGE

GLM-Image is an image generation model that adopts a hybrid autoregressive + diffusion decoder architecture. In general image generation quality, GLM-Image aligns with mainstream latent diffusion approaches, but it shows significant advantages in text rendering and knowledge-intensive generation scenarios. It performs especially well in tasks requiring precise semantic understanding and complex information expression, while maintaining strong capabilities in high-fidelity and fine-grained detail generation. In addition to text-to-image generation, GLM-Image also supports a rich set of image-to-image tasks including image editing, style transfer, identity-preserving generation, and multi-subject consistency.

Point: 5 (1 user)

1

black-forest-labs/FLUX.2-klein-base-9B

FLUX.2 [klein] 9B Base is a 9 billion parameter rectified flow transformer that generates images from text descriptions and supports multi-reference editing.

Point: 0 (0 users)

0

black-forest-labs/FLUX.2-klein-base-4B

FLUX.2 [klein] 4B Base is a 4 billion parameter rectified flow transformer that generates images from text descriptions and supports multi-reference editing.

Point: 0 (0 users)

0

black-forest-labs/FLUX.2-klein-4B

FLUX.2 [klein] 4B is a 4 billion parameter rectified flow transformer that generates images from text descriptions and supports multi-reference editing.

Point: 0 (0 users)

0

black-forest-labs/FLUX.2-klein-9B

FLUX.2 [klein] 9B is a 9 billion parameter rectified flow transformer that generates images from text descriptions and supports multi-reference editing.

Point: 0 (0 users)

0

wiro/FLUX.2-dev-turbo

FLUX.2 [dev] is a 32 billion parameter rectified flow transformer capable of generating, editing and combining images based on text instructions.

Point: 0 (0 users)

1

meituan-longcat/LongCat-Image-Edit

LongCat-Image-Edit is the image editing version of LongCat-Image.

Point: 5 (1 user)

0

meituan-longcat/LongCat-Image

LongCat-Image is a 6B-parameter model built for high-quality image generation, delivering strong multilingual text rendering, realistic visuals, and efficient deployment.

Point: 0 (0 users)

0

wiro/Image Converter

Convert an image to JPEG, PNG, WebP, TIFF, or AVIF. Set output quality from 0 to 100 to balance file size and visual fidelity.

Point: 4.67 (36 users)

29

wiro/smart resize

Smart Resize by Wiro converts one image into multiple exact sizes, using AI recomposition to keep key subjects in frame for ads, social posts, and web.

Point: 4.27 (33 users)

29

sensenova/U1-8B-Interleave

U1-8B Interleave by SenseNova generates step-by-step text with matching images, optionally conditioned on up to 5 reference images. It suits tutorials, diaries, and infographics.

Point: 5 (3 users)

1

openai/gpt-image-2-custom

Create or edit images with OpenAI GPT Image 2 using custom pixel sizes, quality tiers, and format controls. Add optional masks to target specific edits.

Point: 4.27 (33 users)

29

openai/gpt-image-2

Generate or edit images with GPT Image 2 from OpenAI. It delivers strong instruction following, sharp text rendering, and flexible sizing up to 4K.

Point: 4.72 (25 users)

20

xai/grok-imagine-image

xAI’s Grok Imagine Image generates images from prompts and can edit a source image using instructions. Pick an aspect ratio, 1K or 2K, and 1–10 samples.

Point: 3.33 (10 users)

7

openai/gpt-image-1-5

Generate or edit images using text prompts or image edits with GPT Image 1.5. Supports multiple sizes, formats, and quality settings.

Point: 5 (7 users)

7

Social Media & Viral

wiro/Instagram Pose Multi

Generate stylish Instagram-style pose images with trendy angles, natural expressions, and a modern aesthetic. Built by Wiro for social content.

Point: 4.8 (11 users)

8

ByteDance/seedream-v4-5-uncensored

Generate high-resolution images using Seedream v4.5 Uncensored. Supports text-to-image and image-to-image transformations with customizable settings.

Point: 4.81 (135 users)

123

FireRedTeam/FireRed-Image-Edit-1.1

FireRed-Image-Edit-1.1 significantly enhances identity consistency, multi-image conditioning, and domain-specialized editing capabilities, bringing it closer to meeting real-world creative production demands.

Point: 5 (1 user)

1

ByteDance/seedream-v5-lite-uncensored

Generate high-quality images from text prompts or image inputs using the Seedream v5 Lite Uncensored model. Supports multiple resolutions and aspect ratios.

Point: 3.73 (13 users)

12

google/nano-banana-2

An image editing tool designed for quick transformations using reference images and prompts. Supports multi-image mixing and aspect ratio adjustments.

Point: 4.82 (67 users)

63

ByteDance/seedream-v5-lite

Generate high-quality images using Seedream V5 Lite, supporting both image-to-image and text-to-image transformations with customizable settings.

Point: 5 (2 users)

2

FireRedTeam/FireRed-Image-Edit

FireRed-Image-Edit is a general-purpose image editing model that delivers high-fidelity and consistent editing across a wide range of scenarios.

Point: 5 (2 users)

2

Social Media & Viral

wiro/Shopify Template

Generate customizable Shopify templates from product photos with various aspect ratios and design options.

Point: 2 (4 users)

1

zai-org/GLM-IMAGE

GLM-Image is an image generation model that adopts a hybrid autoregressive + diffusion decoder architecture. In general image generation quality, GLM-Image aligns with mainstream latent diffusion approaches, but it shows significant advantages in text rendering and knowledge-intensive generation scenarios. It performs especially well in tasks requiring precise semantic understanding and complex information expression, while maintaining strong capabilities in high-fidelity and fine-grained detail generation. In addition to text-to-image generation, GLM-Image also supports a rich set of image-to-image tasks including image editing, style transfer, identity-preserving generation, and multi-subject consistency.

Point: 5 (1 user)

1

black-forest-labs/FLUX.2-klein-base-9B

FLUX.2 [klein] 9B Base is a 9 billion parameter rectified flow transformer that generates images from text descriptions and supports multi-reference editing.

Point: 0 (0 users)

0

black-forest-labs/FLUX.2-klein-base-4B

FLUX.2 [klein] 4B Base is a 4 billion parameter rectified flow transformer that generates images from text descriptions and supports multi-reference editing.

Point: 0 (0 users)

0

black-forest-labs/FLUX.2-klein-4B

FLUX.2 [klein] 4B is a 4 billion parameter rectified flow transformer that generates images from text descriptions and supports multi-reference editing.

Point: 0 (0 users)

0

black-forest-labs/FLUX.2-klein-9B

FLUX.2 [klein] 9B is a 9 billion parameter rectified flow transformer that generates images from text descriptions and supports multi-reference editing.

Point: 0 (0 users)

0

sensenova/U1-8B-Interleave

U1-8B Interleave by SenseNova generates step-by-step text with matching images, optionally conditioned on up to 5 reference images. It suits tutorials, diaries, and infographics.

Point: 5 (3 users)

1

sensenova/U1-8B-Visual-Understanding

SenseNova U1 8B Visual Understanding creates infographic-style images and prompt-based edits from text and an optional reference image. Built by SenseNova on NEO-Unify.

Point: 5 (2 users)

1

nvidia/parakeet-tdt-0.6b-v3

Multilingual speech-to-text for 25 European languages with auto language detection, punctuation, capitalization, and optional timestamps.

Point: 5 (5 users)

8

CohereLabs/cohere-transcribe-03-2026

CohereLabs cohere-transcribe-03-2026 is a 2B Conformer speech-to-text model for 14 languages. It creates accurate transcripts for meetings, calls, and audio archives.

Point: 5 (4 users)

7

Qwen/Qwen3-ASR-1.7B

A lightweight speech-to-text model optimized for fast inference. Converts audio input into text with support for multiple languages.

Point: 0 (0 users)

0

mistralai/Voxtral-Mini-4B-Realtime-2602

Voxtral Mini 4B Realtime 2602 is a multilingual, realtime speech-transcription model and among the first open-source solutions to achieve accuracy comparable to offline systems with a delay of <500ms. It supports 13 languages and outperforms existing open-source baselines across a range of tasks, making it ideal for applications like voice assistants and live subtitling.

Point: 0 (0 users)

0

nvidia/nemotron

Nemotron-Speech-Streaming-En-0.6b is the first unified model in the Nemotron Speech family, engineered to deliver high-quality English transcription across both low-latency streaming and high-throughput batch workloads. The model natively supports punctuation and capitalization and offers runtime flexibility with configurable chunk sizes, including 80ms, 160ms, 560ms, and 1120ms.

Point: 5 (1 user)

0
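
The configurable chunk sizes listed for Nemotron-Speech-Streaming-En-0.6b can be translated into sample counts with simple arithmetic. The sketch below is illustrative only: the 16 kHz sample rate is an assumption (the listing does not state one), and `samples_per_chunk` and `split_into_chunks` are hypothetical helper names, not part of any NVIDIA or Wiro API.

```python
# Assumption: 16 kHz mono PCM input; the listing does not state a sample rate.
SAMPLE_RATE_HZ = 16_000
CHUNK_SIZES_MS = (80, 160, 560, 1120)  # chunk sizes named in the listing


def samples_per_chunk(chunk_ms: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Return how many audio samples fit in one chunk of the given duration."""
    return chunk_ms * sample_rate_hz // 1000


def split_into_chunks(samples: list[int], chunk_ms: int) -> list[list[int]]:
    """Split a sample buffer into consecutive chunks; the last one may be shorter."""
    size = samples_per_chunk(chunk_ms)
    return [samples[i:i + size] for i in range(0, len(samples), size)]


if __name__ == "__main__":
    for ms in CHUNK_SIZES_MS:
        print(f"{ms} ms -> {samples_per_chunk(ms)} samples")
```

Smaller chunks reduce the wait before each partial transcript at the cost of more frequent calls, while larger chunks favor batch throughput.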

elevenlabs/speech-to-text

Speech to text model from ElevenLabs

Point: 0 (0 users)

0

moondream3-preview/detect

Moondream3 is a cutting-edge vision-language model that delivers advanced visual reasoning with built-in object detection, pointing, and OCR capabilities—bringing fast, cost-effective, and scalable inference to real-world applications.

Point: 0 (0 users)

0

moondream3-preview/point

Moondream3 is a cutting-edge vision-language model that delivers advanced visual reasoning with built-in object detection, pointing, and OCR capabilities—bringing fast, cost-effective, and scalable inference to real-world applications.

Point: 0 (0 users)

0

moondream3-preview/caption

Moondream3 is a cutting-edge vision-language model that delivers advanced visual reasoning with built-in object detection, pointing, and OCR capabilities—bringing fast, cost-effective, and scalable inference to real-world applications.

Point: 0 (0 users)

0

moondream3-preview/query

Moondream3 is a cutting-edge vision-language model that delivers advanced visual reasoning with built-in object detection, pointing, and OCR capabilities—bringing fast, cost-effective, and scalable inference to real-world applications.

Point: 0 (0 users)

0

openai/whisper-large-v3-turbo-turkish

Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. Trained on 680k hours of labelled data, Whisper models demonstrate a strong ability to generalise to many datasets and domains without the need for fine-tuning.

Point: 5 (1 user)

0

wiro/video-nsfw-detection

NSFW video detection automatically analyzes video content to identify inappropriate or explicit material, ensuring compliance with content policies and a safe viewing environment.

Point: 5 (1 user)

0

DAMO-NLP-SG/VideoLLaMA3-2B

VideoLLaMA3-2B is a model designed for video understanding.

Point: 0 (0 users)

0

DAMO-NLP-SG/VideoLLaMA3-2B-Image

VideoLLaMA3-2B-Image is a model designed for image understanding.

Point: 0 (0 users)

0

wiro/VideoLLaMA3-7B-Image

VideoLLaMA3-7B-Image is a model designed for image understanding.

Point: 0 (0 users)

0

wiro/VideoLLaMA3-7B

VideoLLaMA3-7B is a model designed for video understanding.

Point: 0 (0 users)

0

Salesforce/blip2-flan-t5-xl

BLIP-2 creates captions or detailed descriptions for images. This is the BLIP-2 model built on Flan T5-xl.

Point: 0 (0 users)

0

Salesforce/blip-image-captioning-large

BLIP is a model that is able to perform various multi-modal tasks including visual question answering and image captioning. This is the blip image captioning large model.

Point: 0 (0 users)

0

TencentARC/Pixal3D

Pixal3D converts a single image into a textured 3D GLB mesh. It uses pixel-to-3D back-projection so geometry stays aligned to the input view.

Point: 4.2 (6 users)

4

tencent/HY-World-2.0-World-Reconstruction

Reconstruct a 3D scene from multi-view photos or a short video. Produces 3D Gaussian splats with depth maps and cameras, plus a colored point cloud.

Point: 5 (1 user)

1

microsoft/Trellis-2

Convert images into detailed 3D meshes using Microsoft's Trellis-2 model. Supports various resolutions and customization options.

Point: 5 (4 users)

4

tencent/Hunyuan3D-2.1

Generate 3D models from images using the Hunyuan3D-2.1 AI tool. Transform 2D inputs into detailed 3D assets for design and development.

Point: 5 (2 users)

1

OpenMOSS/MOSS-TTS-v1.5

OpenMOSS MOSS-TTS v1.5 turns text into natural speech, with optional zero-shot voice cloning from a reference clip. It supports 31 languages plus pause and pronunciation control.

Point: 2 (1 user)

0

openbmb/VoxCPM2

A real-time, multilingual text-to-speech system offering expressive voice design and high-fidelity voice cloning through low-latency streaming inference.

Point: 1 (1 user)

2

k2-fsa/OmniVoice

OmniVoice by k2-fsa generates 24 kHz speech from text in 600+ languages. Clone a speaker from a short reference clip or design a new voice from attributes.

Point: 0 (0 users)

0

humeai/tada-3b-ml

TADA is a unified speech-language model that synchronizes speech and text into a single, cohesive stream via 1:1 alignment. By leveraging a novel tokenizer and architectural design, TADA achieves high-fidelity synthesis and generation with a fraction of the computational overhead required by traditional models.

Point: 0 (0 users)

0

fishaudio/s2-pro

Generates high-quality speech from text using advanced TTS technology with support for voice cloning and multi-speaker synthesis.

Point: 0 (0 users)

0

nineninesix/kani-tts-2-en

Generates natural-sounding speech from text with support for multi-speaker voice cloning and fast inference capabilities.

Point: 0 (0 users)

0

resemble-ai/chatterbox-turbo

The fastest open source TTS model without sacrificing quality.

Point: 5 (1 user)

1

resemble-ai/chatterbox-multilingual

Generate expressive, natural speech in 23 languages. Features instant voice cloning from short audio, emotion control, and seamless cross-language voice transfer.

Point: 0 (0 users)

0

OpenMOSS/MOSS-TTSD

MOSS-TTSD is a production long-form dialogue model for expressive multi-speaker conversational audio at scale. It supports long-duration continuity, turn-taking control, and zero-shot voice cloning from short references for podcasts, audiobooks, commentary, dubbing, and entertainment dialogue.

Point: 0 (0 users)

0

OpenMOSS/MOSS-TTS-Realtime

Real-time streaming text-to-speech with zero-shot voice cloning. Supports 20 languages including English, Chinese, Japanese, Korean, and more. Audio starts playing immediately — no waiting for full generation. Clone any voice from a short reference clip.

Point: 0 (0 users)

0

elevenlabs/Realtime Conversational AI

A real-time voice conversation tool using ElevenLabs' AI voice agents. Customize voices, behaviors, and languages for interactive AI experiences.

Point: 5 (1 user)

0

openai/gpt-realtime-mini

GPT Mini Realtime enables low-latency, bidirectional streaming for voice and text. Build interactive, responsive AI experiences that feel natural and immediate.

Point: 5 (1 user)

0

openai/gpt-realtime

GPT Realtime enables low-latency, bidirectional streaming for voice and text. Build interactive, responsive AI experiences that feel natural and immediate.

Point: 5 (1 user)

0

Qwen/Qwen3-TTS-12Hz-1.7B

A fast inference text-to-speech model optimized for real-time audio generation with multi-language support.

Point: 0 (0 users)

0

Speech to Speech

nvidia/PersonaPlex-Realtime

Convert speech to speech with customizable voices using PersonaPlex. Supports various audio formats and offers control over text temperature and audio top K settings.

Point: 4 (3 users)

1

microsoft/VibeVoice-Realtime

VibeVoice-Realtime is a lightweight real‑time text-to-speech model supporting streaming text input and robust long-form speech generation.

Point: 5 (1 user)

1

openbmb/VoxCPM

Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning

Point: 0 (0 users)

0

google/gemini-2.5-tts

Google's Gemini 2.5 Flash Text To Speech Preview model

Point: 5 (2 users)

5

elevenlabs/text-to-speech

Text to speech model from ElevenLabs

Point: 5 (2 users)

0

Social Media & Viral

wiro/Faceless-Video-Generator

Create professional short videos (30s) for YouTube Shorts, Instagram Reels, TikTok, and X (Twitter) – from a single prompt. Automatically generate speech, captions, and optional talking head avatars using AI. Perfect for content creators, marketers, and educators looking to grow faster with less effort.

Point: 5 (1 user)

1

stabilityai/stable-audio-3-small-sfx

Stable Audio 3 Small SFX by Stability AI generates stereo sound effects from text prompts, up to 120 seconds, for games, video, and product UX.

Point: 0 (0 users)

1

stabilityai/stable-audio-3-small-music

Create up to 2 minutes of stereo music from a text description with Stable Audio 3 Small Music by Stability AI. Control duration, steps, and guidance.

Point: 0 (0 users)

0

stabilityai/stable-audio-3-medium

Stable Audio 3 Medium by Stability AI generates stereo music from text prompts for up to 380 seconds. Control duration, steps, guidance, and output format.

Point: 5 (1 user)

0

google/lyria 3

Google’s Lyria 3 generates a 30-second, 48 kHz stereo music clip from a detailed description, with an optional image to guide mood and style.

Point: 5 (23 users)

17

Music Generation

tencent-ailab/SongGeneration 2

Song Generation 2 generates full songs with vocals and instrumentals from lyrics — supports 14 genres, reference audio cloning, and separate track output (vocal/bgm/mix).

Point: 5 (3 users)

3

wiro/video-background-music-v2

It turns any video into a cinematic experience by generating AI-powered instrumental soundtracks that match its mood.

Point: 4 (1 user)

3

Music Generation

ACE-Step/text-to-song-ACE-Step1.5

ACE-Step v1.5 is a highly efficient open-source music foundation model designed to bring commercial-grade music generation to consumer hardware.

Point: 5 (4 users)

2

Social Media & Viral

wiro/Song Frame

SongFrame places you into a cinematic world, pulls the soundtrack directly from your YouTube link, and fuses everything into a polished video — effortless, emotional, and instantly shareable.

Point: 0 (0 users)

0

Social Media & Viral

wiro/Faceless-Video-Generator

Create professional short videos (30s) for YouTube Shorts, Instagram Reels, TikTok, and X (Twitter) – from a single prompt. Automatically generate speech, captions, and optional talking head avatars using AI. Perfect for content creators, marketers, and educators looking to grow faster with less effort.

Point: 5 (1 user)

1

wiro/video-background-music-gen

It’s a tool that takes your video, creates original music to match its vibe, and seamlessly adds it back as the perfect soundtrack.

Point: 0 (0 users)

0

Music Generation

ACE-Step/image-to-song-ACE-Step-v1-3.5B

Generate high-quality songs from an image in seconds. Whether you're crafting instrumental tracks or full vocal compositions, bring your musical ideas to life with the power of AI. Ideal for artists, producers, and creative minds looking to turn inspiration into sound.

Point: 0 (0 users)

0

Music Generation

ACE-Step/text-to-song-ACE-Step-v1-3.5B

Generate high-quality songs from text prompts in seconds. Whether you're crafting instrumental tracks or full vocal compositions, bring your musical ideas to life with the power of AI. Ideal for artists, producers, and creative minds looking to turn inspiration into sound.

Point: 5 (1 users)

0

Music Generation

wiro/image-to-song-with-reference-YuE

Turn any song into your own with AI. Simply upload a reference track and provide your image — AI will recreate the song in your style, preserving the vocal style, instrumental feel, or both. Perfect for covers, parodies, remixes, or personalized creations.

Point: 0 (0 users)

0

Music Generation

wiro/image-to-song-YuE

Turn your image into a full song with vocals and instrumental music in seconds. Just upload your image and let AI compose and sing it for you. Perfect for creators, musicians, and storytellers.

Point: 0 (0 users)

0

Music Generation

wiro/text-to-song-with-reference-YuE

Turn any song into your own with AI. Simply upload a reference track and provide your custom lyrics — AI will recreate the song in your style, preserving the vocal style, instrumental feel, or both. Perfect for covers, parodies, remixes, or personalized creations.

Point: 0 (0 users)

0

Music Generation

wiro/text-to-song-YuE

Turn your lyrics into a full song with vocals and instrumental music in seconds. Just enter your lyrics and let AI compose and sing it for you. Perfect for creators, musicians, and storytellers.

Point: 0 (0 users)

0

Music Generation

wiro/music_gen

MusicGen is a text-to-music model capable of generating high-quality music samples.

Point: 0 (0 users)

0

Realtime Stream

openbmb/VoxCPM2

A real-time, multilingual text-to-speech system offering expressive voice design and high-fidelity voice cloning through low-latency streaming inference.

Point: 1 (1 users)

2

mistralai/Voxtral-Mini-4B-Realtime-2602

Voxtral Mini 4B Realtime 2602 is a multilingual, realtime speech-transcription model and among the first open-source solutions to achieve accuracy comparable to offline systems with a delay of <500ms. It supports 13 languages and outperforms existing open-source baselines across a range of tasks, making it ideal for applications like voice assistants and live subtitling.

Point: 0 (0 users)

0

OpenMOSS/MOSS-TTS-Realtime

Real-time streaming text-to-speech with zero-shot voice cloning. Supports 20 languages including English, Chinese, Japanese, Korean, and more. Audio starts playing immediately — no waiting for full generation. Clone any voice from a short reference clip.

Point: 0 (0 users)

0

elevenlabs/Realtime Conversational AI

A real-time voice conversation tool using ElevenLabs' AI voice agents. Customize voices, behaviors, and languages for interactive AI experiences.

Point: 5 (1 users)

0

openai/gpt-realtime-mini

GPT Mini Realtime enables low-latency, bidirectional streaming for voice and text. Build interactive, responsive AI experiences that feel natural and immediate.

Point: 5 (1 users)

0

openai/gpt-realtime

GPT Realtime enables low-latency, bidirectional streaming for voice and text. Build interactive, responsive AI experiences that feel natural and immediate.

Point: 5 (1 users)

0

Speech to Speech

nvidia/PersonaPlex-Realtime

Convert speech to speech with customizable voices using PersonaPlex. Supports various audio formats and offers control over text temperature and audio top K settings.

Point: 4 (3 users)

1

google/gemini-3.5-flash

Google’s Gemini 3.5 Flash is a fast multimodal reasoning model built for agents, coding, and long-context analysis. It takes text plus media and returns text.

Point: 4.5 (19 users)

11

Qwen/Qwen3.6-27B

Qwen3.6-27B is a 27B dense vision-language model from Qwen for agentic coding and reasoning. It supports 262K-token context and optional thinking traces in outputs.

Point: 5 (5 users)

6

xai/grok-4-1-fast

Grok 4.1 Fast by xAI is a long-context chat model built for tool calling and agent workflows. It can analyze images and return grounded text answers with sources.

Point: 5 (5 users)

7

xai/grok-4-20

Grok 4.20 is xAI’s flagship text model with a 2M-token context window, tool calling, and structured JSON outputs. Attach images for vision Q&A and OCR.

Point: 5 (3 users)

5

Qwen/Qwen3.5-4B-heretic

Decensored Qwen3.5-4B checkpoint for long-context chat, coding, and analysis. Supports optional thinking traces and sampling controls for output style.

Point: 5 (4 users)

6

Qwen/Qwen3.5-9B-heretic

A large language model optimized for chat interactions and logical reasoning tasks. Designed for developers and researchers.

Point: 5 (2 users)

3

Qwen/Qwen3.5-4B

A compact yet capable LLM optimized for chat interactions and logical reasoning tasks. Designed for efficient deployment and accurate responses.

Point: 5 (7 users)

7

Qwen/Qwen3.5-9B

A dense 9-billion parameter language model optimized for chat and reasoning tasks. Designed for efficient deployment and high-quality responses.

Point: 5 (3 users)

4

Qwen/Qwen3.5-27B-heretic

A large language model optimized for chat interactions and reasoning tasks, designed for advanced users seeking powerful natural language processing capabilities.

Point: 5 (2 users)

3

bytedance/seed-v2-mini

A lightweight language model optimized for efficient inference and versatile applications in natural language processing tasks.

Point: 5 (3 users)

5

bytedance/seed-v2-lite

A lightweight text-to-image generation model optimized for visual content creation using prompts and optional images.

Point: 5 (2 users)

4

Qwen/Qwen3.5-27B

Qwen3.5 represents a significant leap forward, integrating breakthroughs in multimodal learning, architectural efficiency, reinforcement learning scale, and global accessibility to empower developers and enterprises with unprecedented capability and efficiency.

Point: 5 (1 users)

1

google/gemini-3-pro

Gemini 3 Pro is Google's advanced AI model designed for complex reasoning and natural language understanding tasks.

Point: 4.83 (15 users)

10

zai-org/GLM-4.7-Flash

A chat-based language model optimized for conversational AI tasks with support for custom prompts and session management.

Point: 5 (1 users)

1

openai/gpt-5-nano

A compact AI model optimized for efficient processing of complex prompts and multi-modal inputs with support for images and structured data.

Point: 4.75 (4 users)

4

openai/gpt-5.2

AI model for processing text prompts and image inputs with advanced reasoning capabilities. Supports multi-modal inputs and customizable responses.

Point: 3.5 (2 users)

1

openai/gpt-5-mini

A compact AI model optimized for text generation tasks, designed for efficient processing and accurate responses.

Point: 0 (0 users)

0

google/gemini-3-flash

Point: 5 (3 users)

3

google/gemini-2-5-flash

gemini-2-5-flash

Point: 3 (1 users)

0

wiro/rag-chat-github

Instantly retrieve and analyze content from any GitHub repository. Select an LLM, extract relevant information from codebases or documentation, and generate context-aware responses with ease!

Point: 5 (1 users)

1

AI Models for E-commerce

Social Media & Viral

wiro/ugc creator

Generates custom video content from product images and text for marketing campaigns.

Point: 5 (3 users)

1

Social Media & Viral

wiro/Shopify Template

Generate customizable Shopify templates from product photos with various aspect ratios and design options.

Point: 2 (4 users)

1

Social Media & Viral

wiro/Product Studio

Generate 360° product videos with AI-powered effects. Transform product photos into engaging video content for e-commerce.

Point: 5 (3 users)

0

Social Media & Viral

wiro/Product with Model

Generates dynamic product videos with models in various scenes using image inputs. Designed for fast inference and e-commerce showcases.

Point: 4.83 (6 users)

3

Social Media & Viral

wiro/Virtual Try-On-V2

Generate realistic virtual try-on images and videos for fashion products using AI. Supports multiple garment uploads and photography styles.

Point: 4.5 (2 users)

2

Social Media & Viral

wiro/Animated Logo

Transform static logos into stunning animated videos with 36+ creative presets. Choose from scenes like Times Square billboards, Parisian storefronts, coffee art, neon signs, and luxury showcases.

Point: 5 (1 users)

1

Social Media & Viral

wiro/3D Text Animations

Create stunning 3D animated text videos with 22+ creative presets. Transform any text into balloon letters, neon signs, candy typography, cloud formations, and cinematic motion effects.

Point: 0 (0 users)

0

Social Media & Viral

wiro/Product Ads with Caption

Combine product images with custom captions into stunning animated video ads. 42 creative presets featuring sales promotions, seasonal themes, and dynamic text animations.

Point: 0 (0 users)

0

Social Media & Viral

wiro/Product Ads with Logo

Combine product images with logos into stunning animated video ads. 12 creative presets featuring storefronts, billboards, city banners, and surreal brand presentations.

Point: 0 (0 users)

0

Social Media & Viral

wiro/Product Ads

Transform product images into stunning animated video ads with 100+ creative presets. Choose from effects like water splashes, scene transitions, surreal staging, and seasonal themes.

Point: 5 (3 users)

0

wiro/camera-angle-editor

wiro/camera-angle-editor is an advanced AI tool that instantly changes the camera perspective and angle of any existing image. Leveraging sophisticated spatial reconstruction, it eliminates the need for reshoots by synthesizing photorealistic new viewpoints, making it the fastest way for creators to maximize the versatility of their visual content.

Point: 0 (0 users)

0

wiro/Product Photoshoot

Save time and production costs with AI Product Photoshoot. Generate polished product images featuring adaptive lighting, varied angles, and contextual scenes. Ideal for online stores, marketing teams, and agencies looking to accelerate content creation with consistent, high-quality visuals.

Point: 5 (2 users)

2

Social Media & Viral

wiro/Virtual Try-On

Integrate the Wiro Virtual Try-On API to deliver hyper-realistic apparel fitting directly in your web, mobile, or SaaS platform. Generate lifelike visuals of users wearing new garments with precise texture mapping, pose alignment, and fabric simulation — ideal for online retail and fashion tech solutions.

Point: 5 (3 users)

1

wiro/text-removal

This AI model intelligently removes unwanted text from any image, seamlessly filling in the background.

Point: 0 (0 users)

0

wiro/remove-background

AI-powered background removal tool that automatically removes image backgrounds. Perfect for e-commerce product photos and quick image editing.

Point: 4.5 (1 users)

1

AI Models for Social Media Creators

Social Media & Viral

wiro/Euphoria Effects

Transform images into trending Euphoria effects with customizable styles and durations. Apply various creative filters to create engaging video content.

Point: 5 (15 users)

12

Social Media & Viral

wiro/World Cup 2026 Effects

Transform images into trending World Cup 2026 effects with customizable styles and durations. Apply various creative filters to create engaging video content.

Point: 4.67 (14 users)

13

Social Media & Viral

wiro/World Cup 2026 Effects with Caption

Transform images into trending World Cup 2026 effects with customizable styles and durations. Apply various creative filters to create engaging video content.

Point: 2.5 (11 users)

9

Social Media & Viral

wiro/Scream Effects

Transform images into trending Scream effects with customizable styles and durations. Apply various creative filters to create engaging video content.

Point: 5 (12 users)

11

Social Media & Viral

wiro/panini-card

Turn a selfie into a Panini-style player card video. Enter name, height, weight, and birth date, then choose a country card preset.

Point: 5 (35 users)

31

Social Media & Viral

wiro/Sport Trend Effects

Transform images into trending Sport effects with customizable styles and durations. Apply various creative filters to create engaging video content.

Point: 2.5 (11 users)

9

Social Media & Viral

wiro/Insta Hot Girl Effects

Transform images into Insta Hot Girl effects with customizable styles and durations. Apply various creative filters to create engaging video content.

Point: 5 (12 users)

13

Social Media & Viral

wiro/Queer Editorial Effects

Transform images into Queer Editorial effects with customizable styles and durations. Apply various creative filters to create engaging video content.

Point: 4.29 (11 users)

12

Social Media & Viral

wiro/Tiktok Trend Effects

Transform images into trending TikTok effects with customizable styles and durations. Apply various creative filters to create engaging video content.

Point: 3 (7 users)

7

Social Media & Viral

wiro/Wildlife Documentary Effect

Cinematic wildlife documentary videos generated from a single portrait, in the style of nature programming. 13 scenarios across predator hunts, reef encounters and aerial wildlife.

Point: 5 (14 users)

12

Social Media & Viral

wiro/Transformation Effect

Henshin, outfit cycles, animal/object morphs, age progressions and era shifts generated from a single portrait. 121 scenarios.

Point: 5 (2 users)

2

Social Media & Viral

wiro/Supernatural Presence Effect

Analog horror, ghost and paranormal cinematic short videos from a single portrait. 67 scenarios across uncanny domestic, found-footage and liminal spaces.

Point: 0 (0 users)

0

Social Media & Viral

wiro/Superhero Powers Effect

Cinematic superhero power-up, aura and transformation videos generated from a single portrait. 35 scenarios.

Point: 0 (0 users)

0

Social Media & Viral

wiro/Sports Extreme Effect

Cinematic extreme sports videos generated from a single portrait. 27 scenarios across parkour, BMX, surf, ATV, snowboard, MotoGP and lunar skiing.

Point: 0 (0 users)

0

Social Media & Viral

wiro/Scale Shift Effect

Cosmic-to-micro scale-shift videos generated from a single portrait. 31 scenarios across galaxy zooms, ant scale, giant stomp and ocean trench dives.

Point: 0 (0 users)

0

Social Media & Viral

wiro/Retro Period Aesthetic Effect

Era-shift cinematic short videos from a single portrait. 33 retro and period-aesthetic scenarios across a century of cinema styles.

Point: 0 (0 users)

0

Social Media & Viral

wiro/Reality Warp Effect

Portals, dimensional rifts and reality-warp videos generated from a single portrait. 43 scenarios across sci-fi, surreal and fantasy.

Point: 0 (0 users)

0

Social Media & Viral

wiro/Product Effect

Cinematic product, food and fashion short videos from one photo. 28 scenarios with motion design, slow-mo physics and stylized lighting.

Point: 0 (0 users)

0

Social Media & Viral

wiro/POV FPV Effect

First-person and FPV cinematic adventure videos generated from a single portrait. 61 scenarios across dragon flight, wingsuit, animal POV, parkour, FPV drone racing and lunar handheld footage.

Point: 0 (0 users)

0

Social Media & Viral

wiro/Movie Scene Homage Effect

Iconic movie-scene homage videos generated from a single portrait. 24 scenarios across sci-fi action, war epic, noir, samurai and adventure.

Point: 0 (0 users)

0

Wiro AI makes machine learning easily accessible to all in the cloud.

2026 © Wiro.ai | Terms of Service & Privacy Policy