Gemini Omni Flash API

Gemini Omni Flash API on EvoLink: one API key for video generation and video editing, with an asynchronous task workflow and callback notifications.

Model types:

Text to Video, Image to Video, Reference to Video, Video Edit

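Price:

$1.275 (~86.7 credits) per 1M input tokens; $14.875 (~1011.5 credits) per 1M video output tokens

$7.650 (~520.2 credits) per 1M other output tokens

Billed by token. Actual charges follow the usage object returned by the API.

Highest stability, with guaranteed 99.9% availability. Recommended for production environments.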

Use the same video endpoint for all modes. Only the model parameter differs.


Gemini Omni Flash API on EvoLink


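Use Gemini Omni Flash through EvoLink's unified video API for text-to-video, image-to-video, reference-to-video, and video editing. Outside discussion often describes Gemini Omni as a "Nano Banana for video" because it brings multimodal video creation and conversational editing to short-video workflows. On EvoLink, this page focuses on practical API integration: the EvoLink model IDs, the asynchronous task workflow, callback support, visibility into token usage, and sharing one API key with other video models such as Veo, Seedance, and Kling.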

Billing Rules

• Gemini Omni Flash is billed by token usage. The task returns a credits_reserved estimate on creation and settles from the actual usage tokens once the task completes.
• Text input: counted from the prompt tokens.
• Video input: 5,792 tokens per second of input video.
• Video output: 5,792 tokens per second of 720p video (audio included).
• Video edit: the output follows the input video, so this mode does not accept duration or aspect_ratio.
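The billing rules above reduce to simple arithmetic. The following is a minimal sketch of a pre-bill estimate, using the per-1M-token rates quoted at the top of this page and the 5,792 tokens-per-second figure. The function name and argument names are illustrative, not part of the EvoLink API, and the 1,000 "other output" tokens in the usage line are the illustrative figure shown in the page's price estimator. Image input token counts are not stated on this page, so they are not modeled.

```python
# Rates quoted on this page, per 1M tokens: (USD, credits).
RATES_PER_MILLION = {
    "input": (1.275, 86.7),
    "video_output": (14.875, 1011.5),
    "other_output": (7.650, 520.2),
}
VIDEO_TOKENS_PER_SECOND = 5_792  # applies to input video and 720p output video


def estimate_cost(output_seconds, input_tokens=0, other_output_tokens=0,
                  input_video_seconds=0):
    """Return (usd, credits) for one task. This is a pre-bill estimate only;
    the final charge is settled from the usage tokens the API returns."""
    tokens = {
        "input": input_tokens + input_video_seconds * VIDEO_TOKENS_PER_SECOND,
        "video_output": output_seconds * VIDEO_TOKENS_PER_SECOND,
        "other_output": other_output_tokens,
    }
    usd = sum(tokens[k] * RATES_PER_MILLION[k][0] for k in tokens) / 1_000_000
    credits = sum(tokens[k] * RATES_PER_MILLION[k][1] for k in tokens) / 1_000_000
    return usd, credits


# A 10-second clip (what Auto is estimated as) with 1,000 other output tokens.
usd, credits = estimate_cost(output_seconds=10, other_output_tokens=1_000)
print(f"~${usd:.3f}, ~{credits:.3f} credits")  # ~$0.869, ~59.106 credits
```

For a video edit, pass the input clip length as input_video_seconds so the input video is counted at the input-token rate.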

Pricing

Model	Mode	Meter	Price
Text to Video	Output video	Video output tokens	$0.015 / 1K tokens (1.0115 Credits)
Text to Video	Input text / image / video	Input tokens	$0.0013 / 1K tokens (0.0867 Credits)
Text to Video	Thinking / text output	Other output tokens	$0.0077 / 1K tokens (0.5202 Credits)


If a route goes down, requests automatically fall back to the next cheapest available route, which keeps uptime at 99.9% at the best possible price.

EvoLink price estimate: gemini-omni-flash

Auto estimated as 10s.

Figures are pre-bill estimates. Actual charges follow the upstream usage tokens returned by the model.

Your estimate: ~$0.869 (59.106 credits)

Official: ~$1.023 (69.537 credits); EvoLink saves ~15%

Tokens per task: video output 57,920; text input 0; other output 1,000

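What can the Gemini Omni API do?

Conversational video editing

Generate a video with Gemini Omni, then refine it step by step in conversation: "make the lighting warmer", "replace the red car". This workflow targets iterative editing and, within the capabilities of the selected route, tries to preserve the surrounding scene, subject identity, and motion continuity.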


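Object replacement and scene rewriting

Replace objects in the frame, remove unwanted elements, or rewrite the scene while preserving identity and motion. Suited to ad creative iteration and product variant rendering without external editing tools.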


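Reference image workflow

Pass in a reference image and Gemini Omni anchors character identity, lighting, and color in the generated video. Combined with conversational editing, you can refine specific shots without breaking visual consistency.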


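Video generation with audio

Where the selected mode supports it, the Gemini Omni Flash route can return short video output with audio, which reduces the extra work of stitching in a TTS or sound-effects pipeline for the first pass.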


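Gemini Omni compared: one EvoLink API key for all models

The appeal of Gemini Omni is not just raw image quality but the workflow: multimodal input, conversational editing, and an integration path through EvoLink that lets you evaluate it side by side with Veo, Seedance, and Kling using the same API key.

Conversation-native editing workflow

Gemini Omni is positioned more toward conversational video editing, whereas Veo 3.1 and Seedance 2.0 are usually evaluated first as generation routes. For multi-turn refinement, this is the workflow difference to test most closely.

Long-context character consistency

Gemini Omni is thought to benefit from Gemini's context and world knowledge, which makes it a candidate for evaluating continuity in multi-input, edit-heavy workflows. Test it with your own storyboards or short-video prompts.

No Google Cloud project required: the same asynchronous pattern as Veo and Seedance

No GCP setup, no Vertex billing, and no separate regional approval. If you already run video generation through EvoLink, adopting Gemini Omni means changing one parameter: the request structure and task lifecycle are identical to Veo 3.1, Seedance 2.0, and Kling.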

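Gemini Omni vs Veo 3.1 vs Seedance 2.0: detailed comparison

The three models most often shortlisted for production video workflows in 2026, all accessible with one EvoLink API key.

Feature	Gemini Omni	Veo 3.1	Seedance 2.0
EvoLink price	Billed by token	From $0.50/s	From $0.092/s
Resolution	720p	720p / 1080p, with 4K upscale in some scenarios	480p / 720p / 1080p
Native audio	Supported	Supported	Supported
Reference control	Text + image + conversational editing	Text + image	Text + image + video + audio
Video duration	3-10 s / Auto	Short clips; extendable to longer shots with Extend where supported	4-15 s
Editing	Conversational editing workflow	Generation-focused	V2V mode
Best for	Short-video editing and multi-input workflows	Cinematic baseline	Multimodal reference production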


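How to integrate the Gemini Omni API

Complete your first Gemini Omni video task in three steps. The integration pattern is the same as for Veo 3.1, Seedance 2.0, and Kling 3.0.

Step 1: Get an API key

Sign up at EvoLink.ai and generate an API key in the console. No Google Cloud project is needed.

Step 2: Submit a generation task

Send a POST request to /v1/videos/generations, specifying a Gemini Omni Flash model name and a prompt. In generation modes, duration can be set to 3-10 seconds or Auto. Pass image_urls for image-to-video and reference-to-video, and video_urls for video edit. Optionally pass callback_url to receive a completion notification. The API processes the request asynchronously and returns a task_id.

Step 3: Retrieve the video

Poll the status endpoint with the task_id, or wait for the callback_url webhook. When the status becomes completed, you receive a download link for the generated MP4. The link is valid for 24 hours.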
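The polling half of this flow can be sketched as a small loop. This page does not give the path of the status endpoint, so the sketch takes the HTTP call as a caller-supplied function (fetch_status, a hypothetical name) rather than guessing a URL. The status values processing, completed, and failed are the ones that appear on this page; the interval and timeout defaults are arbitrary assumptions.

```python
import time


def wait_for_task(fetch_status, task_id, interval_s=5.0, timeout_s=600.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll until the task reaches a terminal status, then return the task dict.

    fetch_status(task_id) -> dict is supplied by the caller and should perform
    the authenticated GET against the task status endpoint from the API docs.
    """
    terminal = {"completed", "failed"}
    deadline = clock() + timeout_s
    while True:
        task = fetch_status(task_id)
        if task.get("status") in terminal:
            return task
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} is still {task.get('status')!r}")
        sleep(interval_s)


# Stand-in for the real HTTP call, so the sketch runs without network access.
_responses = iter([
    {"id": "task-video-xxxxxxxx", "status": "processing"},
    {"id": "task-video-xxxxxxxx", "status": "completed"},
])
result = wait_for_task(lambda _task_id: next(_responses), "task-video-xxxxxxxx",
                       sleep=lambda _seconds: None)
print(result["status"])
```

Because the result link is valid for 24 hours, a real integration would download and store the MP4 as soon as the loop returns a completed task.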

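Gemini Omni API capabilities overview

Technical specifications for production video workflows.

Editing

Conversational video editing

Multi-turn refinement in a conversational workflow; scene continuity depends on the selected route and input quality.

Output

720p clips, 3-10 seconds / Auto

Generation modes support 720p clips of 3-10 seconds or Auto; Auto is estimated at 10 seconds. Video edit mode accepts one MP4 input of up to 10 seconds.

Modes

Text-to-video and image-to-video

Supports T2V generation from text prompts and I2V with reference image input. Conversational editing applies to the output of both modes.

Audio

Video output with audio

Short video output can include audio when the selected Gemini Omni Flash route supports it.

Consistency

Long-context character consistency

Designed for continuity in multi-input, edit-heavy workflows; validate consistency with your own prompts before production use.

Workflow

Asynchronous API with task ID and callbacks

Submit a task to get an ID, then poll the status or configure callback_url. The lifecycle matches other EvoLink video models.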

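Cost examples: Gemini Omni pricing estimates

100 × 3-10s/Auto clips (social media batch generation)

Estimated at the current rates on the Pricing tab

1,000 × 3-10s/Auto clips per month (production scale)

Estimated at the current rates on the Pricing tab

1 generation + 3 edits (multi-turn workflow)

Estimated at the current rates on the Pricing tab

Refer to the current token prices on the Pricing tab above. Select a different workflow by switching the model parameter.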


Gemini Omni API FAQ

Everything you need to know about the product and billing.

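Gemini Omni is the multimodal video model family that Google announced at Google I/O 2026. Outside discussion describes Omni Flash as the short-video route that accepts text, image, video, and audio input. Compared with Veo 3.1, the more notable aspects of Gemini Omni are conversational editing and multi-input workflows; Veo remains a strong cinematic generation baseline.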

Charges are based on the usage tokens returned by the API, across three token dimensions: input, video output, and other output. See the pricing table above for current rates.

No Google Cloud project is required. EvoLink provides access through a single API key, with no Vertex billing and no separate regional approval. Authentication is identical to Veo 3.1 and Seedance 2.0 on EvoLink.

Four modes are currently supported: gemini-omni-flash-text-to-video, gemini-omni-flash-image-to-video, gemini-omni-flash-reference-to-video, and gemini-omni-flash-video-edit. They share the same asynchronous video API endpoint.
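Since only the model name and a few fields change between modes, request bodies can be built by one helper. The sketch below is an assumption-laden illustration: build_payload and the short mode keys are hypothetical names; treating image_urls and video_urls as mandatory for their modes is an assumption based on the integration steps; and the aspect_ratio values are the ones documented for text-to-video, assumed here to apply to the other generation modes as well.

```python
MODELS = {
    "text-to-video": "gemini-omni-flash-text-to-video",
    "image-to-video": "gemini-omni-flash-image-to-video",
    "reference-to-video": "gemini-omni-flash-reference-to-video",
    "video-edit": "gemini-omni-flash-video-edit",
}


def build_payload(mode, prompt, *, image_urls=None, video_urls=None,
                  duration=None, aspect_ratio=None, callback_url=None):
    """Build a JSON-serializable body for POST /v1/videos/generations."""
    if mode not in MODELS:
        raise ValueError(f"unknown mode: {mode!r}")
    payload = {"model": MODELS[mode], "prompt": prompt}

    if mode == "video-edit":
        # The output follows the input video, so these fields are not accepted.
        if duration is not None or aspect_ratio is not None:
            raise ValueError("video edit does not accept duration or aspect_ratio")
        if not video_urls:
            raise ValueError("video edit needs video_urls")
        payload["video_urls"] = list(video_urls)
    else:
        if mode in ("image-to-video", "reference-to-video"):
            if not image_urls:
                raise ValueError(f"{mode} needs image_urls")
            payload["image_urls"] = list(image_urls)
        if duration is not None:
            fixed = isinstance(duration, int) and not isinstance(duration, bool)
            if duration != "auto" and not (fixed and 3 <= duration <= 10):
                raise ValueError("duration must be an integer 3-10 or 'auto'")
            payload["duration"] = duration
        if aspect_ratio is not None:
            if aspect_ratio not in ("16:9", "9:16", "auto"):
                raise ValueError("aspect_ratio must be 16:9, 9:16, or auto")
            payload["aspect_ratio"] = aspect_ratio

    if callback_url is not None:
        if not callback_url.startswith("https://"):
            raise ValueError("callback_url must be HTTPS")
        payload["callback_url"] = callback_url
    return payload


print(build_payload("text-to-video", "A paper boat drifting down a stream",
                    duration="auto", aspect_ratio="16:9"))
```

Validating locally is optional; the API remains the authority on which fields each mode accepts.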

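Callbacks are supported. Pass callback_url (HTTPS) when submitting a task, and EvoLink can send a POST request to your endpoint when the task reaches a terminal state. Without a callback_url, you can poll the task status endpoint instead.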

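Failed tasks return a failed status together with an error reason. For application-level retries, inspect the error first, keep the original parameters for troubleshooting, and resubmit only after confirming that the cause was an input problem or a transient failure.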
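One way to apply this advice in application code is a bounded retry wrapper, sketched below. The shape of the error reason is not documented on this page, so the decision of what counts as retryable is left to a caller-supplied function (is_retryable, a hypothetical name), as is the function that submits a task and waits for its terminal state. The attempt limit and backoff values are arbitrary assumptions.

```python
import time


def run_with_retry(submit_and_wait, payload, is_retryable, max_attempts=3,
                   base_delay_s=2.0, sleep=time.sleep):
    """Submit the same payload until it succeeds, the failure is judged
    non-retryable, or max_attempts is reached.

    Returns (last_task, history) so every failed attempt stays available
    for troubleshooting alongside the unchanged original payload.
    """
    history = []
    for attempt in range(1, max_attempts + 1):
        task = submit_and_wait(payload)
        history.append(task)
        if task.get("status") != "failed":
            break
        # Inspect the failure before resubmitting anything.
        if attempt == max_attempts or not is_retryable(task):
            break
        sleep(base_delay_s * 2 ** (attempt - 1))  # exponential backoff
    return history[-1], history
```

Failures caused by the input itself should be fixed and submitted as a new request rather than retried unchanged, so is_retryable would normally return True only for transient failures.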

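Conversational editing is supported, and it is one of Gemini Omni's main workflow differences. When you use natural-language editing instructions, verify how well the selected route preserves the surrounding scene, subject identity, and motion continuity across multiple iterations.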

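Generation modes support clips of 3-10 seconds or Auto; the Auto reservation estimate is calculated at 10 seconds. Video edit mode accepts one MP4 input of up to 10 seconds. For longer narratives, you can chain multiple clips together and rely on long-context character consistency.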

Reference images are supported. Pass a reference image URL and Gemini Omni uses it as the identity anchor for the generated video.

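Seedance 2.0 has strong benchmark and multimodal reference signals, while Veo 3.1 remains a strong cinematic generation baseline with advantages in Flow and extension workflows. Gemini Omni's differentiation is that developers are evaluating it for conversational editing, multi-input generation, and short-video iteration.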

Yes. A single EvoLink API key gives unified access to Gemini Omni, Veo 3.1, Nano Banana 2, and the full Gemini model family. Switching models only requires changing the model parameter.

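All Gemini video API models

EvoLink provides unified access to Google's video and media model families through a single API key. All models share the same EvoLink API endpoint, and switching models only requires changing one parameter.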


API Reference


Authentication

All APIs require Bearer Token authentication.

Header

Authorization: Bearer YOUR_API_KEY
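As a minimal sketch of what an authenticated call looks like, the following builds the POST request with Python's standard library. The base URL is a placeholder, not the real EvoLink host (take the actual one from the EvoLink docs or console), and the Content-Type header is an assumption that follows from sending a JSON body. build_request is a hypothetical helper name.

```python
import json
import urllib.request

# Placeholder host: substitute the real EvoLink API base URL.
BASE_URL = "https://api.example.com"


def build_request(api_key, payload, base_url=BASE_URL):
    """Return an unsent POST request carrying the Bearer token header."""
    return urllib.request.Request(
        url=f"{base_url}/v1/videos/generations",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumed for a JSON body
        },
        method="POST",
    )


req = build_request("YOUR_API_KEY", {
    "model": "gemini-omni-flash-text-to-video",
    "prompt": "Create a cinematic product video",
})
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req) returns the JSON task object.
```

Keeping the key out of source code, for example in an environment variable, is the usual practice for any Bearer token.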

POST /v1/videos/generations

Create Gemini Omni Flash Video Task

Text to Video uses the unified EvoLink video generation endpoint. Select the mode by changing the model parameter.

Asynchronous processing returns a task ID. Use it to poll the task status, or provide callback_url for completion notifications.

Generated outputs should be stored in your own system when result URLs are time-limited.

Request Parameters

model (string, required; default: gemini-omni-flash-text-to-video)

Gemini Omni Flash model name. Fixed to gemini-omni-flash-text-to-video for text-to-video generation.

Example: gemini-omni-flash-text-to-video

prompt (string, required)

Natural-language instruction describing the requested video.

Example: Create a cinematic product video with smooth camera motion and natural audio ambience

aspect_ratio (string, optional; default: 16:9)

Output aspect ratio. Use auto to let the provider choose.

Value	Description
16:9	Landscape video
9:16	Portrait video
auto	Let the provider choose the output ratio

Example: 16:9

duration (integer or string, optional; default: 10 if omitted)

Output video duration in seconds. The Playground sends auto by default.

Value	Description
3-10	Any integer from 3 to 10 seconds. If omitted, the API default is 10 seconds.
auto	Let the provider decide the output duration. Playground sends auto by default and estimates it as 10 seconds.

Notes

Use auto to let the model decide the duration; reservations estimate auto as 10 seconds.
Affects the estimated reservation; completed tasks are billed from API usage tokens.

Example: auto

callback_url (string, optional)

HTTPS URL that receives a callback when the task completes.

Notes

Use polling if no callback_url is provided.
Store outputs promptly when result URLs are time-limited.

Example: https://your-domain.com/webhooks/video-task-completed
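A callback receiver can be as small as one POST handler. The sketch below uses Python's standard http.server and rests on assumptions: this page does not document the callback body, so the code assumes it is a JSON object with the same id and status fields as the task object in the response example. The handler serves plain HTTP and is assumed to sit behind a TLS-terminating proxy, since callback_url must be HTTPS. All names here are illustrative.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_callback(body):
    """Extract (task_id, status) from a callback body.

    Assumes the body mirrors the task object (fields "id" and "status").
    """
    task = json.loads(body.decode("utf-8"))
    return task.get("id"), task.get("status")


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            task_id, status = parse_callback(self.rfile.read(length))
        except (ValueError, AttributeError):
            # Not valid UTF-8 JSON, or not a JSON object.
            self.send_response(400)
            self.end_headers()
            return
        # Hand off to your own storage or queue here; respond quickly.
        print(f"task {task_id} reached status {status}")
        self.send_response(200)
        self.end_headers()


def serve(port=8080):
    """Blocking call; run it in the process that hosts your webhook."""
    HTTPServer(("0.0.0.0", port), CallbackHandler).serve_forever()
```

Calling serve() starts the listener. On a completed status, the handler is the natural place to trigger downloading the result before its time-limited URL expires.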

Request Example

{
  "model": "gemini-omni-flash-text-to-video",
  "prompt": "Create a cinematic product video with smooth camera motion and natural audio ambience",
  "aspect_ratio": "16:9",
  "duration": "auto",
  "callback_url": "https://your-domain.com/webhooks/video-task-completed"
}

Response Example

{
  "id": "task-video-xxxxxxxx",
  "model": "gemini-omni-flash-text-to-video",
  "object": "video.generation.task",
  "status": "processing",
  "progress": 0,
  "task_info": {
    "estimated_time": 60,
    "can_cancel": false,
    "video_duration": 10
  },
  "usage": {
    "credits_reserved": 59.1089,
    "billing_rule": "per_token"
  },
  "type": "video",
  "created": 1782940800
}
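The creation response can be reduced to the few fields a client typically tracks. This is a small sketch using the field names from the response example above; the CreatedTask type is a hypothetical convenience, and treating estimated_time as seconds is an assumption, since the unit is not stated.

```python
import json
from dataclasses import dataclass


@dataclass
class CreatedTask:
    task_id: str
    status: str
    credits_reserved: float
    estimated_time: int  # assumed to be seconds; the unit is not documented


def parse_created_task(data):
    """Pick the tracking fields out of a task-creation response dict."""
    usage = data.get("usage", {})
    info = data.get("task_info", {})
    return CreatedTask(
        task_id=data["id"],
        status=data["status"],
        credits_reserved=float(usage.get("credits_reserved", 0.0)),
        estimated_time=int(info.get("estimated_time", 0)),
    )


sample = json.loads("""
{
  "id": "task-video-xxxxxxxx",
  "status": "processing",
  "task_info": {"estimated_time": 60, "can_cancel": false, "video_duration": 10},
  "usage": {"credits_reserved": 59.1089, "billing_rule": "per_token"}
}
""")
task = parse_created_task(sample)
print(task.task_id, task.status, task.credits_reserved)
```

credits_reserved is only the reservation made at creation; the settled charge comes from the usage tokens reported once the task completes.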

Billing Rules

Gemini Omni Flash is billed by token usage. The task returns a credits_reserved estimate on creation and settles from the actual usage tokens once the task completes. Token counts per material:

Text input — counted from the prompt tokens.
Video output — 5,792 tokens per second of 720p video (audio included).
Duration only affects the reservation estimate; Auto is estimated as 10 seconds.