openbmb

minicpm-o2.6

A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

26.5K Pulls 13 Tags Updated 11 months ago

A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone

17.2K Pulls 11 Tags Updated 8 months ago

A Gemini 2.5 Flash Level MLLM for Vision, Speech, and Full-Duplex Multimodal Live Streaming on Your Phone

6,669 Pulls 12 Tags Updated 3 months ago

A Pocket-Sized MLLM for Ultra-Efficient Image and Video Understanding on Your Phone

3,246 Pulls 12 Tags Updated 6 days ago

A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

2,359 Pulls 12 Tags Updated 11 months ago

A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

1,725 Pulls 12 Tags Updated 9 months ago

Highly efficient large language models (LLMs) designed explicitly for end-side devices

1,237 Pulls 1 Tag Updated 8 months ago

A GPT-4V Level Multimodal LLM on Your Phone

391 Pulls 13 Tags Updated 11 months ago