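Imagedl: Search and download images from specific websites. (A lightweight image search and download tool that supports Google, Baidu, Bing, 360, Pixabay, Yandex, Sogou, Yahoo, DuckDuckGo, Unsplash, and other major platforms, making it easy to build training and test sets for large models.)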

CharlesPikachu/imagedl

📚 Documents: imagedl.readthedocs.io

🧪 Online API Health & Demo: charlespikachu.github.io/imagedl
Automatically runs daily checks on all registered imagedl modules (search + download) via GitHub Actions and visualizes the latest results on this page.

For more interesting content, follow the WeChat official account: Charles的皮卡丘

🆕 What's New

  • 2025-12-11: Released pyimagedl v0.2.4, which adds searching for and downloading images from Unsplash, along with some minor improvements.
  • 2025-12-07: Released pyimagedl v0.2.3, which adds searching and downloading via the Yahoo image search engine and tunes some of the default arguments.
  • 2025-11-19: Released pyimagedl v0.2.2, which fixed potential in-place modification bugs in HTTP requests.
  • 2025-11-16: Released pyimagedl v0.2.1, which fixed some minor bugs in duckduckgo and BaseImageClient.
  • 2025-11-16: Released pyimagedl v0.2.0, which upgraded ImageClient and fixed some minor bugs.
  • 2025-11-10: Released pyimagedl v0.1.8, which fixed logging and requirements.

📘 Introduction

Imagedl lets you search for and download images from specific websites. If you find it useful, please consider starring the repository to follow updates—thank you for your support!

🖼️ Supported Image Clients

ImageClient (EN) ImageClient (CN) Search Download Code Snippet
BaiduImageClient 百度图片 baidu.py
BingImageClient 必应图片 bing.py
GoogleImageClient 谷歌图片 google.py
I360ImageClient 360图片 i360.py
PixabayImageClient Pixabay图片 pixabay.py
YandexImageClient Yandex图片 yandex.py
DuckduckgoImageClient DuckDuckGo图片 duckduckgo.py
SogouImageClient 搜狗图片 sogou.py
YahooImageClient 雅虎图片 yahoo.py
UnsplashImageClient Unsplash图片 unsplash.py

📦 Install

Choose one of the following three installation methods:

# from pip
pip install pyimagedl
# from github repo method-1
pip install git+https://github.com/CharlesPikachu/imagedl.git@main
# from github repo method-2
git clone https://github.com/CharlesPikachu/imagedl.git
cd imagedl
python setup.py install
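
To confirm which version ended up installed, the following is a minimal sketch using the standard library's importlib.metadata. It assumes the distribution name is pyimagedl, as in the pip command above; the helper name installed_version is ours and not part of imagedl.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str = "pyimagedl"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string, or None when the package is not installed.
print(installed_version())
```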

⚡ Quick Start

After installing imagedl, a few lines of code are enough to get started:

from imagedl import imagedl

image_client = imagedl.ImageClient(image_source='BaiduImageClient')
image_client.startcmdui()

Here, image_source specifies the image search and download engine. Equivalently, run imagedl -i "BaiduImageClient" in a terminal. Running imagedl --help prints the basic usage of the command-line tool:

Usage: imagedl [OPTIONS]

Options:
  --version                       Show the version and exit.
  -k, --keyword TEXT              The keywords for the image search. If left
                                  empty, an interactive terminal will open
                                  automatically.
  -i, --image-source, --image_source [bingimageclient|baiduimageclient|googleimageclient|i360imageclient|pixabayimageclient|yandeximageclient|duckduckgoimageclient|sogouimageclient|yahooimageclient|unsplashimageclient]
                                  The image search and download source.
                                  [default: BaiduImageClient]
  -s, --search-limits, --search_limits INTEGER RANGE
                                  Scale of image downloads.  [default: 1000;
                                  1<=x<=100000000.0]
  -n, --num-threadings, --num_threadings INTEGER RANGE
                                  Number of threads used.  [default: 5;
                                  1<=x<=256]
  -c, --init-image-client-cfg, --init_image_client_cfg TEXT
                                  Client config such as `work_dir` as a JSON
                                  string.
  -r, --request-overrides, --request_overrides TEXT
                                  Requests.get (or Requests.post) kwargs such
                                  as `headers` and `proxies` as a JSON string.
  --help                          Show this message and exit.
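
Because -c and -r expect JSON strings, quoting them by hand in a shell is easy to get wrong. The sketch below builds the argument list in Python using only the short options shown in the help text above. The helper name build_imagedl_command and the example values (keyword, User-Agent string) are illustrative assumptions, not part of imagedl.

```python
import json
import shlex


def build_imagedl_command(keyword, image_source="BaiduImageClient", search_limits=10,
                          num_threadings=1, client_cfg=None, request_overrides=None):
    """Build an argv list for the imagedl CLI from the options listed in --help."""
    argv = ["imagedl", "-k", keyword, "-i", image_source,
            "-s", str(search_limits), "-n", str(num_threadings)]
    if client_cfg:
        # -c takes the client config as a JSON string
        argv += ["-c", json.dumps(client_cfg)]
    if request_overrides:
        # -r takes requests kwargs (headers, proxies, ...) as a JSON string
        argv += ["-r", json.dumps(request_overrides)]
    return argv


argv = build_imagedl_command(
    "cute dogs",
    client_cfg={"work_dir": "images"},
    request_overrides={"headers": {"User-Agent": "Mozilla/5.0"}},
)
# Show a POSIX-shell-quoted form of the command for copy and paste.
print(shlex.join(argv))
```

Passing the list directly to subprocess.run(argv) avoids shell quoting altogether, which may be the safer route on Windows, where shlex-style quoting does not apply.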

The imagedl.ImageClient class accepts the following arguments:

  • image_source (str, default: 'BaiduImageClient'): The image search and download source, including ['BaiduImageClient', 'BingImageClient', 'GoogleImageClient', 'I360ImageClient', 'PixabayImageClient', 'YandexImageClient', 'DuckduckgoImageClient', 'SogouImageClient', 'YahooImageClient', 'UnsplashImageClient'].
  • init_image_client_cfg (dict, default: {}): Client initialization configuration such as {'work_dir': 'images', 'max_retries': 5}.
  • search_limits (int, default: 1000): Scale of image downloads.
  • num_threadings (int, default: 5): Number of threads used.
  • request_overrides (dict, default: {}): requests.get (or requests.post) kwargs such as {'headers': {'User-Agent': xxx}, 'proxies': {}}.
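
As a rough illustration of how these arguments fit together, the sketch below assembles them into one dictionary and only imports imagedl when a client is actually created. The helper names make_client_kwargs and create_client, and the concrete values (100 images, the User-Agent string), are assumptions for illustration. Only the argument names come from the list above.

```python
def make_client_kwargs(work_dir="images", max_retries=5, user_agent=None, proxies=None):
    """Assemble keyword arguments for imagedl.ImageClient from the documented options."""
    request_overrides = {}
    if user_agent:
        request_overrides["headers"] = {"User-Agent": user_agent}
    if proxies:
        request_overrides["proxies"] = proxies
    return {
        "image_source": "BaiduImageClient",
        "init_image_client_cfg": {"work_dir": work_dir, "max_retries": max_retries},
        "search_limits": 100,
        "num_threadings": 5,
        "request_overrides": request_overrides,
    }


def create_client(kwargs):
    """Create the client; imagedl is imported lazily so the helper above runs without it."""
    from imagedl import imagedl
    return imagedl.ImageClient(**kwargs)


print(make_client_kwargs(user_agent="Mozilla/5.0"))
```

With imagedl installed, create_client(make_client_kwargs()) would return a client configured this way.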

To run an image search only, without downloading anything:

from imagedl import imagedl

image_client = imagedl.ImageClient(image_source='DuckduckgoImageClient', search_limits=1000, num_threadings=5)
image_infos = image_client.search('cut animals', search_limits_overrides=10, num_threadings_overrides=1)
print(image_infos)

In the code above, search_limits_overrides overrides the search_limits value set when imagedl.ImageClient was initialized, and num_threadings_overrides does the same for num_threadings. The output looks like this:

[
    {
        "candidate_urls": [
            "https://img.freepik.com/.../cut-animal-cartoon-bundle-set_508290-2349.jpg",
            "https://tse2.mm.bing.net/th/id/OIP.vD-8G0MjAMREv1bYbKaqEwHaHa..."
        ],
        "raw_data": {
            "height": 626,
            "width": 626,
            "image": "https://img.freepik.com/.../cut-animal-cartoon-bundle-set_508290-2349.jpg",
            "image_token": "fbff471d31328...",
            "thumbnail": "https://tse2.mm.bing.net/th/id/OIP.vD-8G0MjAMREv1bYbKaqEwHaHa...",
            "thumbnail_token": "4ca07ad2aab9...",
            "source": "Bing",
            "title": "Premium Vector | Cut animal cartoon bundle set",
            "url": "https://www.freepik.com/premium-vector/cut-animal-cartoon-bundle-set_25750969.htm"
        },
        "identifier": "fbff471d31328...",
        "work_dir": "imagedl_outputs\\DuckduckgoImageClient\\2025-11-16-22-34-25 cutanimals",
        "file_path": "imagedl_outputs\\DuckduckgoImageClient\\2025-11-16-22-34-25 cutanimals\\00000001"
    },
    ...
]
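
Since the search result is a plain list of dictionaries, it can be filtered before downloading. The sketch below keeps only entries whose raw_data reports a minimum size. It assumes raw_data carries integer width and height fields, as in the DuckDuckGo sample above; other sources may use different keys. The function name pick_large_images is ours.

```python
def pick_large_images(image_infos, min_width=500, min_height=500):
    """Keep entries whose raw_data reports at least the given width and height."""
    selected = []
    for info in image_infos:
        raw = info.get("raw_data") or {}
        width, height = raw.get("width"), raw.get("height")
        # Entries without numeric size information are skipped rather than guessed.
        if isinstance(width, int) and isinstance(height, int):
            if width >= min_width and height >= min_height:
                selected.append(info)
    return selected


# Trimmed copy of the sample record above (elided values are left elided).
sample = [{
    "candidate_urls": [
        "https://img.freepik.com/.../cut-animal-cartoon-bundle-set_508290-2349.jpg",
    ],
    "raw_data": {"height": 626, "width": 626, "source": "Bing"},
    "identifier": "fbff471d31328...",
}]

for info in pick_large_images(sample):
    print(info["identifier"], info["candidate_urls"][0])
```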

You can then pass the search results to the download function to fetch the images:

from imagedl import imagedl

image_client = imagedl.ImageClient(image_source='DuckduckgoImageClient', search_limits=1000, num_threadings=5)
image_infos = image_client.search('cut animals', search_limits_overrides=10, num_threadings_overrides=1)
image_client.download(image_infos=image_infos)

If you prefer not to use the unified interface, you can import a specific image client directly:

from imagedl.modules.sources import (
    BingImageClient, I360ImageClient, YahooImageClient, BaiduImageClient, SogouImageClient, GoogleImageClient, YandexImageClient, PixabayImageClient, DuckduckgoImageClient, UnsplashImageClient
)


# bing tests
client = BingImageClient()
image_infos = client.search('Cute Dogs', search_limits=10, num_threadings=1)
client.download(image_infos, num_threadings=1)
# 360 tests
client = I360ImageClient()
image_infos = client.search('Cute Dogs', search_limits=10, num_threadings=1)
client.download(image_infos, num_threadings=1)
# baidu tests
client = BaiduImageClient()
image_infos = client.search('Cute Dogs', search_limits=10, num_threadings=1)
client.download(image_infos, num_threadings=1)
# sogou tests
client = SogouImageClient()
image_infos = client.search('Cute Dogs', search_limits=10, num_threadings=1)
client.download(image_infos, num_threadings=1)
# google tests
client = GoogleImageClient()
image_infos = client.search('Cute Dogs', search_limits=10, num_threadings=1)
client.download(image_infos, num_threadings=1)
# yandex tests
client = YandexImageClient()
image_infos = client.search('Cute Dogs', search_limits=10, num_threadings=1)
client.download(image_infos, num_threadings=1)
# pixabay tests
client = PixabayImageClient()
image_infos = client.search('Cute Dogs', search_limits=10, num_threadings=1)
client.download(image_infos, num_threadings=1)
# duckduckgo tests
client = DuckduckgoImageClient()
image_infos = client.search('Cute Dogs', search_limits=10, num_threadings=1)
client.download(image_infos, num_threadings=1)
# yahoo tests
client = YahooImageClient()
image_infos = client.search('Cute Dogs', search_limits=10, num_threadings=1)
client.download(image_infos, num_threadings=1)
# unsplash tests
client = UnsplashImageClient()
image_infos = client.search('Cute Dogs', search_limits=10, num_threadings=1)
client.download(image_infos, num_threadings=1)
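
The per-client blocks above all follow the same pattern, so they can be driven by a loop that records a failure for one source instead of aborting the rest. This is a sketch only: run_clients is our helper name, and FakeImageClient is a hypothetical stand-in that lets the example run offline. In real use you would pass the imported classes such as BingImageClient.

```python
def run_clients(client_classes, keyword, search_limits=10, num_threadings=1):
    """Search and download with each client class, recording failures instead of stopping."""
    results = {}
    for client_cls in client_classes:
        name = client_cls.__name__
        try:
            client = client_cls()
            image_infos = client.search(keyword, search_limits=search_limits, num_threadings=num_threadings)
            client.download(image_infos, num_threadings=num_threadings)
            results[name] = len(image_infos)
        except Exception as exc:  # a failing source should not abort the remaining ones
            results[name] = exc
    return results


class FakeImageClient:
    """Hypothetical stand-in so the sketch runs offline; not part of imagedl."""

    def search(self, keyword, search_limits, num_threadings):
        return [{"identifier": f"{keyword}-{i}"} for i in range(search_limits)]

    def download(self, image_infos, num_threadings):
        pass


print(run_clients([FakeImageClient], "Cute Dogs", search_limits=3))
```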

💡 Recommended Projects

  • 🎵 Musicdl: Lightweight lossless music downloader
  • 🎬 Videodl: Lightweight HD, watermark-free video downloader
  • 🖼️ Imagedl: Lightweight large-scale image search and download tool
  • 🌐 FreeProxy: Collector of high-quality free proxies from around the world
  • 🌐 MusicSquare: Simple web page for searching, downloading, and playing music
  • 🌐 FreeGPTHub: A truly free unified interface for GPT

📚 Citation

If you use this project in your research, please cite the repository.

@misc{imagedl2022,
    author = {Zhenchao Jin},
    title = {Imagedl: Search and download images from specific websites},
    year = {2022},
    publisher = {GitHub},
    journal = {GitHub repository},
    howpublished = {\url{https://github.com/CharlesPikachu/imagedl/}},
}

📱 WeChat Official Account (微信公众号):

Charles的皮卡丘 (Charles_pikachu)
