Skip to content

Refactor AutoThrottle and deprecate the download_delay spider attr (#7167)#7175

Open
DenisC96 wants to merge 5 commits into
scrapy:masterfrom
DenisC96:issue-7167
Open

Refactor AutoThrottle and deprecate the download_delay spider attr (#7167)#7175
DenisC96 wants to merge 5 commits into
scrapy:masterfrom
DenisC96:issue-7167

Conversation

@DenisC96

@DenisC96 DenisC96 commented Dec 13, 2025

Copy link
Copy Markdown
Contributor

Resolve #7167

The following changes have been made:

  1. Updated scrapy/extensions/throttle.py and scrapy/core/downloader/__init__.py to deprecate download_delay spider attr, and added deprecation warning
  2. Updated tests in tests/test_extension_throttle.py to deprecate download_delay spider attr

Please let me know if there are any comments. Thank you.

@codecov

codecov Bot commented Dec 13, 2025

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 75.00000% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 91.49%. Comparing base (d8583a8) to head (10f4cd3).
⚠️ Report is 86 commits behind head on master.

Files with missing lines Patch % Lines
scrapy/core/downloader/__init__.py 50.00% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #7175   +/-   ##
=======================================
  Coverage   91.48%   91.49%           
=======================================
  Files         165      165           
  Lines       12666    12671    +5     
  Branches     1621     1624    +3     
=======================================
+ Hits        11588    11593    +5     
  Misses        808      808           
  Partials      270      270           
Files with missing lines Coverage Δ
scrapy/extensions/throttle.py 100.00% <100.00%> (ø)
scrapy/core/downloader/__init__.py 95.36% <50.00%> (+0.03%) ⬆️

... and 4 files with indirect coverage changes

@DenisC96 DenisC96 changed the title Refactor AutoThrottle and deprecate the download_delay spider attr (#7167 - Draft) Refactor AutoThrottle and deprecate the download_delay spider attr (#7167) Dec 15, 2025
…y/core/downloader/__init__.py and tests/test_extension_throttle.py
@DenisC96 DenisC96 marked this pull request as ready for review December 15, 2025 20:20
@wRAR

wRAR commented Dec 16, 2025

Copy link
Copy Markdown
Member

Can you explain how does the AutoThrottle extension influence the downloader after your changes?

Comment on lines +100 to +101
if settings.getbool("AUTOTHROTTLE_ENABLED"):
delay = max(delay, settings.getfloat("AUTOTHROTTLE_START_DELAY"))

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original AutoThrottle class calls _start_delay() in _spider_opened() to assign the initial delay to spider.download_delay, which was then assigned to the slot's initial delay in scrapy.core.downloader._get_concurrency_delay().

In my updated codes, the logic in _start_delay() is handled inside scrapy.core.downloader._get_concurrency_delay() using the settings (as shown above), so the slot's initial delay can be set without calling _start_delay() (can delete this function and relevant test?), and therefore deprecate the use of spider.download_delay attr.

Other parts in AutoThrottle such as _response_downloaded and _adjust_delay() remain unchanged so delay adjustment should not be affected. In short, I moved the logic of _start_delay() in AutoThrottle into scrapy.core.downloader._get_concurrency_delay() to get rid of the use of spider.download_delay attr, and _start_delay() is no longer used.

I'm new to this project, please let me know if I misunderstood the flow or if you prefer other ways of changing. Thank you so much!

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, so this moves a small part of the logic from the extension to the core.

I don't think I like it but I also don't know if it's possible to do it in a less invasive way in the current state. There was a non-specific suggestion to try tackling this after #6660, though it's not guaranteed it will be easier then :)

So I'll leave this open in case someone has better ideas.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will see how it goes after #6660 is resolved or if anyone has a better approach. Thank you for your comments!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Deprecate the download_delay spider attr after refactoring AutoThrottle

2 participants