Skip to content

[5.x]: Find & Replace skips ~50% of matching records #17974

@ul8

Description

@ul8

What happened?

Description

The Find & Replace utility in Craft CMS skips approximately half of the matching records during batch processing. The job reports success but leaves many records unchanged.

Steps to reproduce

  1. Create content with 200+ instances of a search term
  2. Run Find & Replace
  3. Execute the queue
  4. Verify results: SELECT COUNT(*) FROM elements_sites WHERE content LIKE '%OldTerm%'

Expected behavior

0 remaining occurrences

Actual behavior

Significant number of occurrences remain (typically 40-50%)

The bug

The bug is caused by offset pagination while modifying the dataset. Here's the sequence:

  1. Initial query finds 1,000 matching records
  2. Batch 1 (OFFSET 0, LIMIT 100): Processes records 1-100, replaces ~60 of them (due to case sensitivity in this example)
  3. Those 60 records no longer match the query after replacement
  4. Batch 2 (OFFSET 100, LIMIT 100): Query now returns only ~940 matching records
  5. OFFSET 100 skips the first 100 results, which now only includes ~40 from Batch 1
  6. Records that were originally at positions 101-160 get skipped entirely

This compounds with each batch, resulting in ~50% of records being silently skipped.

The issue is in src/queue/jobs/FindAndReplace.php:

class FindAndReplace extends BaseBatchedJob
{
    protected function loadData(): Batchable
    {
        return new QueryBatcher(
            (new Query())
                ->select(['id', 'title', 'content'])
                ->from(Table::ELEMENTS_SITES)
                ->where($where)  // WHERE content LIKE '%OldTerm%'
        );
    }
}

QueryBatcher uses offset-based pagination (->offset(100)->limit(100)), which causes the "shifting records" problem when the result set changes between batches.

Workaround

I patched the job to disable the batching, works flawlessly then. I could submit a PR but I imagine you would probably want to properly fix the batching. Although I would say that is maybe not necessary - these kinds of operation run extremely fast I don't see the point.

Craft CMS version

5.7.10 (tested) / 5.8.18 (looks like still present)

PHP version

8.3

Operating system and version

No response

Database type and version

No response

Image driver and version

No response

Installed plugins and versions

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions