Parent
#15
What to build
Introduce Extractor as a plain struct in src/runner/extract.rs. Move link extraction and asset classification logic out of crawler.rs (and/or wrap the existing src/extract/ module) behind this single seam. Interface: fn extract(&self, response: &FetchResponse) -> ExtractedContent where ExtractedContent carries Vec<DiscoveredUrl> and asset classifications.
No trait — one impl, pure logic. Crawler::process_job calls Extractor directly after a successful fetch.
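A minimal sketch of the seam described above, not the actual implementation. The issue names FetchResponse, ExtractedContent and DiscoveredUrl but not their fields, so the field layouts here are assumptions, and the AssetKind enum and classify helper are hypothetical. The naive attribute scan only stands in for the real parsing in src/extract/: it does not resolve relative URLs and ignores base href and srcset.

```rust
// Hypothetical stand-ins: the issue names these types but not their fields.
pub struct FetchResponse {
    pub url: String,  // would serve as the base for resolving relative links (unused here)
    pub body: String, // raw HTML
}

// Assumed classification buckets; the issue only says "asset classifications".
#[derive(Debug, Clone, PartialEq)]
pub enum AssetKind {
    Page,
    Image,
    Stylesheet,
    Script,
}

#[derive(Debug, Clone, PartialEq)]
pub struct DiscoveredUrl {
    pub raw: String, // attribute value as written, not resolved
    pub kind: AssetKind,
}

#[derive(Debug, Default)]
pub struct ExtractedContent {
    pub urls: Vec<DiscoveredUrl>,
}

/// Plain struct, no trait: one implementation, pure logic.
#[derive(Debug, Default)]
pub struct Extractor;

impl Extractor {
    pub fn extract(&self, response: &FetchResponse) -> ExtractedContent {
        let mut urls = Vec::new();
        // Toy scan: collect every double-quoted href value, then every src value.
        for attr in ["href=\"", "src=\""] {
            let mut rest = response.body.as_str();
            while let Some(start) = rest.find(attr) {
                let after = &rest[start + attr.len()..];
                let Some(end) = after.find('"') else { break };
                let raw = &after[..end];
                if !raw.is_empty() {
                    urls.push(DiscoveredUrl { raw: raw.to_string(), kind: classify(raw) });
                }
                rest = &after[end + 1..];
            }
        }
        ExtractedContent { urls }
    }
}

// Classify by file extension after dropping any query string or fragment.
fn classify(raw: &str) -> AssetKind {
    let path = raw.split(|c: char| c == '?' || c == '#').next().unwrap_or(raw);
    let ext = path.rsplit('.').next().unwrap_or("").to_ascii_lowercase();
    match ext.as_str() {
        "png" | "jpg" | "jpeg" | "gif" | "webp" | "svg" => AssetKind::Image,
        "css" => AssetKind::Stylesheet,
        "js" => AssetKind::Script,
        _ => AssetKind::Page,
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn extracts_and_classifies_links() {
        let response = FetchResponse {
            url: "https://example.com/".to_string(),
            body: r#"<a href="/about">About</a><img src="logo.png">"#.to_string(),
        };
        let content = Extractor.extract(&response);
        assert_eq!(content.urls.len(), 2);
        assert_eq!(content.urls[0].kind, AssetKind::Page);
        assert_eq!(content.urls[1].kind, AssetKind::Image);
    }
}
```

Because Extractor holds no state and does no I/O, Crawler::process_job could call it directly on the fetched response, and the tests need only inline HTML strings as fixtures.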
Acceptance criteria
- Extractor struct in src/runner/extract.rs
- […] Extractor
- Crawler::process_job calls Extractor instead of inline helpers
- […] crawler.rs removed if fully subsumed
- #[cfg(test)] mod tests with HTML fixtures: relative URLs, base href, srcset, link rels
- cargo test --all-features green
Blocked by