
Datetime coming from response headers (issue #179)

@SamComber

I noticed that htmldate's find_date function internally relies on examine_header.

Does it make sense to parse the response header from the server? Do servers typically default this to the current date?

Here's an example where the extracted date is '2024-12-02':

from htmldate import find_date

find_date(
    "https://octopus.energy/blog/agile-octopus-bigger-story/",
    original_date=True,
    extensive_search=True,
)

But the publication date shown on the page is different:

[image: screenshot of the publication date on the page]

If I comment out the examine_header lines, the correct date (2022-12-13) is extracted during the `# last resort` step.
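
On the question of whether servers default the header to the current date: the HTTP `Date` response header records when the response was generated, so it normally reflects the fetch time rather than the publication date. The check below is a minimal sketch of that comparison using only the standard library. The header value, the fetch timestamp, and the helper name `header_date_matches_fetch_day` are hypothetical, and nothing here reproduces what htmldate does internally.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def header_date_matches_fetch_day(date_header: str, fetched_at: datetime) -> bool:
    """Return True if an HTTP Date header falls on the same UTC day as the fetch."""
    header_dt = parsedate_to_datetime(date_header)
    header_day = header_dt.astimezone(timezone.utc).date()
    fetch_day = fetched_at.astimezone(timezone.utc).date()
    return header_day == fetch_day


# Hypothetical values for illustration only.
sample_header = "Mon, 02 Dec 2024 10:15:00 GMT"
fetched_at = datetime(2024, 12, 2, 10, 15, 3, tzinfo=timezone.utc)

if header_date_matches_fetch_day(sample_header, fetched_at):
    print("Header date equals the fetch day; it is unlikely to be a publication date.")
```

If a date returned by an extractor matches the day the page was fetched, that is a hint it may come from a freshly generated value rather than from the article itself.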

Labels: question (Further information is requested)
