Update Germany holidays: add school holidays support#3533
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds Germany Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. 👉 Get your free trial and get 200 agent minutes per Slack user (a $50 value). Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
tests/countries/test_germany.py (2)
40-50: 🛠️ Refactor suggestion | 🟠 MajorExercise
SCHOOLwith a real subdivision intest_no_holidays.The new
SCHOOLchecks only instantiate Germany withoutsubdiv, which is always empty anyway. That misses the real boundary here:Germany(subdiv=..., years=self.start_year - 1, categories=SCHOOL).Based on learnings: In the holidays project, the
test_no_holidaysmethod should test ALL supported_categories (viacategories=CountryClass.supported_categories) rather than just the default PUBLIC category, to ensure comprehensive validation that no holidays exist before start_year across any supported category including OPTIONAL and subdivision categories.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/countries/test_germany.py` around lines 40 - 50, The test_no_holidays currently only checks SCHOOL without a subdivision and misses categories beyond PUBLIC; update test_no_holidays to iterate all supported categories using Germany.supported_categories and for each subdiv in Germany.subdivisions instantiate Germany(subdiv=subdiv, years=self.start_year - 1, categories=category) (and for categories that aren't subdiv-dependent also test without subdiv) so you validate that no holidays exist before start_year across all supported categories; ensure the existing specific checks (e.g., years=1989 and years=2025 for SCHOOL) remain or are adapted to use Germany.supported_categories where appropriate.
448-549: 🛠️ Refactor suggestion | 🟠 MajorAdd a subdivision-scoped l10n check for the new school-holiday names.
These localization fixtures still exclude
SCHOOL, and without a subdivision they can never hit the new school-holiday translations. Please add at least one localized subdivision case forcategories=SCHOOL(or(PUBLIC, SCHOOL)) so the newDE.poentries are actually verified.Based on learnings: The
assertLocalizedHolidaysmethod in the vacanza/holidays project requires a complete list of all holidays from all categories (PUBLIC, GOVERNMENT, etc.), not just the holidays from the default category. This is a framework requirement for comprehensive localization testing.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/countries/test_germany.py` around lines 448 - 549, The localization tests currently never exercise SCHOOL-scoped translations; add a new subdivision-scoped test in tests/countries/test_germany.py (e.g., def test_l10n_de_by_school(self):) that calls assertLocalizedHolidays with the Bavaria subdivision identifier and includes the school-only holiday names, passing categories=(SCHOOL,) or categories=(PUBLIC, SCHOOL) so the new DE.po SCHOOL entries are verified; reference assertLocalizedHolidays and ensure the test name and the categories argument target SCHOOL so the school-holiday translations in DE.po are exercised.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@holidays/countries/germany.py`:
- Around line 156-160: The _populate_school_holidays method fails to handle the
deprecated "BYP" subdivision so Germany(subdiv="BYP", categories=SCHOOL) returns
no data; update the logic in _populate_school_holidays to map "BYP" to Bavaria
the same way _populate_public_holidays does—i.e., when self._normalized_subdiv
equals "augsburg" OR equals "byp" (case-insensitive normalized form), set subdiv
= "BY" (using self._normalized_subdiv and the existing "BY" mapping) so school
holidays for Bavaria are returned for the deprecated code.
In `@scripts/calendar/germany_school_holidays_generator.py`:
- Around line 190-203: The function ensure_pdf_sources currently returns any
cached PDFs immediately; instead always query the KMK index via
_get_school_year_pdf_links(index_url) to determine the authoritative set of pdf
URLs, compute the desired filenames (using urlparse(...).path.name), then
reconcile the raw_pdf_dir: download any missing desired PDFs, remove any extras
that are not in the authoritative list, and finally return the sorted list of
current pdf Paths. Update ensure_pdf_sources to perform this reconciliation (use
_get_school_year_pdf_links, urlparse(url).path.name, _fetch_url) rather than
short-circuiting when any *.pdf exists.
- Around line 267-276: The code currently zips columns with raw_row[1:] which
silently truncates ragged rows from extract_table(); update the loop that builds
rows to validate row width before zipping by checking that len(raw_row) - 1 ==
len(columns) (or at least >=), and if it doesn't match, raise or log a clear
exception including state_label/subdiv and the offending raw_row so generation
fails loudly; only proceed to create rows[subdiv] via zip(columns, raw_row[1:])
(and/or explicitly pad missing cells with "" if you prefer to tolerate short
rows) after the width check, referencing variables raw_row, columns,
state_label, STATE_LABEL_TO_CODE, and _normalize_cell to locate the change.
In `@tests/common.py`:
- Around line 600-603: In _assertLocalizedHolidays, the current fallback
`categories = categories or self.test_class.supported_categories` overwrites
intentionally empty values like ()—change it to only assign the default when
categories is None (e.g., `if categories is None: categories =
self.test_class.supported_categories`) so explicit empty tuples/lists are
preserved; update the logic in the _assertLocalizedHolidays method of the test
class accordingly.
---
Outside diff comments:
In `@tests/countries/test_germany.py`:
- Around line 40-50: The test_no_holidays currently only checks SCHOOL without a
subdivision and misses categories beyond PUBLIC; update test_no_holidays to
iterate all supported categories using Germany.supported_categories and for each
subdiv in Germany.subdivisions instantiate Germany(subdiv=subdiv,
years=self.start_year - 1, categories=category) (and for categories that aren't
subdiv-dependent also test without subdiv) so you validate that no holidays
exist before start_year across all supported categories; ensure the existing
specific checks (e.g., years=1989 and years=2025 for SCHOOL) remain or are
adapted to use Germany.supported_categories where appropriate.
- Around line 448-549: The localization tests currently never exercise
SCHOOL-scoped translations; add a new subdivision-scoped test in
tests/countries/test_germany.py (e.g., def test_l10n_de_by_school(self):) that
calls assertLocalizedHolidays with the Bavaria subdivision identifier and
includes the school-only holiday names, passing categories=(SCHOOL,) or
categories=(PUBLIC, SCHOOL) so the new DE.po SCHOOL entries are verified;
reference assertLocalizedHolidays and ensure the test name and the categories
argument target SCHOOL so the school-holiday translations in DE.po are
exercised.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 409ed20d-b7bb-439a-bfba-e8916ff92636
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (14)
README.mddocs/examples.mddocs/holiday_categories.mdholidays/countries/germany.pyholidays/countries/germany_school_holidays.pyholidays/locale/de/LC_MESSAGES/DE.poholidays/locale/en_US/LC_MESSAGES/DE.poholidays/locale/th/LC_MESSAGES/DE.poholidays/locale/uk/LC_MESSAGES/DE.popyproject.tomlscripts/calendar/germany_school_holidays_generator.pytests/common.pytests/countries/test_germany.pytests/test_utils.py
|
In comparison to the draft PR #3171 where holidays were hard coded, this PR brings automation via script to update dataset. Normally KMK updates the data rarely in bulks for next few years, so it would be enough to run the script once per year. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/calendar/germany_school_holidays_generator.py`:
- Around line 157-159: The _fetch_url function currently calls urlopen(url) with
no timeout which can hang; update the call in function _fetch_url to pass a
reasonable timeout (e.g., timeout=30) to urlopen (urlopen(url, timeout=30)) so
network reads will fail fast on stalled connections and keep the generator
responsive.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6d3d8c63-c465-4901-bb00-b1044d913bdc
📒 Files selected for processing (2)
.gitignorescripts/calendar/germany_school_holidays_generator.py
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (1)
scripts/calendar/germany_school_holidays_generator.py (1)
283-284: 🧹 Nitpick | 🔵 TrivialConsider adding
strict=Truefor extra safety.Row length validation at lines 277-281 ensures the zip is safe, but adding
strict=Trueprovides defense-in-depth and silences the Ruff B905 warning.Proposed fix
rows[subdiv] = { - column: _normalize_cell(cell or "") for column, cell in zip(columns, row_cells) + column: _normalize_cell(cell or "") for column, cell in zip(columns, row_cells, strict=True) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/calendar/germany_school_holidays_generator.py` around lines 283 - 284, The dict comprehension using zip(columns, row_cells) in the generator that builds rows (mapping column: _normalize_cell(cell or "") for column, cell in zip(columns, row_cells)) should use zip(columns, row_cells, strict=True) to enforce equal-length iterables and silence Ruff B905; update that expression to add strict=True so any mismatch raises immediately and keeps the existing row length checks as defense-in-depth.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/calendar/germany_school_holidays_generator.py`:
- Around line 389-391: The outer loop currently declares an unused variable
"year" in the construct "for year, year_data in data.items()" — update the loop
to avoid the unused variable by either iterating over values ("for year_data in
data.values()") or renaming "year" to "_" ("for _, year_data in data.items()")
so linters don't flag it; keep the inner body that sorts and deduplicates
"year_data[subdiv]".
In `@tests/countries/test_germany.py`:
- Around line 146-150: The test is mocking Germany._add_holiday with an
incorrect return type (False) even though the method returns date | None; update
the mock in the test to return None instead of False so the mocked behavior
matches the real function signature (locate the mock.patch.object(Germany,
"_add_holiday", return_value=...) in tests/countries/test_germany.py and change
the return_value to None), leaving the _add_multiday_holiday mock and its
assert_not_called() check unchanged.
---
Duplicate comments:
In `@scripts/calendar/germany_school_holidays_generator.py`:
- Around line 283-284: The dict comprehension using zip(columns, row_cells) in
the generator that builds rows (mapping column: _normalize_cell(cell or "") for
column, cell in zip(columns, row_cells)) should use zip(columns, row_cells,
strict=True) to enforce equal-length iterables and silence Ruff B905; update
that expression to add strict=True so any mismatch raises immediately and keeps
the existing row length checks as defense-in-depth.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 37e6e121-48d6-4f3b-8ed2-540775ba1fa6
📒 Files selected for processing (4)
holidays/countries/germany.pyscripts/calendar/germany_school_holidays_generator.pytests/common.pytests/countries/test_germany.py
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/calendar/germany_school_holidays_generator.py`:
- Around line 40-43: DEFAULT_RAW_PDF_DIR currently points under the repo
ROOT_DIR which risks accidental commits; change the default to a per-user or OS
temp cache (e.g. use appdirs.user_cache_dir or tempfile.gettempdir() and build a
subpath like "holidays-germany-school-holidays/pdfs") instead of
ROOT_DIR/.cache, while leaving the existing CLI/config override (--input-dir /
input_dir) intact; update the constant DEFAULT_RAW_PDF_DIR and any code that
constructs the path to use the new user-temp/user-cache location (reference
DEFAULT_RAW_PDF_DIR and ROOT_DIR in the file) so only explicit overrides put
files under the repo.
- Around line 211-226: Several re.sub patterns use Java/POSIX-style possessive
quantifiers (e.g. "++" and "*+") which break on Python 3.10; update those
regexes to standard quantifiers. Locate the label-cleanup block (patterns like
r"^\d++\s++", r"\s*+\([^)]*+\)", r"\s++\d++$") and change "++" to "+" and "*+"
to "*" (e.g. r"^\d+\s+", r"\s*\([^)]*\)", r"\s+\d+$"); likewise update patterns
in _normalize_cell (r"\s*+/\s*+" -> r"\s*/\s*", r"\s*+-\s*+" -> r"\s*-\s*",
etc.). Keep the rest of the regexes unchanged.
In `@tests/countries/test_germany.py`:
- Around line 49-50: Replace the hardcoded 1989 SCHOOL check with the real
boundary check by using self.start_year - 1 for the Germany(subdiv="BY",
categories=SCHOOL) call, remove the duplicate Germany(years=1989,
categories=SCHOOL) assertion, and ensure the generic no-holidays test uses all
supported categories by invoking
test_no_holidays(categories=Germany.supported_categories) (or
CountryClass.supported_categories) so pre-start_year checks cover PUBLIC,
SCHOOL, OPTIONAL and subdivision-specific data; update the assertions that
reference Germany and the test_no_holidays helper accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 25479a2d-c39e-42b0-b4e5-ce44813f12e6
📒 Files selected for processing (4)
holidays/countries/germany.pyscripts/calendar/germany_school_holidays_generator.pytests/common.pytests/countries/test_germany.py
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/calendar/germany_school_holidays_generator.py`:
- Around line 441-444: The output file is written with locale defaults; change
the write to explicitly use UTF-8 and LF endings by replacing the
OUTPUT_PATH.write_text(render_python_module(data)) call in main() with an
explicit open/write using OUTPUT_PATH.open("w", encoding="utf-8", newline="\n")
and write(render_python_module(data)) so the generated module is encoded as
UTF-8 and uses '\n' line endings.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: f062fa28-dfcb-463d-ac6b-66ab595905d1
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (15)
.gitignoreREADME.mddocs/examples.mddocs/holiday_categories.mdholidays/countries/germany.pyholidays/countries/germany_school_holidays.pyholidays/locale/de/LC_MESSAGES/DE.poholidays/locale/en_US/LC_MESSAGES/DE.poholidays/locale/th/LC_MESSAGES/DE.poholidays/locale/uk/LC_MESSAGES/DE.popyproject.tomlscripts/calendar/germany_school_holidays_generator.pytests/common.pytests/countries/test_germany.pytests/test_utils.py
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/calendar/germany_school_holidays_generator.py`:
- Around line 298-299: The current lookup for the school-year header using
next(...) can raise an opaque StopIteration if no header contains "/", so change
it to safely attempt the find (e.g., next(..., None)) and if not found raise a
clear, file-specific ValueError explaining the missing or unexpected school-year
header; reference the column_years mapping and the subsequent call to
_parse_school_year(school_year) in the error message so the failure points to
the exact cause and input shape problem.
- Around line 246-251: The _parse_school_year function mishandles two-digit end
years across centuries (e.g., "1999/00" -> 1900); fix by computing candidate_end
= (start_year // 100) * 100 + int(end_year_str) when end_year_str length != 4,
and if candidate_end <= start_year add 100 to roll into the next century, then
assign end_year = candidate_end; update the logic around start_year, end_year,
start_year_str, end_year_str in _parse_school_year accordingly so ranges like
"1999/00" become (1999, 2000).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6c897ec6-487a-407e-b7a1-cf677b0e8ec9
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (15)
.gitignoreREADME.mddocs/examples.mddocs/holiday_categories.mdholidays/countries/germany.pyholidays/countries/germany_school_holidays.pyholidays/locale/de/LC_MESSAGES/DE.poholidays/locale/en_US/LC_MESSAGES/DE.poholidays/locale/th/LC_MESSAGES/DE.poholidays/locale/uk/LC_MESSAGES/DE.popyproject.tomlscripts/calendar/germany_school_holidays_generator.pytests/common.pytests/countries/test_germany.pytests/test_utils.py
KJhellico
left a comment
There was a problem hiding this comment.
This approach looks very robust and comprehensive! But files created by humans always require careful attention during processing. I didn't analyze the script, just the dates, and here are the discrepancies I found.
|
I fixed the reported data issues in the generator and regenerated the dataset instead of patching the generated file by hand. The parser now correctly handles cross-year resolution, duplicate holiday columns, footnote-adjacent dates, shared-month forms like As an extra validation step, I cross-checked the generated output against the currently published KMK |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## dev #3533 +/- ##
=========================================
Coverage 100.00% 100.00%
=========================================
Files 313 314 +1
Lines 18711 18734 +23
Branches 2393 2396 +3
=========================================
+ Hits 18711 18734 +23 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Sonarqube complains about regex usage in dev script, so we need to either mute those warnings or replace regex by ad-hoc functions, but it will sacrifice maintainability. @KJhellico please let me know if the PR requires more polishing. |
KJhellico
left a comment
There was a problem hiding this comment.
This PR is a great boost for the development of school holidays support! Although I am in favor of an "algorithmic" approach to implementing this, we can deal with that later, using this data as a solid foundation.
A couple of practical suggestions:
It's not a critical issue. And please don't change commit history (with force push) when making further changes. |
Co-authored-by: ~Jhellico <KJhellico@users.noreply.github.com> Signed-off-by: Dmitry Tretyakov <dtretyakov@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@holidays/countries/germany.py`:
- Around line 181-185: The loop that reads GERMANY_SCHOOL_HOLIDAYS currently
only looks up GERMANY_SCHOOL_HOLIDAYS.get(self._year, {}).get(subdiv, ()) so
cross-year breaks that start in the previous year (e.g., a Christmas break
starting in year-1 and ending in self._year) are missed; update the iteration to
also include entries from GERMANY_SCHOOL_HOLIDAYS.get(self._year - 1,
{}).get(subdiv, ()) (or merge both year dicts) and still apply the same
start/end normalization (start_date, end_date, active_start, active_end) so
January continuations from prior-year records are considered when populating
self._year.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 0a563e84-65ba-4593-965f-e2a0c3b8101e
📒 Files selected for processing (1)
holidays/countries/germany.py
There was a problem hiding this comment.
♻️ Duplicate comments (1)
holidays/countries/germany.py (1)
173-191:⚠️ Potential issue | 🟠 MajorInclude prior-year school records for January continuations.
This still only reads records keyed by
self._year. Since generated ranges are stored under the start year, breaks like Bavaria’s 1990 Christmas break ending on 1991-01-07 are skipped when populating 1991.🐛 Proposed fix
- for ( - start_year_offset, - start_month, - start_day, - end_year_offset, - end_month, - end_day, - name, - ) in GERMANY_SCHOOL_HOLIDAYS.get(self._year, {}).get(subdiv, ()): - start_date = date(self._year + start_year_offset, start_month, start_day) - end_date = date(self._year + end_year_offset, end_month, end_day) - active_start = max(start_date, date(self._year, JAN, 1)) - active_end = min(end_date, date(self._year, DEC, 31)) - if active_start > active_end: - continue - if self._add_holiday(name, active_start): - duration_days = (active_end - active_start).days - if duration_days: - self._add_multiday_holiday(active_start, duration_days, name=name) + for base_year in (self._year - 1, self._year): + for ( + start_year_offset, + start_month, + start_day, + end_year_offset, + end_month, + end_day, + name, + ) in GERMANY_SCHOOL_HOLIDAYS.get(base_year, {}).get(subdiv, ()): + start_date = date(base_year + start_year_offset, start_month, start_day) + end_date = date(base_year + end_year_offset, end_month, end_day) + active_start = max(start_date, date(self._year, JAN, 1)) + active_end = min(end_date, date(self._year, DEC, 31)) + if active_start > active_end: + continue + if self._add_holiday(name, active_start): + duration_days = (active_end - active_start).days + if duration_days: + self._add_multiday_holiday(active_start, duration_days, name=name)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@holidays/countries/germany.py` around lines 173 - 191, The loop only reads GERMANY_SCHOOL_HOLIDAYS entries keyed by self._year, so multi-day holidays that start in the prior year and continue into January are missed; fix by iterating over both year keys (self._year and self._year - 1) for the given subdiv, e.g. gather records = GERMANY_SCHOOL_HOLIDAYS.get(self._year, {}) and records_prev = GERMANY_SCHOOL_HOLIDAYS.get(self._year - 1, {}) and loop over both sets; when constructing start_date/end_date use the corresponding year_key (year_key + start_year_offset and year_key + end_year_offset) rather than always self._year so _add_holiday and _add_multiday_holiday see January continuations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@holidays/countries/germany.py`:
- Around line 173-191: The loop only reads GERMANY_SCHOOL_HOLIDAYS entries keyed
by self._year, so multi-day holidays that start in the prior year and continue
into January are missed; fix by iterating over both year keys (self._year and
self._year - 1) for the given subdiv, e.g. gather records =
GERMANY_SCHOOL_HOLIDAYS.get(self._year, {}) and records_prev =
GERMANY_SCHOOL_HOLIDAYS.get(self._year - 1, {}) and loop over both sets; when
constructing start_date/end_date use the corresponding year_key (year_key +
start_year_offset and year_key + end_year_offset) rather than always self._year
so _add_holiday and _add_multiday_holiday see January continuations.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: f34e9438-6d50-4739-9aa3-f9c4f8c4c3ed
📒 Files selected for processing (1)
holidays/countries/germany.py
There was a problem hiding this comment.
♻️ Duplicate comments (2)
scripts/calendar/germany_school_holidays_generator.py (2)
449-484: 🧹 Nitpick | 🔵 Trivial
normalize_ranges()still over the cognitive-complexity threshold.Sonar flagged this at 24 (limit 15) previously. The body still mixes parsing, year clipping, tuple rendering, and dedupe in one nested loop. Lifting the per-cell expansion into a helper that yields the 7-tuples would let this function focus on aggregation and dedup.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/calendar/germany_school_holidays_generator.py` around lines 449 - 484, The normalize_ranges function is doing too much (parsing cells, resolving years, clipping to calendar years, and building 7-tuples) which pushes cognitive complexity over the threshold; extract the per-cell expansion into a helper (e.g., _expand_cell_ranges or expand_cell_ranges) that takes (column_key, year_label, cell) and yields the 7-tuple entries (start_year_offset, start_month, start_day, end_year_offset, end_month, end_day, holiday_id) by using parse_cell_ranges, _resolve_years and HOLIDAY_IDS, and performing the year clipping there; then simplify normalize_ranges to iterate sources -> rows -> subdivisions -> use the new helper to extend data[year][subdiv], and keep the existing dedupe/sort step and final dict construction.
329-377: 🧹 Nitpick | 🔵 Trivial
_parse_pdf_table()still trips Sonar's cognitive-complexity gate.Sonar previously failed this at complexity 17 and the function still bundles header validation, row normalization, subdivision-set checks, and school-year detection. Extracting helpers for row building and subdivision validation would clear the gate without behavior change.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/calendar/germany_school_holidays_generator.py` around lines 329 - 377, The _parse_pdf_table function is too complex — extract the row-processing and subdivision-check logic into small helpers to lower cognitive complexity without changing behavior: create a helper _build_rows(table_rows, column_specs, path) that encapsulates the loop that maps raw_row -> (subdiv -> list of (column, year_label, normalized_cell)) and raises the same ValueError when column counts mismatch (use symbols row_cells, columns, column_specs, _normalize_cell, _normalize_state_label, STATE_LABEL_TO_CODE), and create a helper _validate_subdivision_set(rows, path) that performs the missing/extra subdivision checks and raises the identical ValueError when the sets differ (use EXPECTED_SUBDIVISIONS and rows.keys()); call these helpers from _parse_pdf_table and keep the school_year detection and return semantics unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@scripts/calendar/germany_school_holidays_generator.py`:
- Around line 449-484: The normalize_ranges function is doing too much (parsing
cells, resolving years, clipping to calendar years, and building 7-tuples) which
pushes cognitive complexity over the threshold; extract the per-cell expansion
into a helper (e.g., _expand_cell_ranges or expand_cell_ranges) that takes
(column_key, year_label, cell) and yields the 7-tuple entries
(start_year_offset, start_month, start_day, end_year_offset, end_month, end_day,
holiday_id) by using parse_cell_ranges, _resolve_years and HOLIDAY_IDS, and
performing the year clipping there; then simplify normalize_ranges to iterate
sources -> rows -> subdivisions -> use the new helper to extend
data[year][subdiv], and keep the existing dedupe/sort step and final dict
construction.
- Around line 329-377: The _parse_pdf_table function is too complex — extract
the row-processing and subdivision-check logic into small helpers to lower
cognitive complexity without changing behavior: create a helper
_build_rows(table_rows, column_specs, path) that encapsulates the loop that maps
raw_row -> (subdiv -> list of (column, year_label, normalized_cell)) and raises
the same ValueError when column counts mismatch (use symbols row_cells, columns,
column_specs, _normalize_cell, _normalize_state_label, STATE_LABEL_TO_CODE), and
create a helper _validate_subdivision_set(rows, path) that performs the
missing/extra subdivision checks and raises the identical ValueError when the
sets differ (use EXPECTED_SUBDIVISIONS and rows.keys()); call these helpers from
_parse_pdf_table and keep the school_year detection and return semantics
unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: cb2e14c4-dc50-46b1-948d-ac515d140d80
📒 Files selected for processing (4)
holidays/calendars/germany_school.pyholidays/countries/germany.pyscripts/calendar/germany_school_holidays_generator.pytests/countries/test_germany.py
Pass Germany instances inline in test_school_and_public_categories, test_public_category_excludes_school_holidays, and all test_l10n_*_school methods, removing intermediate variables per project style.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (3)
scripts/calendar/germany_school_holidays_generator.py (3)
449-484: 🛠️ Refactor suggestion | 🟠 MajorExtract the per-cell expansion to clear
normalize_rangescomplexity.Sonar still fails this at cognitive complexity 24/15. Move the inner expansion (parse → resolve → year-clip → tuple emit) into a helper that yields the 7-tuples, and keep
normalize_rangesas aggregation/dedup/sort only.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/calendar/germany_school_holidays_generator.py` around lines 449 - 484, The normalize_ranges function is too complex; extract the per-cell expansion loop (parse_cell_ranges → _resolve_years → year clipping → emitting the 7-tuple with HOLIDAY_IDS) into a new generator helper (e.g., expand_cell_ranges or iter_cell_tuples) that yields (start_year_offset, start_month, start_day, end_year_offset, end_month, end_day, holiday_id) for a given column_key, year_label and cell; then have normalize_ranges call that helper to extend data[year][subdiv] for each yielded tuple and keep the rest (dedupe/sort and final dict construction) unchanged so cognitive complexity of normalize_ranges is reduced.
219-221:⚠️ Potential issue | 🟠 MajorDon't unlink arbitrary PDFs from a caller-supplied
--input-dir.The pruning loop is unguarded, so when
--input-dirpoints at any directory the user maintains, every non-indexed💡 Proposed fix
- existing_paths = {path.name: path for path in raw_pdf_dir.glob("*.pdf")} - for extra_name in sorted(existing_paths.keys() - indexed_pdfs.keys()): - existing_paths[extra_name].unlink() + if raw_pdf_dir == DEFAULT_RAW_PDF_DIR: + existing_paths = {path.name: path for path in raw_pdf_dir.glob("*.pdf")} + for extra_name in sorted(existing_paths.keys() - indexed_pdfs.keys()): + existing_paths[extra_name].unlink()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/calendar/germany_school_holidays_generator.py` around lines 219 - 221, The current loop builds existing_paths from raw_pdf_dir.glob("*.pdf") and then unlinks any names not in indexed_pdfs, which can delete arbitrary PDFs in a caller-supplied --input-dir; instead restrict deletion to the generator's managed cache or only files the generator previously created: change the source of files checked from raw_pdf_dir to the internal cache directory (or a dedicated managed subdirectory), or filter existing_paths to only include files with the generator's filename pattern/marker or those recorded in a “created_files” set, then iterate over that restricted set and call unlink on extra_name; update references to existing_paths, raw_pdf_dir, indexed_pdfs, extra_name, and unlink accordingly so only controlled files are removed.
329-377: 🛠️ Refactor suggestion | 🟠 MajorSplit
_parse_pdf_tableto clear the SonarCloud failure.This still combines extraction, header validation, row normalization, subdivision validation, and school-year detection — Sonar fails it at cognitive complexity 17/15. Extracting the row-building loop and the subdivision-set check into helpers should drop it under threshold without behavior change.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/calendar/germany_school_holidays_generator.py` around lines 329 - 377, The function _parse_pdf_table is too complex; extract the row-processing loop that builds rows[subdiv] and the subdivision validation block into two small helpers (e.g., _build_rows_from_table(table_rows, column_specs, path) -> dict and _validate_subdivisions(rows, path) -> None) so _parse_pdf_table only orchestrates extraction, header/column checks and school-year detection; ensure the helpers accept the same inputs used in the loop (raw table rows, column_specs, EXPECTED_SUBDIVISIONS, STATE_LABEL_TO_CODE, _normalize_state_label, _normalize_cell) and preserve all existing ValueError messages and return types (start_year, rows) so behavior doesn't change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/calendar/germany_school_holidays_generator.py`:
- Around line 520-525: The generated calendar module currently appends a
"__all__ = (...)" block to the output; remove that append so the generator no
longer emits any "__all__" declaration (delete the code that pushes the string
starting with "__all__ = (" and its associated lines into the generated file
content). Locate the string literal "__all__ = (" in the generator and remove
the block that writes "__all__" (and any trailing closing lines), ensuring the
rest of the generated constants and functions (e.g., AUTUMN_BREAK,
CHRISTMAS_BREAK, GERMANY_SCHOOL_HOLIDAYS) are left untouched so the produced
module matches the other calendar modules.
---
Duplicate comments:
In `@scripts/calendar/germany_school_holidays_generator.py`:
- Around line 449-484: The normalize_ranges function is too complex; extract the
per-cell expansion loop (parse_cell_ranges → _resolve_years → year clipping →
emitting the 7-tuple with HOLIDAY_IDS) into a new generator helper (e.g.,
expand_cell_ranges or iter_cell_tuples) that yields (start_year_offset,
start_month, start_day, end_year_offset, end_month, end_day, holiday_id) for a
given column_key, year_label and cell; then have normalize_ranges call that
helper to extend data[year][subdiv] for each yielded tuple and keep the rest
(dedupe/sort and final dict construction) unchanged so cognitive complexity of
normalize_ranges is reduced.
- Around line 219-221: The current loop builds existing_paths from
raw_pdf_dir.glob("*.pdf") and then unlinks any names not in indexed_pdfs, which
can delete arbitrary PDFs in a caller-supplied --input-dir; instead restrict
deletion to the generator's managed cache or only files the generator previously
created: change the source of files checked from raw_pdf_dir to the internal
cache directory (or a dedicated managed subdirectory), or filter existing_paths
to only include files with the generator's filename pattern/marker or those
recorded in a “created_files” set, then iterate over that restricted set and
call unlink on extra_name; update references to existing_paths, raw_pdf_dir,
indexed_pdfs, extra_name, and unlink accordingly so only controlled files are
removed.
- Around line 329-377: The function _parse_pdf_table is too complex; extract the
row-processing loop that builds rows[subdiv] and the subdivision validation
block into two small helpers (e.g., _build_rows_from_table(table_rows,
column_specs, path) -> dict and _validate_subdivisions(rows, path) -> None) so
_parse_pdf_table only orchestrates extraction, header/column checks and
school-year detection; ensure the helpers accept the same inputs used in the
loop (raw table rows, column_specs, EXPECTED_SUBDIVISIONS, STATE_LABEL_TO_CODE,
_normalize_state_label, _normalize_cell) and preserve all existing ValueError
messages and return types (start_year, rows) so behavior doesn't change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 06aeade7-65cb-428d-b822-bc5dcf89a4b4
📒 Files selected for processing (4)
holidays/calendars/germany_school.pyholidays/countries/germany.pyscripts/calendar/germany_school_holidays_generator.pytests/countries/test_germany.py
- Fix generator OUTPUT_PATH: write to germany_school_dates.py (throwaway intermediate file) instead of the committed germany_school.py module. Update docstring to make the two-step workflow explicit for future maintainers (generate → diff → manually update germany_school.py). - Refactor test_l10n_*_school methods to compact loop style with one representative date per holiday type, consistent with other l10n tests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Use unicode escape sequences (\u2013 and \u2014) instead of literal dash characters to prevent the new normalize-unicode pre-commit hook from inadvertently modifying the script's replace arguments. Also update the docstring to use a standard hyphen.
|
@KJhellico I fixed the unicode symbols in generator to comply with |
KJhellico
left a comment
There was a problem hiding this comment.
Last small fix, and rest is LGTM.
Co-authored-by: ~Jhellico <KJhellico@users.noreply.github.com> Signed-off-by: Dmitry Tretyakov <dtretyakov@users.noreply.github.com>
PPsyrius
left a comment
There was a problem hiding this comment.
LGTM 🇩🇪
Great implementation 🚀
|
@cubic-dev-ai any suggestions? |
|
@arkid15r This is a solid implementation. Adding school holidays as a separate category while keeping the runtime logic minimal by using pre-generated data is the right approach for this library. Here are a few suggestions to polish the implementation: 1. Simplify Holiday Addition LogicIn You can simplify this to a single call that handles both single-day and multi-day holidays: # holidays/countries/germany.py
for (...) in GERMANY_SCHOOL_HOLIDAYS.get(self._year, {}).get(subdiv, ()):
name = school_holiday_names[holiday_id]
start_date = date(self._year + start_year_offset, start_month, start_day)
end_date = date(self._year + end_year_offset, end_month, end_day)
active_start = max(start_date, date(self._year, JAN, 1))
active_end = min(end_date, date(self._year, DEC, 31))
# This single call handles both 1-day (days=0) and multi-day ranges
self._add_multiday_holiday(active_start, (active_end - active_start).days, name=name)2. Consistency in Subdivision AliasesThe logic 3. Generator Script RobustnessThe generator script
4. Translation FilesThe 5. Documentation ExampleIn The test coverage is excellent, especially the internal consistency checks for the dataset. |
arkid15r
left a comment
There was a problem hiding this comment.
Thanks for adding this @dtretyakov 👍
Great work 👏
Proposed change
This PR adds Germany school holidays support via the
SCHOOLcategory (сloses #249).Approach:
SCHOOLtoGermany.supported_categories;holidays/countries/germany.pysmall and canonical by integrating with the existing category flow;holidays/countries/germany_school_holidays.py;Implementation notes:
Augsburgreuses Bavaria (BY) school holiday data, consistent with the existing special-case subdivision handling;Type of change
holidaysfunctionality in general)Checklist
make checklocally; all checks and tests passed.