Skip to content

Populate HostMetric.used_in_inventories with actual inventory counts#16436

Open
tll3r wants to merge 1 commit into
ansible:develfrom
tll3r:fix/populate-used-in-inventories
Open

Populate HostMetric.used_in_inventories with actual inventory counts#16436
tll3r wants to merge 1 commit into
ansible:develfrom
tll3r:fix/populate-used-in-inventories

Conversation

@tll3r

@tll3r tll3r commented Apr 29, 2026

Copy link
Copy Markdown

Summary

The used_in_inventories field on HostMetric was introduced in PR #13560 (Feb 2023) but the backend logic to populate it was never implemented, leaving the field permanently null since AAP 2.4. The UI column was subsequently removed in PR #13782 with the note "Revert this commit once the backend is ready."

This PR adds the missing backend:

  • HostMetricInventoryCountTask: counts distinct inventories per hostname by correlating Host.name with HostMetric.hostname via a single bulk UPDATE with a correlated subquery
  • update_host_metric_inventory_counts: scheduled task (dispatched every 4 hours, with an internal 7-day run-threshold guard matching the existing host_metric_summary_monthly pattern)
  • HOST_METRIC_INVENTORY_COUNTS_LAST_TS: persisted via the conf registry so the run-threshold survives restarts
  • awx-manage update_host_metric_inventory_counts: management command for on-demand execution
  • Functional tests: 6 tests covering zero hosts, single inventory, multiple inventories, mixed hosts, idempotency, and inventory membership changes

Related Issues

Note: This is a re-submission of #16355, which was accidentally closed when the fork was deleted.

Test plan

  • awx-manage update_host_metric_inventory_counts populates the field correctly
  • Scheduled task runs on the configured interval
  • GET /api/v2/host_metrics/?order_by=-used_in_inventories returns correct non-null values
  • All existing host_metrics tests still pass
  • New functional tests pass: pytest awx/main/tests/functional/commands/test_update_host_metric_inventory_counts.py

Made with Cursor

Summary by CodeRabbit

Release Notes

  • New Features

    • Added automatic tracking and periodic updates of inventory counts per host.
    • New management command enables manual triggering of host metric inventory count updates.
  • Chores

    • Registered background scheduler task for periodic host metric inventory count recalculation.

The used_in_inventories field on HostMetric was introduced in PR ansible#13560
but the backend logic to populate it was never implemented, leaving the
field permanently null since AAP 2.4.

This adds:
- HostMetricInventoryCountTask: counts distinct inventories per hostname
  by correlating Host.name with HostMetric.hostname
- update_host_metric_inventory_counts: weekly scheduled task that invokes
  the count update (controlled via HOST_METRIC_INVENTORY_COUNTS_INTERVAL)
- DISPATCHER_SCHEDULE entry so the task runs automatically every 4 hours
  (with an internal 7-day threshold guard matching the existing pattern)
- HOST_METRIC_INVENTORY_COUNTS_LAST_TS registered in conf for persistence
- awx-manage update_host_metric_inventory_counts: management command for
  on-demand execution
- Functional tests covering zero, single, multiple inventories,
  mixed hosts, idempotency, and inventory membership changes

Resolves the incomplete feature tracked in AWX PRs ansible#13560, ansible#13782, ansible#13999.

Made-with: Cursor
@coderabbitai

coderabbitai Bot commented Apr 29, 2026

Copy link
Copy Markdown
📝 Walkthrough

Walkthrough

A new scheduled task is introduced to recalculate and populate HostMetric.used_in_inventories counts, tracking how many inventories each host belongs to. The implementation includes configuration settings, a Django management command for manual execution, and comprehensive test coverage.

Changes

Cohort / File(s) Summary
Configuration & Settings
awx/main/conf.py, awx/settings/defaults.py
Registers new DateTime configuration field HOST_METRIC_INVENTORY_COUNTS_LAST_TS to track the last task execution time, and adds interval constant HOST_METRIC_INVENTORY_COUNTS_INTERVAL set to 1 day.
Task Implementation
awx/main/tasks/host_metrics.py
Introduces update_host_metric_inventory_counts task and HostMetricInventoryCountTask class with static method update_counts() that recalculates host metric inventory counts using advisory locks and settings-based interval gating, updating only changed records.
Management Command
awx/main/management/commands/update_host_metric_inventory_counts.py
Adds Django management command that invokes HostMetricInventoryCountTask.update_counts() for manual execution and reports updated record count.
Test Coverage
awx/main/tests/functional/commands/test_update_host_metric_inventory_counts.py
Comprehensive functional test suite covering initialization, single/multiple inventory counting, mixed scenarios, idempotence, and inventory membership changes.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 69.23% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and specifically describes the main objective of the pull request: implementing functionality to populate the HostMetric.used_in_inventories field with actual inventory counts.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
Review rate limit: 7/8 reviews remaining, refill in 7 minutes and 30 seconds.

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@awx/main/tasks/host_metrics.py`:
- Around line 309-333: The current code materializes a huge counts dict and
issues two separate UPDATEs using hostname__in=counts.keys(), which is
unscalable and racy; replace both steps with a single set-based update that uses
a correlated subquery and Coalesce to compute inventory membership per
Host.hostname and write 0 when no match. Build inventory_count_subquery like the
existing inventory_count_subquery
(Host.objects.filter(name=OuterRef('hostname')).annotate(cnt=Count('inventory_id',
distinct=True)).values('cnt')), then perform one update:
HostMetric.objects.exclude(used_in_inventories=Coalesce(Subquery(inventory_count_subquery),
Value(0))).update(used_in_inventories=Coalesce(Subquery(inventory_count_subquery),
Value(0))) so you never materialize counts in Python and only update rows whose
value will actually change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 9a660f42-9b98-4453-9cca-6f263c052177

📥 Commits

Reviewing files that changed from the base of the PR and between eac8968 and 1749ace.

📒 Files selected for processing (5)
  • awx/main/conf.py
  • awx/main/management/commands/update_host_metric_inventory_counts.py
  • awx/main/tasks/host_metrics.py
  • awx/main/tests/functional/commands/test_update_host_metric_inventory_counts.py
  • awx/settings/defaults.py

Comment on lines +309 to +333
counts = dict(
Host.objects.values('name')
.annotate(cnt=Count('inventory_id', distinct=True))
.values_list('name', 'cnt')
)

rows = 0

# Set hosts with inventory membership to their actual count (only where changed)
if counts:
inventory_count_subquery = (
Host.objects.filter(name=OuterRef('hostname'))
.values('name')
.annotate(cnt=Count('inventory_id', distinct=True))
.values('cnt')
)
rows += HostMetric.objects.filter(hostname__in=counts.keys()).exclude(
used_in_inventories=Subquery(inventory_count_subquery)
).update(used_in_inventories=Subquery(inventory_count_subquery))

# Set hosts without any inventory membership to 0 (only where not already 0)
rows += HostMetric.objects.exclude(hostname__in=counts.keys()).exclude(
used_in_inventories=0
).update(used_in_inventories=0)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid materializing full host-name counts into Python for updates.

counts = dict(...) plus hostname__in=counts.keys() does not scale for large datasets and can produce very large IN (...) clauses; it also introduces a stale key-set window across two UPDATEs. Use one correlated subquery with Coalesce(..., 0) for a single set-based update.

Proposed refactor
 class HostMetricInventoryCountTask:
@@
     `@staticmethod`
     def update_counts():
-        counts = dict(
-            Host.objects.values('name')
-            .annotate(cnt=Count('inventory_id', distinct=True))
-            .values_list('name', 'cnt')
-        )
-
-        rows = 0
-
-        # Set hosts with inventory membership to their actual count (only where changed)
-        if counts:
-            inventory_count_subquery = (
-                Host.objects.filter(name=OuterRef('hostname'))
-                .values('name')
-                .annotate(cnt=Count('inventory_id', distinct=True))
-                .values('cnt')
-            )
-            rows += HostMetric.objects.filter(hostname__in=counts.keys()).exclude(
-                used_in_inventories=Subquery(inventory_count_subquery)
-            ).update(used_in_inventories=Subquery(inventory_count_subquery))
-
-        # Set hosts without any inventory membership to 0 (only where not already 0)
-        rows += HostMetric.objects.exclude(hostname__in=counts.keys()).exclude(
-            used_in_inventories=0
-        ).update(used_in_inventories=0)
+        inventory_count_subquery = (
+            Host.objects.filter(name=OuterRef('hostname'))
+            .values('name')
+            .annotate(cnt=Count('inventory_id', distinct=True))
+            .values('cnt')[:1]
+        )
+        computed_count = Coalesce(Subquery(inventory_count_subquery), Value(0))
+
+        rows = HostMetric.objects.exclude(
+            used_in_inventories=computed_count
+        ).update(used_in_inventories=computed_count)

         settings.HOST_METRIC_INVENTORY_COUNTS_LAST_TS = now()
         logger.info(f"update_host_metric_inventory_counts: updated {rows} HostMetric records")
         return rows
As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
counts = dict(
Host.objects.values('name')
.annotate(cnt=Count('inventory_id', distinct=True))
.values_list('name', 'cnt')
)
rows = 0
# Set hosts with inventory membership to their actual count (only where changed)
if counts:
inventory_count_subquery = (
Host.objects.filter(name=OuterRef('hostname'))
.values('name')
.annotate(cnt=Count('inventory_id', distinct=True))
.values('cnt')
)
rows += HostMetric.objects.filter(hostname__in=counts.keys()).exclude(
used_in_inventories=Subquery(inventory_count_subquery)
).update(used_in_inventories=Subquery(inventory_count_subquery))
# Set hosts without any inventory membership to 0 (only where not already 0)
rows += HostMetric.objects.exclude(hostname__in=counts.keys()).exclude(
used_in_inventories=0
).update(used_in_inventories=0)
inventory_count_subquery = (
Host.objects.filter(name=OuterRef('hostname'))
.values('name')
.annotate(cnt=Count('inventory_id', distinct=True))
.values('cnt')[:1]
)
computed_count = Coalesce(Subquery(inventory_count_subquery), Value(0))
rows = HostMetric.objects.exclude(
used_in_inventories=computed_count
).update(used_in_inventories=computed_count)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@awx/main/tasks/host_metrics.py` around lines 309 - 333, The current code
materializes a huge counts dict and issues two separate UPDATEs using
hostname__in=counts.keys(), which is unscalable and racy; replace both steps
with a single set-based update that uses a correlated subquery and Coalesce to
compute inventory membership per Host.hostname and write 0 when no match. Build
inventory_count_subquery like the existing inventory_count_subquery
(Host.objects.filter(name=OuterRef('hostname')).annotate(cnt=Count('inventory_id',
distinct=True)).values('cnt')), then perform one update:
HostMetric.objects.exclude(used_in_inventories=Coalesce(Subquery(inventory_count_subquery),
Value(0))).update(used_in_inventories=Coalesce(Subquery(inventory_count_subquery),
Value(0))) so you never materialize counts in Python and only update rows whose
value will actually change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant