fix(aws-network-mcp-server): handle Regional NAT Gateways in process_nat_gateways #3525

Open
Garou11 wants to merge 1 commit into awslabs:main from Garou11:fix/regional-nat-gateway-process-nat-gateways

Garou11 commented May 14, 2026

Summary

get_vpc_network (and any other tool that calls process_nat_gateways) raises KeyError: 'SubnetId' — and in async contexts can manifest as a complete hang — for any VPC that contains a Regional NAT Gateway. This PR makes process_nat_gateways regional-NAT-safe with three small defensive changes.

Background

Regional NAT Gateway is a 2026 addition to the EC2 API. Unlike the original ("zonal") NAT Gateway:

  • A regional NAT does not have a SubnetId field. It binds to a route table via RouteTableId and auto-provisions ENIs across AZs.
  • Entries in NatGatewayAddresses for regional NATs may legitimately omit PrivateIp / PublicIp while addresses are being provisioned.

The current process_nat_gateways in awslabs/aws_network_mcp_server/utils/vcp_details.py makes three assumptions that don't hold for regional NATs:

'subnet_id': nat['SubnetId'],                 # KeyError on regional NAT
for address in nat['NatGatewayAddresses']:    # missing key risk on shape variations
    gw.private_ips.append(address['PrivateIp'])  # KeyError when not yet provisioned
    gw.public_ips.append(address['PublicIp'])    # KeyError when not yet provisioned

Plus the strict pydantic dataclass:

@dataclass
class NatGatewayDict:
    subnet_id: str   # rejects None even after the .get() fix below

This PR fixes all four together so callers see a clean response with subnet_id=None for regional NATs instead of an exception (which, depending on how a downstream framework propagates the resulting ToolError, can manifest to MCP clients as an indefinite hang on the affected VPC).
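To make the failure mode concrete, here is a minimal reproduction sketch using plain dicts. The entries are synthetic and only mimic the shapes described above; the IDs, addresses, and the try_lookup helper are illustrative assumptions, not captured API output or code from the repository.

```python
# Synthetic NAT gateway entries; all values are made up for illustration.
zonal_nat = {
    'NatGatewayId': 'nat-zonal-example',
    'SubnetId': 'subnet-example',
    'NatGatewayAddresses': [{'PrivateIp': '10.0.0.5', 'PublicIp': '203.0.113.10'}],
}
regional_nat = {
    'NatGatewayId': 'nat-regional-example',
    'RouteTableId': 'rtb-example',  # no SubnetId on a regional NAT
    'NatGatewayAddresses': [{}],    # address entry still being provisioned
}


def try_lookup(mapping: dict, key: str) -> str:
    """Mirror an unguarded mapping[key] access and report the outcome as text."""
    try:
        return str(mapping[key])
    except KeyError as exc:
        return f'KeyError: {exc}'


print(try_lookup(zonal_nat, 'SubnetId'))
print(try_lookup(regional_nat, 'SubnetId'))
print(try_lookup(regional_nat['NatGatewayAddresses'][0], 'PublicIp'))
```

The zonal entry resolves normally, while both lookups against the regional entry end in a KeyError, which is the exception the unguarded code would propagate to the caller.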

Changes

src/aws-network-mcp-server/awslabs/aws_network_mcp_server/utils/vcp_details.py:

  1. NatGatewayDict.subnet_id: str → Optional[str]. A regional NAT genuinely has no subnet, so the field has to admit None. (Optional is already imported.)
  2. nat['SubnetId'] → nat.get('SubnetId'), to surface None instead of KeyError.
  3. nat['NatGatewayAddresses'] → nat.get('NatGatewayAddresses', []), plus per-key if 'PrivateIp' in address / if 'PublicIp' in address guards. This is the same defensive shape as #2876 (fix(aws-network-mcp-server): handle missing PublicIp in NAT gateway addresses), applied here together so we don't end up with a half-fix.
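Taken together, the three changes could look roughly like the sketch below. It is not the actual diff: NatGatewaySketch and process_nat_gateways_sketch are hypothetical names, a stdlib dataclass stands in for the pydantic one (so nothing is validated here), and the nat_gateway_id field is an assumption added only to tell the entries apart.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class NatGatewaySketch:
    """Hypothetical stand-in for NatGatewayDict (stdlib dataclass, no validation)."""

    nat_gateway_id: str       # assumed field, used here only as a label
    subnet_id: Optional[str]  # change 1: None is allowed for regional NATs
    private_ips: List[str] = field(default_factory=list)
    public_ips: List[str] = field(default_factory=list)


def process_nat_gateways_sketch(nats: List[Dict[str, Any]]) -> List[NatGatewaySketch]:
    """Apply the three guards described above to raw NAT gateway entries."""
    result = []
    for nat in nats:
        gw = NatGatewaySketch(
            nat_gateway_id=nat['NatGatewayId'],
            subnet_id=nat.get('SubnetId'),  # change 2: None instead of KeyError
        )
        # change 3: tolerate a missing address list and partially filled entries
        for address in nat.get('NatGatewayAddresses', []):
            if 'PrivateIp' in address:
                gw.private_ips.append(address['PrivateIp'])
            if 'PublicIp' in address:
                gw.public_ips.append(address['PublicIp'])
        result.append(gw)
    return result


# Synthetic inputs: zonal, regional with one partial address, and no addresses at all.
gateways = process_nat_gateways_sketch([
    {'NatGatewayId': 'nat-zonal-example', 'SubnetId': 'subnet-example',
     'NatGatewayAddresses': [{'PrivateIp': '10.0.0.5', 'PublicIp': '203.0.113.10'}]},
    {'NatGatewayId': 'nat-regional-example', 'RouteTableId': 'rtb-example',
     'NatGatewayAddresses': [{'PrivateIp': '10.0.1.7'}, {}]},
    {'NatGatewayId': 'nat-no-addresses-example'},
])
for gw in gateways:
    print(gw)
```

In this sketch the zonal entry keeps its subnet ID, the regional entry gets subnet_id=None with only the address keys that were present, and the entry without NatGatewayAddresses yields empty IP lists rather than an exception.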

Diff is ~14 lines.

Relationship to existing work

Test plan

Reproduction (without the fix):

  • Create a Regional NAT in any VPC (aws ec2 create-nat-gateway --route-table-id rtb-… …)
  • From an MCP client, call get_vpc_network against that VPC
  • Observed: error / hang. The strict dataclass also rejects None for subnet_id even if you guard the dict access.

With the fix:

  • Same call returns a clean vpc_network document, with the regional NAT's subnet_id set to null.
  • Zonal NATs unchanged (the .get() returns the existing string; the dataclass still accepts it).
  • A NAT mid-provisioning with empty NatGatewayAddresses — or addresses lacking either key — no longer crashes; private_ips / public_ips simply skip the missing entries.

I can add a unit test against process_nat_gateways with synthetic zonal + regional + provisioning-in-progress inputs if reviewers prefer; happy to do that in this PR or as a follow-up.

Notes for downstream

Consumers that read subnet_id should already handle None (regional NATs are a property of aws ec2 describe-nat-gateways output and exist whether or not this MCP server reflects them). For downstream consumers that want to surface where a regional NAT lives, the RouteTableId field on the upstream NAT entry is the equivalent. That's out of scope for this PR; happy to add it as a follow-up if useful.
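As a rough illustration of that follow-up idea, a downstream consumer could resolve the attachment point along these lines. The nat_attachment helper and its return labels are hypothetical and not part of this PR; the only assumption about the upstream entry is that it carries either SubnetId or RouteTableId, as described above.

```python
from typing import Any, Dict, Optional, Tuple


def nat_attachment(nat: Dict[str, Any]) -> Tuple[str, Optional[str]]:
    """Return (kind, id) describing where a NAT gateway is attached.

    Hypothetical helper: prefers SubnetId (zonal NAT) and falls back to
    RouteTableId (regional NAT, per the description above).
    """
    if nat.get('SubnetId') is not None:
        return ('subnet', nat['SubnetId'])
    if nat.get('RouteTableId') is not None:
        return ('route_table', nat['RouteTableId'])
    return ('unknown', None)


# Synthetic entries; IDs are made up for illustration.
print(nat_attachment({'NatGatewayId': 'nat-zonal-example', 'SubnetId': 'subnet-example'}))
print(nat_attachment({'NatGatewayId': 'nat-regional-example', 'RouteTableId': 'rtb-example'}))
print(nat_attachment({'NatGatewayId': 'nat-bare-example'}))
```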
