Skip to content

feat: add cns iptables reconciliation #3885

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

QxBytes
Copy link
Contributor

@QxBytes QxBytes commented Aug 1, 2025

Reason for Change:

CNS currently programs snat iptables rules only on NNC update/create. This PR makes it so that it also does so during cns bootup. Additionally, CNS will now attempt to "reconcile" the iptables on the node to what it expects. This allows us to "rollback" future updates to CNS iptables rules to match the expected state each time CNS reboots or restarts.

It is currently possible for CNI to add iptables rules to the node using chain "SWIFT". We want to jump to the cns-controlled chain "SWIFT-POSTROUTING" before we jump to the chain "SWIFT". The first piece of logic is to migrate to that:

Case 1: CNI already added a jump to SWIFT chain in nat POSTROUTING chain and it is at a higher priority than any SWIFT-POSTROUTING jump we may have added: We find the position of the SWIFT jump rule, and insert our SWIFT-POSTROUTING jump at that position, pushing the SWIFT jump rule to a lower priority

POSTROUTING
jump to KUBE POSTROUTING
...
jump to SWIFT
jump to SWIFT-POSTROUTING

becomes

POSTROUTING
jump to KUBE POSTROUTING
...
jump to SWIFT-POSTROUTING
jump to SWIFT

or

POSTROUTING
jump to KUBE POSTROUTING
...
jump to SWIFT

becomes

POSTROUTING
jump to KUBE POSTROUTING
...
jump to SWIFT-POSTROUTING
jump to SWIFT

Case 2: The SWIFT-POSTROUTING jump rule exists before the SWIFT jump rule, or exists without any SWIFT jump rule: We do not modify the nat POSTROUTING chain

POSTROUTING
jump to KUBE POSTROUTING
...
jump to SWIFT-POSTROUTING
jump to SWIFT

or

POSTROUTING
jump to KUBE POSTROUTING
...
jump to SWIFT-POSTROUTING

stays the same

Case 3: The nat POSTROUTING chain does not contain a SWIFT jump rule nor a SWIFT-POSTROUTING rule: We append the SWIFT-POSTROUTING jump rule to the nat POSTROUTING chain

POSTROUTING
jump to KUBE POSTROUTING
...

becomes

POSTROUTING
jump to KUBE POSTROUTING
...
jump to SWIFT-POSTROUTING

Now, packets will go to the nat SWIFT-POSTROUTING chain first. We then take a look at the SWIFT-POSTROUTING chain.
Case 1: The nat SWIFT-POSTROUTING chain does not contain every single rule we expect exactly, or the # of rules in the SWIFT-POSTROUTING chain does not match how many rules we expect: Flush the chain and reinstall the iptables rules we expect.

Case 2: The nat SWIFT-POSTROUTING chain contains every single rule we expect exactly, and the # of rules matches how many rules we expect: We do not modify or flush the SWIFT-POSTROUTING chain

Issue Fixed:

Requirements:

Notes:

  • Check iptables insert of bounds just in case
  • When we list iptables rules, the first rule is always -N ... for custom chains or -P ... for built in chains (not shown in iptables -nvL -t nat). Feel free to check-- the library calls iptables -t <table> -S <chain> under the hood.
  • When we list iptables rules, the rules use -A syntax when listed no matter how they were added (via insert, append, etc.)
  • IPTables is 1-indexed, so insert at position 1 will be the first rule in the chain-- so when using iptables -nvL -t nat the first rule that shows up in a chain is at index 1.

@QxBytes QxBytes self-assigned this Aug 1, 2025
@QxBytes QxBytes added the cns Related to CNS. label Aug 1, 2025
@QxBytes QxBytes requested a review from Copilot August 1, 2025 23:54
Copilot

This comment was marked as outdated.

@QxBytes QxBytes requested a review from Copilot August 2, 2025 00:01
Copilot

This comment was marked as outdated.

@QxBytes QxBytes force-pushed the alew/add-cns-reconciliation branch from a00472e to f4af45b Compare August 2, 2025 00:09
@QxBytes QxBytes requested a review from Copilot August 2, 2025 00:09
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds iptables reconciliation functionality to CNS, enabling it to restore iptables rules to the expected state during startup and handle migration from legacy SWIFT chains to SWIFT-POSTROUTING chains. The changes ensure CNS can recover from external iptables modifications and properly sequence rule execution.

  • Adds iptables reconciliation during CNS bootup to restore expected SNAT rules
  • Implements migration logic to transition from legacy SWIFT chain to SWIFT-POSTROUTING chain with proper priority ordering
  • Enhances the iptables mock implementation to support new operations needed for testing

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
cns/restserver/util.go Adds reconciliation logic during CNS startup to reprogram SNAT rules for all containers
cns/restserver/restserver.go Extends iptables interface with List, ClearChain, and Delete methods
cns/restserver/internalapi_linux.go Implements comprehensive iptables reconciliation and migration logic from SWIFT to SWIFT-POSTROUTING
cns/restserver/internalapi_linux_test.go Adds extensive tests covering migration scenarios and rule reconciliation
cns/fakes/iptablesfake.go Enhances mock implementation with proper iptables simulation and tracking
Comments suppressed due to low confidence (1)

cns/fakes/iptablesfake.go:192

  • [nitpick] The method name 'ClearChainCallCount' is ambiguous - it could mean either getting the count or clearing the count. Consider renaming to 'GetClearChainCallCount' to clearly indicate it's a getter method.
func (c *IPTablesMock) ClearChainCallCount() int {

Comment on lines +92 to +93
logger.Printf("[Azure CNS] Inserting SWIFT-POSTROUTING Chain at iptables position %d", swiftRuleIndex)
err = ipt.Insert(iptables.Nat, iptables.Postrouting, swiftRuleIndex, "-j", SWIFTPOSTROUTING)
Copy link
Preview

Copilot AI Aug 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The insert position calculation may be incorrect. The comment indicates that swiftRuleIndex gives the correct 1-indexed iptables position, but if swiftRuleIndex is len(rules), this could exceed valid iptables positions. Consider adding bounds checking or using swiftRuleIndex+1 when swiftRuleIndex equals len(rules).

Suggested change
logger.Printf("[Azure CNS] Inserting SWIFT-POSTROUTING Chain at iptables position %d", swiftRuleIndex)
err = ipt.Insert(iptables.Nat, iptables.Postrouting, swiftRuleIndex, "-j", SWIFTPOSTROUTING)
// Ensure insert position does not exceed valid iptables positions
insertPos := swiftRuleIndex
if swiftRuleIndex >= len(rules) {
insertPos = len(rules)
}
logger.Printf("[Azure CNS] Inserting SWIFT-POSTROUTING Chain at iptables position %d", insertPos)
err = ipt.Insert(iptables.Nat, iptables.Postrouting, insertPos, "-j", SWIFTPOSTROUTING)

Copilot uses AI. Check for mistakes.

Copy link
Contributor Author

@QxBytes QxBytes Aug 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rules correspond to what is received from List(), which includes the built-in -P or -N rule. So, if you see one rule with iptables -nvL -t nat, List gives us the built in rule (-P or -N) and also the one rule we see, leading to a length of 2. If we insert using iptables index 2, it ends up at the end of the chain (after the one rule we see from iptables -nvL -t nat.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cns Related to CNS.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant