Session Date/Time: 07 Nov 2023 08:30

# rtgwg Meeting Minutes - IETF 118 Prague

## Summary

The Routing Area Working Group (rtgwg) met at IETF 118 in Prague. The agenda covered several drafts: TI-LFA, Multi-Segment SD-WAN, Security Considerations for Packet Metadata, Hierarchical SFC with Segment Routing, SRv6 Reliability, Scalable Zero Touch Routing (KIRA), Service ID for Addressing and Networking, AI Data Center Reliability, and Coordinated Congestion Management for AI Training Networks. Key discussions focused on the operational safety of TI-LFA, whether new proposals should offer normative or informational guidance, and reliability issues specific to AI training networks.

## Key Discussion Points

*   **TI-LFA Draft (Stewart):**
    *   Area reviews raised concerns that TI-LFA is operationally unsafe when deployed without microloop avoidance.
    *   Discussion on whether the post-convergence path constraint should be optional.
    *   Debate on whether a combined TI-LFA/microloop avoidance document is necessary or whether the two should remain separate.
    *   Suggestion to include an operational section providing guidance on limitations and microloop mitigation strategies.

*   **Multi-Segment SD-WAN (Linda):**
    *   Security concerns regarding man-in-the-middle attacks on the GENEVE header.
    *   Discussion of potential solutions: AH authentication, a lightweight integrity check (HMAC), or doing nothing.

*   **Security Considerations for Packet Metadata (Donald Eastlake):**
    *   Draft aims to provide security considerations for adding metadata to packets.
    *   Discussion on whether the draft should be normative or informational. The chair suggested that the authors clarify the document's goal in order to determine how it should progress.
    *   Key security considerations: minimization, encryption, obfuscation.

*   **Hierarchical SFC with Segment Routing (Presenter):**
    *   Presented problem statement, use cases, and requirements for hierarchical SFC.
    *   Discussion on active OAM in SFC and which entity is a function forwarder.

*   **SRv6 Reliability (Leon):**
    *   Proposed reliability mechanisms for SRv6 SFCs.
    *   Discussion on failure detection mechanisms (e.g., BFD) and potential routing loops.

*   **Scalable Zero Touch Routing (KIRA) (Roland):**
    *   Presented KIRA, a scalable zero-touch routing protocol for control planes.
    *   Discussion about the differences between KIRA and Babel, its relationship to ad-hoc networking, and its applicability to data planes.
    *   The presenter was encouraged to engage with the MANET community.

*   **Service ID for Addressing and Networking (Stan Von):**
    *   Presented the concept of service IDs for identifying user-oriented services across terminals, networks, and clouds.
    *   Clarification requested on the relationship between service IDs and existing technologies like APN.

*   **AI Data Center Reliability (Presenter):**
    *   Highlighted the importance of network reliability in AI training.
    *   Discussion on existing reliability mechanisms and their limitations for AI training.
    *   Proposal to shorten fault detection and switchover times to sub-millisecond levels.

*   **Coordinated Congestion Management for AI Training Networks (Lily Lu):**
    *   Presented coordinated congestion management for AI training networks.
    *   Classified congestion types (fan-in, in-network, incast).
    *   Proposed distinct handling for CC traffic and non-CC traffic.

## Decisions and Action Items

*   **TI-LFA:** Shepherd and co-authors to work on updated text addressing the microloop concerns, potentially including an operational considerations section. A side meeting is to be scheduled to focus on the issue.
*   **General:** The chair emphasized the importance of specifying the end goal of each draft.

## Next Steps

*   **TI-LFA:** Side meeting to be scheduled to discuss microloop avoidance strategies.
*   **Service ID and AI cluster drafts:** Discuss further on the mailing list and at side meetings.
*   **General:** Authors to continue refining drafts based on feedback received.