Session Date/Time: 27 Jul 2023 22:30
rift
Summary
The RIFT meeting in San Francisco covered the progress of the base specification, key-value registry, auto-flood reflection, and YANG model. There was also discussion of RIFT extensions for SRv6 and of DC routing for high-performance computing (HPC), specifically dragonfly topologies. The meeting also covered implementation efforts, potential interoperability testing, and future directions for the RIFT protocol.
Key Discussion Points
- Base Spec Progress: Readability improvements in version 18. Addressing AD comments related to applicability draft considerations such as miscabling, TTL values, and IPv4 broadcast usage. Focus on addressing all existing comments and reviews.
- Key Value Registry: Minor maintenance and editorial fixes. Dependent on the base spec being finalized.
- Auto-Flood Reflection: Fixed errors in ASCII topologies. Added matching SVGs. Completed the YANG model. Discussed working group adoption after the base spec is finalized.
- RIFT YANG Model: Nearing completion with minor modifications. Further review is encouraged. The model has been submitted to the IESG, but further IESG processing will wait for base spec stability.
- RIFT Extensions for SRv6 (SRv6 Locator KV TIE): Initial attempt to support SRv6 in data centers. Discussion of the scalability challenges associated with flooding large key-value entries. Need to avoid a broadcast approach and instead target configuration distribution. Suggestion to evolve the draft into a more generic mechanism for configuration distribution.
- DC Routing for High-Performance Compute (Dragonfly Topologies): Explored the applicability of RIFT to dragonfly topologies for HPC and potentially AI/ML clusters. Concerns about the complexity of BGP solutions. Agreement that RIFT could support dragonfly topologies with summarization. Discussion of horizontal links between spines. Concerns were raised about supporting non-equal-cost multipath and minimizing latency for best performance in sparse dragonfly implementations. Congestion notification and awareness were also discussed, as were HPC versus ML use cases with dragonfly.
Decisions and Action Items
- Jordan Head (Juniper): Double-check and address all outstanding AD comments on the base spec and reach out to Drew.
- Sanghi (ZTE): Update the working group on progress towards interoperability testing, and investigate the possibility of a RIFT hackathon in Prague.
- Tony P. (Juniper): Investigate how to address concerns about scaling the size of the locator key-value TIE in the SRv6 draft and how to generalize the draft.
- Authors of RIFT extensions for SRv6 (SRv6 Locator KV TIE): Evolve the draft to be more generic to handle various use cases.
Next Steps
- Finalize the RIFT base specification.
- Progress towards working group adoption of auto-flood reflection.
- Continued discussion on applying RIFT to dragonfly topologies and high-performance computing environments, in particular AI/ML, and on synchronization methodologies.
- Potential interim meeting to discuss primary/secondary approaches for state synchronization.