Session Date/Time: 29 Mar 2023 06:30
cats
Summary
The first meeting of the CATS (Computing-Aware Traffic Steering) working group reviewed the charter and milestones and discussed initial draft contributions covering problem statements, use cases, requirements, a framework architecture, and compute resource modeling and distribution. The meeting highlighted the need for clear definitions, consolidation of overlapping drafts, and consideration of existing work in related areas.
Key Discussion Points
- Charter and Milestones: The chairs emphasized focusing on the deliverables and milestones in the charter, which include adopting drafts on problem statements, use cases, and gap analysis (of existing protocols and requirements), and developing a framework and architecture. The chairs also cautioned against focusing on solutions before this groundwork is established.
- Problem Statement and Use Cases: Two problem statement and use case drafts were presented. Concerns were raised about the realism and scope of certain use cases (e.g., autonomous vehicle braking), the lack of explicit mention of bandwidth considerations, the relevance of the SFC (Service Function Chaining) use cases, and the potential inclusion of policy considerations. The chairs suggested merging the documents into a single problem statement and use case document, clarifying its scope, and distinguishing among motivating, potential, and future use cases.
- Requirements and Gap Analysis: A draft outlining requirements and gap analysis was presented. Requirements include supporting dynamic access based on metrics, establishing a metric model for computing resources, defining a representation for computing resource status, maintaining session continuity and service affinity, and avoiding exposure of sensitive information. Gaps were identified in existing solutions regarding architecture, efficiency, and metric exposure.
- Framework and Architecture: A framework architecture draft was presented, defining functional components such as CATS routers (ingress and egress), agents for collecting metrics, and a path selector. Key aspects include an overlay architecture, CATS service IDs, and binding IDs. Concerns were raised about the definition of "compute" and about whether distributed or centralized models are needed.
- Compute Resource Modeling and Distribution: A hierarchical three-layer model for computing resource evaluation was presented, encompassing hardware, node-level, and service-level indexes. The discussion included the potential addition of TPU modeling and raised the question of how computing metrics should be distributed in centralized and distributed modes.
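As a rough illustration of the role a path selector could play in the framework discussed above, the following Python sketch picks an egress by combining a compute metric with a network metric. Everything in it is an assumption made for illustration: the `ServiceInstance` fields, the normalized load scale, the delay cap, and the weighted-sum cost are hypothetical and are not taken from any of the drafts presented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceInstance:
    # All fields are hypothetical; the drafts do not define this structure.
    egress: str              # identifier of the egress router in front of the instance
    compute_load: float      # normalized load: 0.0 (idle) to 1.0 (saturated)
    network_delay_ms: float  # path delay from the ingress to this egress


def select_egress(instances, max_delay_ms=50.0, load_weight=0.5):
    """Return the instance with the lowest combined cost, or None if there are none.

    The cost is an assumed weighted sum of compute load and normalized delay.
    """
    if not instances:
        return None

    def cost(inst):
        # Cap the delay term at 1.0 so both terms share the same 0..1 scale.
        delay_term = min(inst.network_delay_ms / max_delay_ms, 1.0)
        return load_weight * inst.compute_load + (1.0 - load_weight) * delay_term

    return min(instances, key=cost)


candidates = [
    ServiceInstance("egress-a", compute_load=0.9, network_delay_ms=5.0),
    ServiceInstance("egress-b", compute_load=0.2, network_delay_ms=20.0),
]
best = select_egress(candidates)
print(best.egress)
```

With equal weights, the nearer but heavily loaded instance loses to the farther, lightly loaded one, which is the kind of trade-off that steering on network metrics alone cannot express.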
Decisions and Action Items
- Action Item: Authors of the problem statement and use case drafts to collaborate and consolidate their work into a single document defining the scope and distinguishing between types of use cases.
- Action Item: Authors of the requirements and gap analysis draft to bundle the requirements section with the problem statement and separate the protocol analysis into its own document.
- Action Item: The working group to define the meaning of "compute" clearly.
Next Steps
- Authors to revise and update drafts based on feedback received during the meeting.
- Continued discussion on the mailing list and GitHub to refine the drafts and address open issues.
- Chairs to coordinate with the ALTO chairs on compute metric standardization.
- Chairs to discuss the inclusion of energy considerations within the working group's scope with the relevant AD.