-17 submitted

Randy Bush 2025-03-03 09:28:58 -08:00
parent f5b98ec717
commit b9832f9b11


@ -98,17 +98,12 @@
<t>The Massive Data Center (MDC) environment presents unusual
problems of scale, e.g. O(10,000) forwarding devices, while its
homogeneity presents opportunities for simple approaches.
Approaches such as <eref target="https://xml2rfc.tools.ietf.org/public/rfc/bibxml-doi/reference.DOI.10.1145/2975159.xml?anchor=JUPITER">&quot;Jupiter Rising: A study of non-blocking switching networks&quot; [PAYWALLED]</eref>
<!-- <xref target="JUPITER"/>
--> use a
Approaches such as <xref target="JUPITER"/> use a
central controller to deal with scaling, while BGP-SPF <xref
target="I-D.ietf-lsvr-bgp-spf"/> provides massive scale-out without
centralization using a tried and tested scalable distributed control
plane, offering a scalable routing solution in
<!-- <xref target="Clos0"/> -->
<eref target="https://en.wikipedia.org/wiki/Clos_network/">&quot;Clos Networks&quot;</eref>
<!--<xref target="Clos1"/>
-->
<xref target="Clos"/>
and similar environments.
But BGP-SPF and similar higher level device-spanning protocols,
e.g. <xref target="I-D.malhotra-bess-evpn-lsoe"/>, need logical link
@ -119,12 +114,12 @@
<t>Layer-3 Discovery and Liveness (L3DL) provides brutally simple
mechanisms for devices to <list style="symbols">
<t>Discover each other's unique endpoint identification,</t>
<t>Discover mutually supported layer-3 encapsulations, e.g.
IP/MPLS,</t>
<t>Discover Layer-3 IP and/or MPLS addressing of interfaces of the
encapsulations,</t>
<t>Discover mutually supported layer-3 and layer-2.5
encapsulations, e.g. IP/MPLS,</t>
<t>Discover Layer-3 IP and/or layer-2.5 MPLS addressing of
interfaces of the encapsulations,</t>
<t>Present these data, using a very restricted profile of a BGP-LS
<xref target="RFC7752"/> API, to BGP-SPF which computes the
<xref target="RFC9552"/> API, to BGP-SPF which computes the
topology and builds routing and forwarding tables,</t>
<t>Enable Layer-3 link liveness such as BFD,</t>
<t>Provide Layer-2 keep-alive messages for session continuity,</t>
@ -163,7 +158,7 @@
<t hangText="BGP-LS:">A mechanism by which link-state and TE
information can be collected from networks and shared with
external components using the BGP routing protocol. See <xref
target="RFC7752"/>.</t>
target="RFC9552"/>.</t>
<t hangText="BGP-SPF">A hybrid protocol using BGP transport but
a Dijkstra Shortest Path First decision process. See <xref
target="I-D.ietf-lsvr-bgp-spf"/>.</t>
@ -228,8 +223,6 @@
assumed. Familiarity with BGP-SPF, <xref
target="I-D.ietf-lsvr-bgp-spf"/>, might be useful. </t>
<t>L3DL assumes a new IEEE assigned EtherType (TBD).</t>
<t>The number of addresses of one Encapsulation type on an interface
link may be quite large given a TOR with tens of servers, each
server having a few hundred micro-services, resulting in an
@ -237,9 +230,10 @@
migration can cause serious address prefix disaggregation, resulting
in interfaces with thousands of disaggregated prefixes.</t>
<t>Therefore the L3DL protocol is session oriented and uses
incremental announcement and withdrawal with session restart, a la
BGP (<xref target="RFC4271"/>).</t>
<t>To provide the scalability, reliability, ordering, etc. for the
above, the L3DL protocol is session oriented and uses incremental
announcement and withdrawal with session restart, a la BGP (<xref
target="RFC4271"/>).</t>
</section>
@ -249,11 +243,11 @@
<t>Devices discover each other on logical links</t>
<t>Logical Link Endpoint Identifiers (LLEIs) are exchanged</t>
<t>Layer-2 Liveness checks may be started</t>
<t>Encapsulation data are exchanged and IP-Level Liveness checks
<t>Encapsulation data are exchanged and layer-3 Liveness checks
enabled</t>
<t>A BGP-like upper layer protocol is assumed to use the
identifiers and encapsulation data to discover and build a topology
database</t>
<t>A BGP-like upper layer protocol (BGP-SPF in this example) is
assumed to use the identifiers and encapsulation data to discover
and build a topology database</t>
</list></t>
<figure>
@ -286,13 +280,14 @@
<t>There are two protocols, the inter-device (left-right in the
diagram) per-link layer-3 discovery and the API to the upper level
BGP-like routing protocol (up-down in the above diagram):
(BGP-SPF in this example) routing protocol (up-down in the above
diagram):
<list style="symbols">
<t>Inter-device PDUs are used to exchange device and logical link
identities and layer-2.5 (MPLS) and 3 identifiers (not payloads),
e.g. device IDs, port identities, VLAN IDs, Encapsulations, and IP
addresses.</t>
<t>Inter-device PDUs are used to exchange device/system and
logical link identities (see <xref target="llei"/>) and layer-2.5
(MPLS) and 3 identifiers (not payloads), e.g. device IDs, port
identities, VLAN IDs, Encapsulations, and IP addresses.</t>
<t>A Link Layer to BGP API presents these data up the stack to
a BGP protocol or another device-spanning upper layer protocol,
@ -300,11 +295,14 @@
</list></t>
<t>The upper layer BGP family routing protocols cross all the
devices, though they are not part of these L3DL protocols.</t>
<t>L3DL assumes a new IEEE assigned EtherType (TBD).</t>
<t>To simplify this document, Layer-2 framing is not shown. L3DL is
about layer-3.</t>
<t>The upper layer BGP family routing protocols cross all the
devices, though they are not part of the L3DL protocol.</t>
<t>To simplify this document, Layer-2 framing is not shown.
Ethernet framing is extremely well documented elsewhere; see <xref
target="EtherFrame"/>.</t>
</section>
@ -316,6 +314,9 @@
on such a topology, and only on a multi-link topology, send periodic
HELLOs forever, see <xref target="dhello"/>.</t>
<t>Devices may be connected directly or through an intermediate
device, see <xref target="hello"/>.</t>
<t>Once a new device is recognized, both devices attempt to
negotiate and establish a session by sending unicast OPEN PDUs
(<xref target="open"/>) to the source MAC addresses (plus VIDs if
@ -324,10 +325,10 @@
target="afisafi"/>) configured on an end point may be announced and
modified. Note that these are only the encapsulation and addresses
configured on the announcing interface; though a device's loopback
and overlay interface(s) may also be announced. When two devices on
a link have compatible Encapsulations and addresses, i.e. the same
AFI/SAFI and the same subnet, the link is announced via the BGP-LS
API.</t>
and any pseudo/overlay interface(s) may also be announced. When two
devices on a link have compatible Encapsulations and addresses,
i.e. the same Encapsulation and the same subnet, the link is
announced via the BGP-LS API.</t>
<section anchor="ladder" title="L3DL Ladder Diagram">
@ -337,11 +338,11 @@
the identities of logical link endpoint(s) reachable from a
Logical Link Endpoint, <xref target="llei"/>.</t>
<t>The HELLO and OPEN, <xref target="open"/>, PDUs, which are used
to discover and exchange detailed Logical Link Endpoint
Identifiers, LLEIs, and the ACK/ERROR PDU, are mandatory; other
PDUs are optional; though at least one encapsulation SHOULD be
agreed at some point.</t>
<t>The HELLO, <xref target="hello"/>, and OPEN, <xref
target="open"/>, PDUs, which are used to discover and exchange
detailed Logical Link Endpoint Identifiers, LLEIs, and the
ACK/ERROR PDU, are mandatory; other PDUs are optional; though at
least one encapsulation SHOULD be agreed at some point.</t>
<t>The following is a ladder-style diagram of the L3DL protocol
exchanges:</t>
@ -1774,7 +1775,7 @@ uint32_t sbox_checksum_32(const uint8_t *b, const size_t n)
<section anchor="ls" title="Use BGP-LS as Much as Possible">
<t>BGP-LS <xref target="RFC7752"/> defines BGP-like Datagrams
<t>BGP-LS <xref target="RFC9552"/> defines BGP-like Datagrams
describing logical link state (links, nodes, link prefixes, and
many other things), and a new BGP path attribute providing
Northbound transport, all of which can be ingested by upper layer
@ -2499,7 +2500,7 @@ uint32_t sbox_checksum_32(const uint8_t *b, const size_t n)
<?rfc include="reference.RFC.5226.xml"?>
<?rfc include="reference.RFC.5880.xml"?>
<?rfc include="reference.RFC.6286.xml"?>
<?rfc include="reference.RFC.7752.xml"?>
<?rfc include="reference.RFC.9552.xml"?>
<?rfc include="reference.RFC.8174.xml"?>
<?rfc include="reference.I-D.ietf-idr-bgpls-segment-routing-epe.xml"?>
<?rfc include="reference.I-D.ietf-idr-bgp-ls-segment-routing-ext.xml"?>
@ -2553,9 +2554,16 @@ uint32_t sbox_checksum_32(const uint8_t *b, const size_t n)
<?rfc include="reference.RFC.5280.xml"?>
<?rfc include="reference.RFC.7210.xml"?>
<?rfc include="reference.I-D.malhotra-bess-evpn-lsoe.xml"?>
<!--
<?rfc include="https://xml2rfc.tools.ietf.org/public/rfc/bibxml-doi/reference.DOI.10.1145/2975159.xml?anchor=JUPITER"?>
<reference anchor="Clos0" >
<reference anchor="JUPITER" target="http://conferences.sigcomm.org/sigcomm/2015/pdf/papers/p183.pdf">
<front>
<title>Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google's Datacenter Network</title>
<author>
<organization>Google</organization>
</author>
<date month="August" year="2015"/>
</front>
</reference>
<reference anchor="Clos" target="" >
<front>
<title>A study of non-blocking switching networks [PAYWALLED]</title>
<author initials="C." surname="Clos" fullname="Charles Clos">
@ -2565,14 +2573,15 @@ uint32_t sbox_checksum_32(const uint8_t *b, const size_t n)
</front>
<seriesInfo name="Bell System Technical Journal" value="32 (2), pp 406-424"/>
</reference>
<reference anchor="Clos1" target="https://en.wikipedia.org/wiki/Clos_network/">
<reference anchor="EtherFrame" target="https://ieeexplore.ieee.org/document/8457469">
<front>
<title>Clos Network</title>
<author/>
<date/>
<title>802.3-2018 - IEEE Standard for Ethernet</title>
<author>
<organization>IEEE</organization>
</author>
<date month="August" year="2018"/>
</front>
</reference>
-->
</references>
</back>