full pass
parent b3211adc46
commit 6d9b41fba5
1 changed file with 174 additions and 158 deletions
|
|
@ -11,14 +11,14 @@
|
|||
<?rfc tocindent="yes"?>
|
||||
<?rfc tocompact="yes"?>
|
||||
|
||||
<rfc category="std" docName="draft-ietf-lsvr-l3dl-01" ipr="trust200902">
|
||||
<rfc category="std" docName="draft-ietf-lsvr-l3dl-02" ipr="trust200902">
|
||||
|
||||
<front>
|
||||
|
||||
<title>Layer 3 Discovery and Liveness</title>
|
||||
|
||||
<author fullname="Randy Bush" initials="R." surname="Bush">
|
||||
<organization>Arrcus & IIJ</organization>
|
||||
<organization>Arrcus & Internet Initiative Japan</organization>
|
||||
<address>
|
||||
<postal>
|
||||
<street>5147 Crystal Springs</street>
|
||||
|
|
@ -60,9 +60,9 @@
|
|||
protocols are used to build topology and reachability databases.
|
||||
These protocols need to discover IP Layer 3 attributes of links,
|
||||
such as logical link IP encapsulation abilities, IP neighbor address
|
||||
discovery, and link liveness. The Layer 3 Discovery and Liveness
|
||||
protocol specified in this document collects these data, which are
|
||||
then disseminated using BGP-SPF and similar protocols.</t>
|
||||
discovery, and link liveness. This Layer 3 Discovery and Liveness
|
||||
protocol collects these data, which may then be disseminated using
|
||||
BGP-SPF and similar protocols.</t>
|
||||
|
||||
</abstract>
|
||||
|
||||
|
|
@ -83,10 +83,10 @@
|
|||
<section anchor="intro" title="Introduction">
|
||||
|
||||
<t>The Massive Data Center (MDC) environment presents unusual
|
||||
problems of scale, e.g. O(10,000) devices, while its homogeneity
|
||||
presents opportunities for simple approaches. Approaches such as
|
||||
Jupiter Rising <xref target="JUPITER"/> use a central controller to
|
||||
deal with scaling, while BGP-SPF <xref
|
||||
problems of scale, e.g. O(10,000) forwarding devices, while its
|
||||
homogeneity presents opportunities for simple approaches.
|
||||
Approaches such as Jupiter Rising <xref target="JUPITER"/> use a
|
||||
central controller to deal with scaling, while BGP-SPF <xref
|
||||
target="I-D.ietf-lsvr-bgp-spf"/> provides massive scale-out without
|
||||
centralization using a tried and tested scalable distributed control
|
||||
plane, offering a scalable routing solution in Clos <xref
|
||||
|
|
@ -99,17 +99,16 @@
|
|||
|
||||
<t>Layer 3 Discovery and Liveness (L3DL) provides brutally simple
|
||||
mechanisms for devices to <list style="symbols">
|
||||
<t>Discover unique identities of devices/ports/... on a logical
|
||||
link,</t>
|
||||
<t>Run Layer 2 keep-alive messages for session continuity,</t>
|
||||
<t>Discover each other's unique endpoint identification,</t>
|
||||
<t>Discover mutually supported encapsulations, e.g. IP/MPLS,</t>
|
||||
<t>Discover mutually supported layer 3 encapsulations,
|
||||
e.g. IP/MPLS,</t>
|
||||
<t>Discover Layer 3 IP and/or MPLS addressing of interfaces of the
|
||||
encapsulations,</t>
|
||||
<t>Enable layer 3 link liveness such as BFD, and finally</t>
|
||||
<t>Present these data, using a very restricted profile of a BGP-LS
|
||||
<xref target="RFC7752"/> API, to BGP-SPF which computes the
|
||||
topology and builds routing and forwarding tables.</t>
|
||||
topology and builds routing and forwarding tables,</t>
|
||||
<t>Enable layer 3 link liveness such as BFD, and finally</t>
|
||||
<t>Provide Layer 2 keep-alive messages for session continuity.</t>
|
||||
</list></t>
|
||||
|
||||
<t>This protocol may be more widely applicable to a range of routing
|
||||
|
|
@ -133,7 +132,7 @@
|
|||
external components using the BGP routing protocol. See <xref
|
||||
target="RFC7752"/>.</t>
|
||||
<t hangText="BGP-SPF">A hybrid protocol using BGP transport but
|
||||
a Dijkstra SPF decision process. See <xref
|
||||
a Dijkstra Shortest Path First decision process. See <xref
|
||||
target="I-D.ietf-lsvr-bgp-spf"/>.</t>
|
||||
<t hangText="Clos:">A hierarchic subset of a crossbar switch
|
||||
topology commonly used in data centers.</t>
|
||||
|
|
@ -141,7 +140,7 @@
|
|||
frame. A full L3DL PDU may be packaged in multiple Datagrams.</t>
|
||||
<t hangText="Encapsulation:">Address Family Indicator and
|
||||
Subsequent Address Family Indicator (AFI/SAFI). I.e. classes of
|
||||
layer 2.5 and 3 addresses such as IPv4, IPv6, MPLS, ...</t>
|
||||
layer 2.5 and 3 addresses such as IPv4, IPv6, MPLS, etc.</t>
|
||||
<t hangText="Frame:">A Layer 2 packet.</t>
|
||||
<t hangText="Link or Logical Link:">A logical connection between
|
||||
two logical ports on two devices. E.g. two VLANs between the same
|
||||
|
|
@ -153,8 +152,8 @@
|
|||
since they are used by all widely deployed Layer 2 network
|
||||
technologies of interest, especially Ethernet. See <xref
|
||||
target="IEEE.802_2001"/>.</t>
|
||||
<t hangText="MDC:">Massive Data Center, commonly thousands of
|
||||
TORs.</t>
|
||||
<t hangText="MDC:">Massive Data Center, commonly composed of
|
||||
thousands of Top of Rack Switches (TORs).</t>
|
||||
<t hangText="MTU:">Maximum Transmission Unit, the size in octets
|
||||
of the largest packet that can be sent on a medium, see <xref
|
||||
target="RFC1122"/> 1.3.3.</t>
|
||||
|
|
@ -201,7 +200,7 @@
|
|||
in interfaces with thousands of disaggregated prefixes.</t>
|
||||
|
||||
<t>Therefore the L3DL protocol is session oriented and uses
|
||||
incremental announcement and withdrawal with hot restart, a la BGP
|
||||
incremental announcement and withdrawal with session restart, a la BGP
|
||||
(<xref target="RFC4271"/>).</t>
|
||||
|
||||
</section>
|
||||
|
|
@ -247,7 +246,7 @@
|
|||
</figure>
|
||||
|
||||
<t>There are two protocols, the inter-device per-link layer 3
|
||||
discovery and the interface to the upper level BGP-like API:
|
||||
discovery and the API to the upper level BGP-like routing protocol:
|
||||
<list style="symbols">
|
||||
|
||||
<t>Inter-device PDUs are used to exchange device and logical link
|
||||
|
|
@ -272,21 +271,21 @@
|
|||
<section anchor="ilpo" title="Inter-Link Protocol Overview">
|
||||
|
||||
<t>Two devices discover each other and their respective identities
|
||||
by sending multicast HELLO PDUs (<xref target="hello"/>). To allow
|
||||
by sending multicast HELLO PDUs (<xref target="hello"/>). To assure
|
||||
discovery of new devices coming up on a multi-link topology, devices
|
||||
on such a topology send periodic HELLOs forever, see <xref
|
||||
target="dhello"/>.</t>
|
||||
|
||||
<t>Once a new device is recognized, both devices attempt to
|
||||
negotiate and establish peering by sending unicast OPEN PDUs (<xref
|
||||
target="open"/>). In an established peering, the Encapsulations
|
||||
(<xref target="afisafi"/>) configured on an end point may be
|
||||
announced and modified. Note that these are only the encapsulation
|
||||
and addresses on the announcing interface; though a device's
|
||||
loopback interface(s) may also be announced. When two devices on a
|
||||
link have compatible Encapsulations and addresses, i.e. the same
|
||||
AFI/SAFI and the same subnet, the link is announced via the BGP-LS
|
||||
API.</t>
|
||||
negotiate and establish a session by sending unicast OPEN PDUs
|
||||
(<xref target="open"/>). In an established session, the
|
||||
Encapsulations (<xref target="afisafi"/>) configured on an end point
|
||||
may be announced and modified. Note that these are only the
|
||||
encapsulation and addresses configured on the announcing interface;
|
||||
though a device's loopback and overlay interface(s) may also be
|
||||
announced. When two devices on a link have compatible
|
||||
Encapsulations and addresses, i.e. the same AFI/SAFI and the same
|
||||
subnet, the link is announced via the BGP-LS API.</t>
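The compatibility test described above (same AFI/SAFI and same subnet) can be sketched as follows. This is only an illustration: the function name `compatible` is hypothetical, AFI/SAFI is approximated by IP version alone, and MPLS label handling is ignored.

```python
import ipaddress


def compatible(local_if: str, peer_if: str) -> bool:
    """Return True when both interface addresses share an IP version and subnet.

    Assumption: each argument is an address with prefix length, such as
    "192.0.2.1/31". IP version stands in for the AFI/SAFI comparison.
    """
    a = ipaddress.ip_interface(local_if)
    b = ipaddress.ip_interface(peer_if)
    return a.version == b.version and a.network == b.network
```

Under this sketch, `compatible("192.0.2.1/31", "192.0.2.0/31")` is true, while two addresses in different /31s, or an IPv4 and an IPv6 address, are not compatible and the link would not be announced.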
|
||||
|
||||
<section anchor="ladder" title="L3DL Ladder Diagram">
|
||||
|
||||
|
|
@ -302,7 +301,7 @@
|
|||
PDUs are optional; though at least one encapsulation SHOULD be
|
||||
agreed at some point.</t>
|
||||
|
||||
<t>The following is a ladder-style sketch of the L3DL protocol
|
||||
<t>The following is a ladder-style diagram of the L3DL protocol
|
||||
exchanges:</t>
|
||||
|
||||
<figure>
|
||||
|
|
@ -380,8 +379,8 @@
|
|||
<section anchor="transport" title="Transport Layer">
|
||||
|
||||
<t>L3DL PDUs are carried by a simple transport layer which allows
|
||||
long PDUs to occupy many Ethernet frames. An L3DL frame is referred
|
||||
to as a Datagram.</t>
|
||||
PDUs to occupy many Ethernet frames. An L3DL Ethernet frame is
|
||||
referred to as a Datagram.</t>
|
||||
|
||||
<t>The L3DL Transport Layer encapsulates each Datagram using a
|
||||
common transport header.</t>
|
||||
|
|
@ -402,7 +401,7 @@
|
|||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
||||
| Datagram Length | Checksum ~
|
||||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
||||
~ | Payload... |
|
||||
~ | Payload... ~
|
||||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
||||
</artwork>
|
||||
</figure>
|
||||
|
|
@ -411,9 +410,9 @@
|
|||
<list style="hanging">
|
||||
|
||||
<t hangText="Version:">Seven-bit Version number of the protocol,
|
||||
currently 0. Values other than 0 are treated as errors. The
|
||||
protocol version needs to be in one and only one place, so it is in
|
||||
the datagram as opposed to, for example, the PDU header.</t>
|
||||
currently 0. Values other than 0 MUST BE treated as an error.
|
||||
The protocol version needs to be in one and only one place, so it
|
||||
is in the datagram as opposed to, for example, the PDU header.</t>
|
||||
|
||||
<t hangText="L:">A bit that set to one if this Datagram is the
|
||||
last Datagram of the PDU. For a PDU which fits in only one
|
||||
|
|
@ -436,6 +435,12 @@
|
|||
thereof.</t>
|
||||
|
||||
</list></t>
|
||||
|
||||
<t>To avoid the need for a receiver to reassemble two PDUs at the
|
||||
same time, a sender MUST NOT send a subsequent PDU when a PDU is
|
||||
already in flight and not yet acknowledged if it is an ACKed PDU
|
||||
Type.</t>
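One possible reading of the rule above is a sender that queues new PDUs while an ACKed-type PDU is unacknowledged. The sketch below is an assumption-laden illustration, not the draft's mechanism: the class name, the `acked` key, and the choice to hold back every PDU (including non-ACKed types) while waiting are all hypothetical.

```python
from collections import deque


class AckGatedSender:
    """Hold back new PDUs while an ACKed-type PDU is still unacknowledged."""

    def __init__(self, transmit):
        self._transmit = transmit   # callable that puts one PDU on the wire
        self._pending = deque()     # PDUs waiting for their turn
        self._in_flight = None      # the ACKed-type PDU awaiting its ACK

    def submit(self, pdu):
        """Queue a PDU (a dict with an 'acked' flag) and send it if allowed."""
        self._pending.append(pdu)
        self._pump()

    def on_ack(self):
        """Called when the ACK for the in-flight PDU arrives."""
        self._in_flight = None
        self._pump()

    def _pump(self):
        # Send queued PDUs until one of them needs an ACK.
        while self._pending and self._in_flight is None:
            pdu = self._pending.popleft()
            self._transmit(pdu)
            if pdu["acked"]:
                self._in_flight = pdu
```

With this gate, a receiver never sees Datagrams of two ACKed PDUs interleaved, so it only ever reassembles one PDU at a time.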
|
||||
|
||||
</section>
|
||||
|
||||
<section anchor="checksum" title="The Checksum">
|
||||
|
|
@ -528,7 +533,7 @@ uint32_t sbox_checksum_32(const uint8_t *b, const size_t n)
|
|||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
||||
| Sig Type | Signature Length | ~
|
||||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +
|
||||
~ Signature |
|
||||
~ Signature ~
|
||||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
||||
</artwork>
|
||||
</figure>
|
||||
|
|
@ -557,7 +562,7 @@ uint32_t sbox_checksum_32(const uint8_t *b, const size_t n)
|
|||
|
||||
<t hangText="Signature Length:">The length of the Signature,
|
||||
possibly including padding, in octets. If Sig Type is 0,
|
||||
Signature Length must be 0.</t>
|
||||
Signature Length MUST BE 0.</t>
|
||||
|
||||
<t hangText="Signature:">The result of running the signature
|
||||
algorithm specified in Sig Type over all octets of the PDU except
|
||||
|
|
@ -636,10 +641,6 @@ uint32_t sbox_checksum_32(const uint8_t *b, const size_t n)
|
|||
|
||||
<section anchor="hello" title="HELLO">
|
||||
|
||||
<t>WARNING: The second multicast address below is incorrect. We
|
||||
need to get a new assignment. , which is what we really wanted with the second address
|
||||
below.</t>
|
||||
|
||||
<t>The HELLO PDU is unique in that it is encapsulated in a multicast
|
||||
Ethernet frame. It solicits response(s) from other LLEI(s) on the
|
||||
link. See <xref target="dhello"/> for why multicast is used. The
|
||||
|
|
@ -649,13 +650,15 @@ uint32_t sbox_checksum_32(const uint8_t *b, const size_t n)
|
|||
|
||||
<t hangText="01-80-C2-00-00-0E:">Nearest Bridge = Propagation
|
||||
constrained to a single physical link; stopped by all types of
|
||||
bridges (including MPRs (media converters)).</t>
|
||||
bridges (including MPRs (media converters)). This SHOULD BE used
|
||||
when the link is known to be a simple point to point link.</t>
|
||||
|
||||
<t hangText="To Be Assigned:"> When a switch receives a frame with
|
||||
a multicast destination MAC it does not recognize, it forwards to
|
||||
all ports. This destination MAC is to be sent when the interface
|
||||
is known to be connected to a switch. See <xref
|
||||
target="ieee"/>.</t>
|
||||
target="ieee"/>. This SHOULD BE used when the link may be a
|
||||
multi-point link.</t>
|
||||
|
||||
<?rfc subcompact="no"?></list></t>
|
||||
|
||||
|
|
@ -664,11 +667,12 @@ uint32_t sbox_checksum_32(const uint8_t *b, const size_t n)
|
|||
exchange.</t>
|
||||
|
||||
<t>When an interface is turned up on a device, it SHOULD issue a
|
||||
HELLO.</t>
|
||||
HELLO if it is to participate in L3DL sessions.</t>
|
||||
|
||||
<t>If a constrained destination address configured, see above, then
|
||||
the HELLO need not be repeated once a session has been created by an
|
||||
exchange of OPENs.</t>
|
||||
<t>If a constrained Nearest Bridge destination address is configured
|
||||
for a point-to-point interface, see above, then the HELLO SHOULD NOT
|
||||
be repeated once a session has been created by an exchange of
|
||||
OPENs.</t>
|
||||
|
||||
<t>If the configured destination address is one that is propagated
|
||||
by switches, the HELLO SHOULD be repeated at a configured interval,
|
||||
|
|
@ -696,8 +700,8 @@ uint32_t sbox_checksum_32(const uint8_t *b, const size_t n)
|
|||
separate logical link.</t>
|
||||
|
||||
<t>When a HELLO is received from a source MAC address with which
|
||||
there is no established L3DL adjacency, the receiver SHOULD respond
|
||||
with an OPEN PDU. The two devices establish an L3DL adjacency by
|
||||
there is no established L3DL session, the receiver SHOULD respond
|
||||
with an OPEN PDU. The two devices establish an L3DL session by
|
||||
exchanging OPEN PDUs.</t>
|
||||
|
||||
<t>The Payload Length is zero as there is no payload.</t>
|
||||
|
|
@ -711,7 +715,7 @@ uint32_t sbox_checksum_32(const uint8_t *b, const size_t n)
|
|||
|
||||
<t>Each device has learned the other's MAC Address from the HELLO
|
||||
exchange, see <xref target="hello"/>. Therefore the OPEN and
|
||||
subsequent PDUs are unicast, as opposed to the HELLO's multicast
|
||||
subsequent PDUs MUST BE unicast, as opposed to the HELLO's multicast
|
||||
frame.</t>
|
||||
|
||||
<!--
|
||||
|
|
@ -754,9 +758,10 @@ q-->
|
|||
<t>My LLEI is the sender's LLEI, see <xref target="llei"/>.</t>
|
||||
|
||||
<t>AttrCount is the number of attributes in the Attribute List.
|
||||
Attributes are single octets whose semantics are user-defined.</t>
|
||||
Attributes are single octets the semantics of which are
|
||||
operator-defined.</t>
|
||||
|
||||
<t>A node may have zero or more user-defined attributes, e.g.
|
||||
<t>A node may have zero or more operator-defined attributes, e.g.:
|
||||
spine, leaf, backbone, route reflector, arabica, ...</t>
|
||||
|
||||
<t>Attribute syntax and semantics are local to an operator or
|
||||
|
|
@ -767,19 +772,19 @@ q-->
|
|||
target="tlv"/>.</t>
|
||||
|
||||
<t>Key Length is a 16-bit field denoting the length in octets of the
|
||||
Key itself, not including the Auth Type or the Key Lengths. If
|
||||
there is no Key, the Auth Type and key Length MUST both be zero.</t>
|
||||
Key itself, not including the Auth Type or the Key Length. If there
|
||||
is no Key, the Auth Type and key Length MUST both be zero.</t>
|
||||
|
||||
<t>The Key is specific to the operational environment. A failure to
|
||||
authenticate is a failure to start the L3DL session, an ERROR PDU is
|
||||
sent (Error Code 2), and HELLOs MUST be restarted.</t>
|
||||
authenticate is a failure to start the L3DL session, an ERROR PDU
|
||||
MUST BE sent (Error Code 2), and HELLOs MUST be restarted.</t>
|
||||
|
||||
<t>The Serial Number is that of the last received and processed
|
||||
Encapsulation PDU. This allows a receiver sending an OPEN to tell
|
||||
the sender that the receiver wants to resume a session and the
|
||||
sender only needs to send data more recent than the Serial Number.
|
||||
If this OPEN is not trying to restart a lost session, the Serial
|
||||
Number MUST be set to zero.</t>
|
||||
<t>The Serial Number is that of the last received and processed PDU.
|
||||
This allows a receiver sending an OPEN to tell the sender that the
|
||||
receiver wants to resume a session and the sender only needs to send
|
||||
data more recent than the Serial Number. If this OPEN is not trying
|
||||
to restart a lost session, the Serial Number MUST BE set to
|
||||
zero.</t>
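The resume behavior described above can be pictured as a filter over what the sender has already announced. The following is a minimal sketch under stated assumptions: the `history` list of `(serial, pdu)` pairs and the function name are hypothetical, and wrap-around of the 32-bit Serial Number is not handled.

```python
def pdus_to_resend(history, peer_serial):
    """Return the PDUs whose serial is more recent than the peer's Serial Number.

    history: list of (serial, pdu) tuples in the order they were sent.
    peer_serial: Serial Number carried in the peer's OPEN; zero means the
    peer is not resuming, so everything is sent.
    """
    return [pdu for serial, pdu in history if serial > peer_serial]
```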
|
||||
|
||||
<t>The Signature fields are described in <xref target="tlv"/> and in
|
||||
an asymmetric key environment serve as a proof of possession of the
|
||||
|
|
@ -791,19 +796,29 @@ q-->
|
|||
keep the session semantics alive. The timing and acceptable drop of
|
||||
KEEPALIVE PDUs are discussed in <xref target="keepalive"/>.</t>
|
||||
|
||||
<t>If a sender of OPEN does not receive an ACK of the OPEN PDU Type,
|
||||
then they MUST resend the same OPEN PDU, with the same Nonce.
|
||||
Resending an unacknowledged OPEN PDU, like other ACKed PDUs, SHOULD
|
||||
use exponential back-off, see <xref target="RFC1122"/>.</t>
|
||||
<t>If a sender of OPEN does not receive an ACK of the OPEN PDU, then
|
||||
they MUST resend the same OPEN PDU, with the same Nonce. Resending
|
||||
an unacknowledged OPEN PDU, like other ACKed PDUs, SHOULD use
|
||||
exponential back-off, see <xref target="RFC1122"/>.</t>
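A retransmission schedule with exponential back-off might look like the sketch below. The initial delay, multiplier, and ceiling are illustrative assumptions; the text above does not specify them.

```python
from itertools import islice


def backoff_delays(initial=1.0, factor=2.0, ceiling=64.0):
    """Yield retransmission delays in seconds, growing by factor up to ceiling.

    All three defaults are assumed values chosen for illustration only.
    """
    delay = initial
    while True:
        yield delay
        delay = min(delay * factor, ceiling)


# First five waits before resending the same OPEN PDU (same Nonce).
first_five = list(islice(backoff_delays(), 5))
```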
|
||||
|
||||
<t>If a properly authenticated OPEN arrives with a new Nonce from an
|
||||
LLEI with which the receiving logical link endpoint believes it
|
||||
already has an L3DL session (OPENs have already been exchanged), the
|
||||
receiver MAY assume that the sending LLEI or entire device has been
|
||||
reset. If the Serial Number in the OPEN is zero, then all
|
||||
discovered encapsulation data SHOULD be withdrawn via the BGP-LS API
|
||||
and the recipient MUST respond with a new OPEN. In this
|
||||
circumstance encapsulations SHOULD NOT be kept.</t>
|
||||
already has an L3DL session (OPENs have already been exchanged), and
|
||||
the Serial Number in the OPEN is non-zero, the receiver SHOULD
|
||||
establish a new session by sending an OPEN with the Serial Number of
|
||||
the last data it received. Each party MUST resume sending
|
||||
encapsulations etc. subsequent to the other party's Serial Number.
|
||||
And each MUST retain all previously discovered encapsulation and
|
||||
other data.</t>
|
||||
|
||||
<t>If a properly authenticated OPEN arrives with a new Nonce from an
|
||||
LLEI with which the receiving logical link endpoint believes it
|
||||
already has an L3DL session (OPENs have already been exchanged), and
|
||||
the Serial Number in the OPEN is zero, then the receiver MUST assume
|
||||
that the sending LLEI or entire device has been reset. All
|
||||
previously discovered encapsulation data MUST NOT be kept and MUST
|
||||
be withdrawn via the BGP-LS API and the recipient MUST respond with
|
||||
a new OPEN.</t>
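The two cases above, for an OPEN with a new Nonce arriving on an existing session, can be sketched as one decision function. The session dictionary, its keys, and the returned structure are hypothetical; replying with Serial Number zero in the reset case is an assumption based on the rule that an OPEN not resuming a session carries zero.

```python
def handle_reopen(session, open_serial):
    """Decide how to react to an authenticated OPEN with a new Nonce.

    session: dict with 'learned' (list of discovered encapsulations) and
    'last_rx_serial' (Serial Number of the last data received).
    Returns the Serial Number to put in our OPEN and the data to withdraw
    via the BGP-LS API.
    """
    if open_serial != 0:
        # Peer wants to resume: keep everything, tell it where we left off.
        return {"reply_serial": session["last_rx_serial"], "withdraw": []}
    # Peer was reset: drop and withdraw all previously discovered data.
    withdrawn = list(session["learned"])
    session["learned"].clear()
    session["last_rx_serial"] = 0
    return {"reply_serial": 0, "withdraw": withdrawn}
```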
|
||||
|
||||
</section>
|
||||
|
||||
|
|
@ -836,7 +851,7 @@ q-->
|
|||
PDU, etc.</t>
|
||||
|
||||
<t>The ACKed PDU is the PDU Type of the PDU being acknowledged,
|
||||
e.g., OPEN or one of the Encapsulations.</t>
|
||||
e.g., OPEN, one of the Encapsulations, etc.</t>
|
||||
|
||||
<t>If there was an error processing the received PDU, then the EType
|
||||
is non-zero. If the EType is zero, Error Code and Error Hint MUST
|
||||
|
|
@ -848,12 +863,21 @@ q-->
|
|||
error.</t>
|
||||
|
||||
<t>The decimal value of EType gives a strong hint how the receiver
|
||||
sending the ACK believes things should proceed. The ETypes are
|
||||
listed in <xref target="iana-error"/>. Someone stuck in the 1990s
|
||||
might think of the error codes as 0x1zzz, 0x2zzz, etc. They might
|
||||
be right. Or not.</t>
|
||||
sending the ACK believes things should proceed:
|
||||
<list style="empty">
|
||||
<?rfc subcompact="yes"?>
|
||||
<t>0 - No Error, Error Code and Error Hint MUST be zero</t>
|
||||
<t>1 - Warning, something not too serious happened, continue</t>
|
||||
<t>2 - Session should not be continued, try to restart</t>
|
||||
<t>3 - Restart is hopeless, call the operator</t>
|
||||
<t>4-15 - Reserved</t>
|
||||
<?rfc subcompact="no"?>
|
||||
</list></t>
|
||||
|
||||
<t>The Error Code indicates the type of error.</t>
|
||||
<t>The Error Codes, noting protocol failures listed in this document,
|
||||
are listed in <xref target="iana-error"/>. Someone stuck in the
|
||||
1990s might think of the catenation of EType and Error Code as an echo
|
||||
of 0x1zzz, 0x2zzz, etc. They might be right; or not.</t>
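The EType values 0 through 3 and the reserved range 4-15 can be read as a small dispatch table. The sketch below is illustrative only: the enum member names and the returned action strings are hypothetical, and the 0..15 bound is inferred from the reserved range rather than stated as a field width.

```python
from enum import IntEnum


class EType(IntEnum):
    NO_ERROR = 0
    WARNING = 1
    RESTART = 2
    CALL_OPERATOR = 3


def next_step(etype: int) -> str:
    """Map an EType value to the action it hints at."""
    if not 0 <= etype <= 15:
        raise ValueError("EType out of range 0..15")
    if etype >= 4:
        return "reserved"
    return {
        EType.NO_ERROR: "continue",
        EType.WARNING: "continue",
        EType.RESTART: "restart session",
        EType.CALL_OPERATOR: "stop and alert operator",
    }[EType(etype)]
```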
|
||||
|
||||
<t>The Error Hint is any additional data the sender of the error PDU
|
||||
thinks will help the recipient or the debugger with the particular
|
||||
|
|
@ -873,8 +897,7 @@ q-->
|
|||
case of this ACK failure.</t>
|
||||
|
||||
<t>If the link is broken at layer 2, retransmission MAY BE retried
|
||||
when the link comes back up if data have not changed in the
|
||||
interim.</t>
|
||||
when the link is restored.</t>
|
||||
|
||||
</section>
|
||||
|
||||
|
|
@ -887,11 +910,10 @@ q-->
|
|||
session is considered established, and the devices SHOULD exchange
|
||||
L3 interface encapsulations, L3 addresses, and L2.5 labels.</t>
|
||||
|
||||
<t>The Encapsulation types the peers exchange may be IPv4
|
||||
Announcement (<xref target="ipv4"/>), IPv6 Announcement (<xref
|
||||
target="ipv6"/>), MPLS IPv4 Announcement (<xref target="mpls4"/>),
|
||||
MPLS IPv6 Announcement (<xref target="mpls6"/>), and/or possibly
|
||||
others not defined here.</t>
|
||||
<t>The Encapsulation types the peers exchange may be IPv4 (<xref
|
||||
target="ipv4"/>), IPv6 (<xref target="ipv6"/>), MPLS IPv4 (<xref
|
||||
target="mpls4"/>), MPLS IPv6 (<xref target="mpls6"/>), and/or
|
||||
possibly others not defined here.</t>
|
||||
|
||||
<t>The sender of an Encapsulation PDU MUST NOT assume that the peer
|
||||
is capable of the same Encapsulation Type. An ACK (<xref
|
||||
|
|
@ -937,12 +959,12 @@ q-->
|
|||
</artwork>
|
||||
</figure>
|
||||
|
||||
<t>The 24-bit Count is the number of Encapsulations in the
|
||||
Encapsulation list.</t>
|
||||
|
||||
<t>An Encapsulation PDU describes zero or more addresses of the
|
||||
encapsulation type.</t>
|
||||
|
||||
<t>The 24-bit Count is the number of Encapsulations in the
|
||||
Encapsulation list.</t>
|
||||
|
||||
<t>The Serial Number is a monotonically increasing 32-bit value
|
||||
representing the sender's state in time. It may be an integer, a
|
||||
timestamp, etc. On session restart (new OPEN), a receiver MAY
|
||||
|
|
@ -950,7 +972,7 @@ q-->
|
|||
send newer data.</t>
|
||||
|
||||
<t>If a sender has multiple links on the same interface, separate
|
||||
state: data, ACKs, etc. must be kept for each peer.</t>
|
||||
state: data, ACKs, etc. must be kept for each peer session.</t>
|
||||
|
||||
<t>Over time, multiple Encapsulation PDUs may be sent for an
|
||||
interface as configuration changes.</t>
|
||||
|
|
@ -988,9 +1010,10 @@ q-->
|
|||
</artwork>
|
||||
</figure>
|
||||
|
||||
<t>An Encapsulation PDU of Type T may announce new and/or withdraw
|
||||
old encapsulations of Type T. It indicates this with the Ann/With
|
||||
Encapsulation Flag, Announce == 1, Withdraw == 0.</t>
|
||||
<t>Each encapsulation in an Encapsulation PDU of Type T may
|
||||
announce new and/or withdraw old encapsulations of Type T. It
|
||||
indicates this with the Ann/With Encapsulation Flag, Announce ==
|
||||
1, Withdraw == 0.</t>
|
||||
|
||||
<t>Each Encapsulation interface address in an Encapsulation PDU is
|
||||
either a new encapsulation to be announced (Ann/With == 1) (yes, a la
|
||||
|
|
@ -1006,20 +1029,18 @@ q-->
|
|||
be marked as primary for a particular encapsulation type.</t>
|
||||
|
||||
<t>An Encapsulation interface address in an Encapsulation PDU MAY
|
||||
be marked as a loopback, in which case the Loopback bit is
|
||||
set.</t>
|
||||
|
||||
<t>Loopback addresses are generally not seen directly on an
|
||||
external interface. One or more loopback addresses MAY be exposed
|
||||
by configuration on one or more L3DL speaking external interfaces,
|
||||
be marked as a loopback, in which case the Loopback bit is set.
|
||||
Loopback addresses are generally not seen directly on an external
|
||||
interface. One or more loopback addresses MAY be exposed by
|
||||
configuration on one or more L3DL speaking external interfaces,
|
||||
e.g. for iBGP peering. They SHOULD be marked as such, Loopback
|
||||
Flag == 1.</t>
|
||||
|
||||
<t>Each Encapsulation interface address in an Encapsulation PDU is
|
||||
that of the direct 'underlay' interface (Under/Over == 1), or an
|
||||
'overlay' address (Under/Over == 0), likely that of a VM or
|
||||
container guest bridged on to the interface with an underlay
|
||||
address.</t>
|
||||
container guest bridged or configured on to the interface already
|
||||
having an underlay address.</t>
|
||||
|
||||
</section>
|
||||
|
||||
|
|
@ -1053,7 +1074,8 @@ q-->
|
|||
</artwork>
|
||||
</figure>
|
||||
|
||||
<t>The 24-bit Count is the number of IPv4 Encapsulations.</t>
|
||||
<t>The 24-bit Count is the number of IPv4 Encapsulations being
|
||||
announced and/or withdrawn.</t>
|
||||
|
||||
</section>
|
||||
|
||||
|
|
@ -1094,7 +1116,8 @@ q-->
|
|||
</artwork>
|
||||
</figure>
|
||||
|
||||
<t>The 24-bit Count is the number of IPv6 Encapsulations.</t>
|
||||
<t>The 24-bit Count is the number of IPv6 Encapsulations being
|
||||
announced and/or withdrawn.</t>
|
||||
|
||||
</section>
|
||||
|
||||
|
|
@ -1160,7 +1183,8 @@ q-->
|
|||
</artwork>
|
||||
</figure>
|
||||
|
||||
<t>The 24-bit Count is the number of MPLSv4 Encapsulations.</t>
|
||||
<t>The 24-bit Count is the number of MPLSv4 Encapsulations being
|
||||
announced and/or withdrawn.</t>
|
||||
|
||||
</section>
|
||||
|
||||
|
|
@ -1169,7 +1193,7 @@ q-->
|
|||
<t>The MPLS IPv4 Encapsulation describes a logical link's ability
|
||||
to exchange labeled IPv4 packets on one or more subnets. It does
|
||||
so by stating the interface's addresses, the corresponding prefix
|
||||
lengths, and the corresponding labels which will be accepted fpr
|
||||
lengths, and the corresponding labels which will be accepted for
|
||||
each address.</t>
|
||||
<!--
|
||||
protocol "PDU Type = 7:8,Payload Length:32,Count:24,Serial Number:32,Encaps Flags:8,MPLS Label List ...:16,IPv6 Address:128,Prefix Len:8,more ...:8,Sig Type:8,Signature Length:16,Signature ...:32"
|
||||
|
|
@ -1203,13 +1227,8 @@ q-->
|
|||
</artwork>
|
||||
</figure>
|
||||
|
||||
<t>The 24-bit Count is the number of MPLSv6 Encapsulations.</t>
|
||||
|
||||
<t>The MPLS IPv6 Encapsulation describes a logical link's ability
|
||||
to exchange labeled IPv6 packets on one or more subnets. It does
|
||||
so by stating the interface's addresses, the corresponding prefix
|
||||
lengths, and the corresponding labels which will be accepted fpr
|
||||
each address.</t>
|
||||
<t>The 24-bit Count is the number of MPLSv6 Encapsulations being
|
||||
announced and/or withdrawn.</t>
|
||||
|
||||
</section>
|
||||
</section>
|
||||
|
|
@ -1256,26 +1275,6 @@ q-->
|
|||
|
||||
<section anchor="keepalive" title="KEEPALIVE - Layer 2 Liveness">
|
||||
|
||||
<t>L3DL devices SHOULD beacon frequent Layer 2 KEEPALIVE PDUs to
|
||||
ensure session continuity. A receiver may choose to ignore
|
||||
KEEPALIVE PDUs.</t>
|
||||
|
||||
<t>An operational deployment MUST BE configured whether to use
|
||||
KEEPALIVEs or not, either globally, or down to per-link granularity.
|
||||
Disagreement MAY result in repeated session break and
|
||||
reestablishment.</t>
|
||||
|
||||
<t>KEEPALIVEs SHOULD be beaconed at a configured frequency. One per
|
||||
second is the default. Layer 3 liveness, such as BFD, may be more
|
||||
(or less) aggressive.</t>
|
||||
|
||||
<t>If a KEEPALIVE is not received from a peer with which a receiver
|
||||
has an open session for a configurable time (default 30 seconds),
|
||||
the link SHOULD BE presumed down. The devices MAY keep
|
||||
configuration state and restore it without retransmission if no data
|
||||
have changed. Otherwise, a new session SHOULD BE established and
|
||||
new Encapsulation PDUs exchanged.</t>
|
||||
|
||||
<!--
|
||||
protocol "PDU Type = 2:8,Payload Length = 0:32,Sig Type = 0:8,Signature Length = 0:16"
|
||||
-->
|
||||
|
|
@ -1292,6 +1291,31 @@ q-->
|
|||
</artwork>
|
||||
</figure>
|
||||
|
||||
<t>L3DL devices SHOULD beacon frequent Layer 2 KEEPALIVE PDUs to
|
||||
ensure session continuity. A receiver may choose to ignore
|
||||
KEEPALIVE PDUs.</t>
|
||||
|
||||
<t>An operational deployment MUST BE configured whether to use
|
||||
KEEPALIVEs or not, either globally, or down to per-link granularity.
|
||||
Disagreement MAY result in repeated session break and
|
||||
reestablishment.</t>
|
||||
|
||||
<t>KEEPALIVEs SHOULD be beaconed at a configured frequency. One per
|
||||
second is the default. Layer 3 liveness, such as BFD, may be more
|
||||
(or less) aggressive.</t>
|
||||
|
||||
<t>When a sender transmits a PDU which is not a KEEPALIVE, the
|
||||
sender SHOULD reset the KEEPALIVE timer. I.e. sending any PDU acts
|
||||
as a keepalive. Once the last fragment has been sent, the
|
||||
KEEPALIVE timer SHOULD BE restarted. Do not wait for the ACK.</t>
|
||||
|
||||
<t>If a KEEPALIVE or other PDUs have not been received from a peer
|
||||
with which a receiver has an open session for a configurable time
|
||||
(default 30 seconds), the link SHOULD BE presumed down. The devices
|
||||
MAY keep configuration state and restore it without retransmission
|
||||
if no data have changed. Otherwise, a new session SHOULD BE
|
||||
established and new Encapsulation PDUs exchanged.</t>
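The timer behavior described in this section can be sketched as a small tracker. The defaults of one second and 30 seconds come from the text above; the class name, method names, and the use of caller-supplied timestamps are assumptions made for illustration.

```python
class Liveness:
    """Track when a KEEPALIVE is due and when the peer is presumed down."""

    def __init__(self, send_interval=1.0, dead_after=30.0, now=0.0):
        self.send_interval = send_interval  # default: one KEEPALIVE per second
        self.dead_after = dead_after        # default: 30 seconds of silence
        self.last_tx = now
        self.last_rx = now

    def on_pdu_sent(self, now):
        # Any transmitted PDU acts as a keepalive; do not wait for its ACK.
        self.last_tx = now

    def on_pdu_received(self, now):
        # A KEEPALIVE or any other PDU from the peer counts as liveness.
        self.last_rx = now

    def keepalive_due(self, now):
        return now - self.last_tx >= self.send_interval

    def peer_presumed_down(self, now):
        return now - self.last_rx >= self.dead_after
```

Passing timestamps in explicitly keeps the sketch deterministic; a real implementation would more likely read a monotonic clock.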
|
||||
|
||||
</section>
|
||||
|
||||
<section anchor="l3liveness" title="Layers 2.5 and 3 Liveness">
|
||||
|
|
@ -1303,7 +1327,7 @@ q-->
|
|||
technique.</t>
|
||||
|
||||
<t>This protocol assumes that one or more Encapsulation addresses
|
||||
will be used to ping, run BFD, or whatever the operator
|
||||
may be used to ping, run BFD, or whatever the operator
|
||||
configures.</t>
|
||||
|
||||
</section>
|
||||
|
|
@ -1317,7 +1341,7 @@ q-->
|
|||
LLEIs and Encapsulations on each logical link interface.</t>
|
||||
|
||||
<t>Full topology discovery is not appropriate at the L3DL layer, so
|
||||
Dijkstra à la IS-IS etc. is assumed to be done by higher level
|
||||
Dijkstra a la IS-IS etc. is assumed to be done by higher level
|
||||
protocols such as BGP-SPF.</t>
|
||||
|
||||
<t>Therefore the LLEIs, link Encapsulations, and state changes are
|
||||
|
|
@ -1370,24 +1394,15 @@ q-->
|
|||
|
||||
<section anchor="dhello" title="HELLO Discussion">
|
||||
|
||||
<!--
|
||||
|
||||
<t>There is the question of whether to allow an intermediate
|
||||
switch to be transparent to discovery. We consider that an
|
||||
interface on a device is a Layer 2 or a Layer 3 interface. In
|
||||
theory it could be a Layer 3 interface with no encapsulation or
|
||||
Layer 3 addressing currently configured.</t>
|
||||
-->
|
||||
|
||||
<t>A device with multiple Layer 2 interfaces, traditionally called
|
||||
a switch, may be used to forward frames and therefore packets from
|
||||
multiple devices to one logical interface (LLEI), I, on an L3DL
|
||||
speaking device. Interface I could discover a peer J across the
|
||||
switch. Later, a prospective peer K could come up across the
|
||||
switch. If I was not still sending and listening for HELLOs, the
|
||||
potential peering with K could not be discovered. Therefore,
|
||||
interfaces MUST continue to send HELLOs as long as they are turned
|
||||
up.</t>
|
||||
potential peering with K could not be discovered. Therefore,
|
||||
multi-link interfaces MUST continue to send HELLOs as long as they
|
||||
are turned up.</t>
|
||||
|
||||
</section>
|
||||
|
||||
|
|
@ -1444,15 +1459,15 @@ q-->
|
|||
encapsulation, the implementation MAY mark it as primary by
|
||||
default.</t>
|
||||
|
||||
<t>An implementation SHOULD allow optional configuration which
|
||||
updates the local forwarding table with overlay and underlay data
|
||||
both learned from L3DL peers and configured locally.</t>
|
||||
<t>An implementation MAY allow optional configuration which updates
|
||||
the local forwarding table with overlay and underlay data both
|
||||
learned from L3DL peers and configured locally.</t>
|
||||
|
||||
</section>
|
||||
|
||||
<section anchor="security" title="Security Considerations">
|
||||
|
||||
<t>The protocol as it is MUST NOT be used outside a datacenter or
|
||||
<t>The protocol as is MUST NOT be used outside a datacenter or
|
||||
similarly closed environment due to lack of formal definition of the
|
||||
authentication and authorization mechanism. Sufficient mechanisms
|
||||
may be described in separate documents.</t>
|
||||
|
|
@ -1588,12 +1603,13 @@ q-->
|
|||
|
||||
<section anchor="acks" title="Acknowledgments">
|
||||
|
||||
<t>The authors thank Cristel Pelsser for multiple reviews, Jeff Haas
|
||||
for review and comments, Joe Clarke for a useful review, John
|
||||
Scudder for deeply serious review and comments, Larry Kreeger for a
|
||||
lot of layer 2 clue, Martijn Schmidt for his contribution, Neeraj
|
||||
Malhotra for review, Russ Housley for checksum discussion and sBox,
|
||||
and Steve Bellovin for checksum advice.</t>
|
||||
<t>The authors thank Cristel Pelsser for multiple reviews, Harsha
|
||||
Kovuru for comments during implementation, Jeff Haas for review and
|
||||
comments, Joe Clarke for a useful review, John Scudder for deeply
|
||||
serious review and comments, Larry Kreeger for a lot of layer 2
|
||||
clue, Martijn Schmidt for his contribution, Neeraj Malhotra for
|
||||
review, Russ Housley for checksum discussion and sBox, and Steve
|
||||
Bellovin for checksum advice.</t>
|
||||
|
||||
</section>
|
||||
|
||||
|
|
|
|||