full pass

Randy Bush 2019-07-07 10:27:54 -07:00
parent b3211adc46
commit 6d9b41fba5

@ -11,14 +11,14 @@
<?rfc tocindent="yes"?>
<?rfc tocompact="yes"?>
<rfc category="std" docName="draft-ietf-lsvr-l3dl-01" ipr="trust200902">
<rfc category="std" docName="draft-ietf-lsvr-l3dl-02" ipr="trust200902">
<front>
<title>Layer 3 Discovery and Liveness</title>
<author fullname="Randy Bush" initials="R." surname="Bush">
<organization>Arrcus &amp; IIJ</organization>
<organization>Arrcus &amp; Internet Initiative Japan</organization>
<address>
<postal>
<street>5147 Crystal Springs</street>
@ -60,9 +60,9 @@
protocols are used to build topology and reachability databases.
These protocols need to discover IP Layer 3 attributes of links,
such as logical link IP encapsulation abilities, IP neighbor address
discovery, and link liveness. The Layer 3 Discovery and Liveness
protocol specified in this document collects these data, which are
then disseminated using BGP-SPF and similar protocols.</t>
discovery, and link liveness. This Layer 3 Discovery and Liveness
protocol collects these data, which may then be disseminated using
BGP-SPF and similar protocols.</t>
</abstract>
@ -83,10 +83,10 @@
<section anchor="intro" title="Introduction">
<t>The Massive Data Center (MDC) environment presents unusual
problems of scale, e.g. O(10,000) devices, while its homogeneity
presents opportunities for simple approaches. Approaches such as
Jupiter Rising <xref target="JUPITER"/> use a central controller to
deal with scaling, while BGP-SPF <xref
problems of scale, e.g. O(10,000) forwarding devices, while its
homogeneity presents opportunities for simple approaches.
Approaches such as Jupiter Rising <xref target="JUPITER"/> use a
central controller to deal with scaling, while BGP-SPF <xref
target="I-D.ietf-lsvr-bgp-spf"/> provides massive scale-out without
centralization using a tried and tested scalable distributed control
plane, offering a scalable routing solution in Clos <xref
@ -99,17 +99,16 @@
<t>Layer 3 Discovery and Liveness (L3DL) provides brutally simple
mechanisms for devices to <list style="symbols">
<t>Discover unique identities of devices/ports/... on a logical
link,</t>
<t>Run Layer 2 keep-alive messages for session continuity,</t>
<t>Discover each other's unique endpoint identification,</t>
<t>Discover mutually supported encapsulations, e.g. IP/MPLS,</t>
<t>Discover mutually supported layer 3 encapsulations,
e.g. IP/MPLS,</t>
<t>Discover Layer 3 IP and/or MPLS addressing of interfaces of the
encapsulations,</t>
<t>Enable layer 3 link liveness such as BFD, and finally</t>
<t>Present these data, using a very restricted profile of a BGP-LS
<xref target="RFC7752"/> API, to BGP-SPF which computes the
topology and builds routing and forwarding tables.</t>
topology and builds routing and forwarding tables,</t>
<t>Enable layer 3 link liveness such as BFD, and finally</t>
<t>Provide Layer 2 keep-alive messages for session continuity.</t>
</list></t>
<t>This protocol may be more widely applicable to a range of routing
@ -133,7 +132,7 @@
external components using the BGP routing protocol. See <xref
target="RFC7752"/>.</t>
<t hangText="BGP-SPF">A hybrid protocol using BGP transport but
a Dijkstra SPF decision process. See <xref
a Dijkstra Shortest Path First decision process. See <xref
target="I-D.ietf-lsvr-bgp-spf"/>.</t>
<t hangText="Clos:">A hierarchic subset of a crossbar switch
topology commonly used in data centers.</t>
@ -141,7 +140,7 @@
frame. A full L3DL PDU may be packaged in multiple Datagrams.</t>
<t hangText="Encapsulation:">Address Family Indicator and
Subsequent Address Family Indicator (AFI/SAFI). I.e. classes of
layer 2.5 and 3 addresses such as IPv4, IPv6, MPLS, ...</t>
layer 2.5 and 3 addresses such as IPv4, IPv6, MPLS, etc.</t>
<t hangText="Frame:">A Layer 2 packet.</t>
<t hangText="Link or Logical Link:">A logical connection between
two logical ports on two devices. E.g. two VLANs between the same
@ -153,8 +152,8 @@
since they are used by all widely deployed Layer 2 network
technologies of interest, especially Ethernet. See <xref
target="IEEE.802_2001"/>.</t>
<t hangText="MDC:">Massive Data Center, commonly thousands of
TORs.</t>
<t hangText="MDC:">Massive Data Center, commonly composed of
thousands of Top of Rack Switches (TORs).</t>
<t hangText="MTU:">Maximum Transmission Unit, the size in octets
of the largest packet that can be sent on a medium, see <xref
target="RFC1122"/> 1.3.3.</t>
@ -201,7 +200,7 @@
in interfaces with thousands of disaggregated prefixes.</t>
<t>Therefore the L3DL protocol is session oriented and uses
incremental announcement and withdrawal with hot restart, a la BGP
incremental announcement and withdrawal with session restart, a la BGP
(<xref target="RFC4271"/>).</t>
</section>
@ -247,7 +246,7 @@
</figure>
<t>There are two protocols, the inter-device per-link layer 3
discovery and the interface to the upper level BGP-like API:
discovery and the API to the upper level BGP-like routing protocol:
<list style="symbols">
<t>Inter-device PDUs are used to exchange device and logical link
@ -272,21 +271,21 @@
<section anchor="ilpo" title="Inter-Link Protocol Overview">
<t>Two devices discover each other and their respective identities
by sending multicast HELLO PDUs (<xref target="hello"/>). To allow
by sending multicast HELLO PDUs (<xref target="hello"/>). To assure
discovery of new devices coming up on a multi-link topology, devices
on such a topology send periodic HELLOs forever, see <xref
target="dhello"/>.</t>
<t>Once a new device is recognized, both devices attempt to
negotiate and establish peering by sending unicast OPEN PDUs (<xref
target="open"/>). In an established peering, the Encapsulations
(<xref target="afisafi"/>) configured on an end point may be
announced and modified. Note that these are only the encapsulation
and addresses on the announcing interface; though a device's
loopback interface(s) may also be announced. When two devices on a
link have compatible Encapsulations and addresses, i.e. the same
AFI/SAFI and the same subnet, the link is announced via the BGP-LS
API.</t>
negotiate and establish a session by sending unicast OPEN PDUs
(<xref target="open"/>). In an established session, the
Encapsulations (<xref target="afisafi"/>) configured on an end point
may be announced and modified. Note that these are only the
encapsulation and addresses configured on the announcing interface;
though a device's loopback and overlay interface(s) may also be
announced. When two devices on a link have compatible
Encapsulations and addresses, i.e. the same AFI/SAFI and the same
subnet, the link is announced via the BGP-LS API.</t>
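As a non-normative illustration of the compatibility rule above, the following Python sketch treats two announced addresses as compatible when they share an AFI/SAFI and fall in the same subnet. The `Encap` record, its field names, and the string labels used for AFI/SAFI are assumptions made for this example; they are not defined by the draft.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Encap:
    """Hypothetical record for one announced interface address."""
    afi_safi: str    # illustrative label, e.g. "ipv4" or "mpls-ipv4"
    address: str     # interface address in text form
    prefix_len: int  # prefix length announced with the address


def same_subnet(a: Encap, b: Encap) -> bool:
    """True when both ends use the same AFI/SAFI and the same subnet."""
    if a.afi_safi != b.afi_safi:
        return False
    net_a = ipaddress.ip_interface(f"{a.address}/{a.prefix_len}").network
    net_b = ipaddress.ip_interface(f"{b.address}/{b.prefix_len}").network
    return net_a == net_b


def link_is_announceable(local: list, remote: list) -> bool:
    """True when any local/remote pair is compatible."""
    return any(same_subnet(l, r) for l in local for r in remote)
```

In this sketch a link would be handed to the BGP-LS API only when `link_is_announceable` holds for the two peers' current announcements.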
<section anchor="ladder" title="L3DL Ladder Diagram">
@ -302,7 +301,7 @@
PDUs are optional; though at least one encapsulation SHOULD be
agreed at some point.</t>
<t>The following is a ladder-style sketch of the L3DL protocol
<t>The following is a ladder-style diagram of the L3DL protocol
exchanges:</t>
<figure>
@ -380,8 +379,8 @@
<section anchor="transport" title="Transport Layer">
<t>L3DL PDUs are carried by a simple transport layer which allows
long PDUs to occupy many Ethernet frames. An L3DL frame is referred
to as a Datagram.</t>
PDUs to occupy many Ethernet frames. An L3DL Ethernet frame is
referred to as a Datagram.</t>
<t>The L3DL Transport Layer encapsulates each Datagram using a
common transport header.</t>
@ -402,7 +401,7 @@
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Datagram Length | Checksum ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~ | Payload... |
~ | Payload... ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
</artwork>
</figure>
@ -411,9 +410,9 @@
<list style="hanging">
<t hangText="Version:">Seven-bit Version number of the protocol,
currently 0. Values other than 0 are treated as errors. The
protocol version needs to be in one and only one place, so it is in
the datagram as opposed to, for example, the PDU header.</t>
currently 0. Values other than 0 MUST BE treated as an error.
The protocol version needs to be in one and only one place, so it
is in the datagram as opposed to, for example, the PDU header.</t>
<t hangText="L:">A bit that is set to one if this Datagram is the
last Datagram of the PDU. For a PDU which fits in only one
@ -436,6 +435,12 @@
thereof.</t>
</list></t>
<t>To avoid the need for a receiver to reassemble two PDUs at the
same time, a sender MUST NOT send a subsequent PDU when a PDU is
already in flight and not yet acknowledged if it is an ACKed PDU
Type.</t>
</section>
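The one-PDU-in-flight rule above can be sketched, non-normatively, as a small stop-and-wait queue. The class name `AckedSender` and the `transmit` callback are hypothetical, and the sketch assumes every PDU handed to it is of an ACKed PDU Type; PDUs that are not ACKed would bypass it.

```python
from collections import deque


class AckedSender:
    """Holds back ACKed-type PDUs until the previous one is acknowledged."""

    def __init__(self, transmit):
        self._transmit = transmit  # hypothetical callable that emits one PDU
        self._pending = deque()    # PDUs waiting for their turn
        self._in_flight = None     # the PDU awaiting its ACK, if any

    def send(self, pdu):
        """Queue a PDU; it goes out only if nothing is in flight."""
        self._pending.append(pdu)
        self._pump()

    def on_ack(self):
        """Called when the ACK for the in-flight PDU arrives."""
        self._in_flight = None
        self._pump()

    def _pump(self):
        if self._in_flight is None and self._pending:
            self._in_flight = self._pending.popleft()
            self._transmit(self._in_flight)
```

With this shape a receiver never has to reassemble Datagrams from two PDUs at once, since the second PDU is not transmitted until the first is acknowledged.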
<section anchor="checksum" title="The Checksum">
@ -528,7 +533,7 @@ uint32_t sbox_checksum_32(const uint8_t *b, const size_t n)
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sig Type | Signature Length | ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +
~ Signature |
~ Signature ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
</artwork>
</figure>
@ -557,7 +562,7 @@ uint32_t sbox_checksum_32(const uint8_t *b, const size_t n)
<t hangText="Signature Length:">The length of the Signature,
possibly including padding, in octets. If Sig Type is 0,
Signature Length must be 0.</t>
Signature Length MUST BE 0.</t>
<t hangText="Signature:">The result of running the signature
algorithm specified in Sig Type over all octets of the PDU except
@ -636,10 +641,6 @@ uint32_t sbox_checksum_32(const uint8_t *b, const size_t n)
<section anchor="hello" title="HELLO">
<t>WARNING: The second multicast address below is incorrect. We
need to get a new assignment. , which is what we really wanted with the second address
below.</t>
<t>The HELLO PDU is unique in that it is encapsulated in a multicast
Ethernet frame. It solicits response(s) from other LLEI(s) on the
link. See <xref target="dhello"/> for why multicast is used. The
@ -649,13 +650,15 @@ uint32_t sbox_checksum_32(const uint8_t *b, const size_t n)
<t hangText="01-80-C2-00-00-0E:">Nearest Bridge = Propagation
constrained to a single physical link; stopped by all types of
bridges (including MPRs (media converters)).</t>
bridges (including MPRs (media converters)). This SHOULD BE used
when the link is known to be a simple point to point link.</t>
<t hangText="To Be Assigned:"> When a switch receives a frame with
a multicast destination MAC it does not recognize, it forwards to
all ports. This destination MAC is to be sent when the interface
is known to be connected to a switch. See <xref
target="ieee"/>.</t>
target="ieee"/>. This SHOULD BE used when the link may be a
multi-point link.</t>
<?rfc subcompact="no"?></list></t>
@ -664,11 +667,12 @@ uint32_t sbox_checksum_32(const uint8_t *b, const size_t n)
exchange.</t>
<t>When an interface is turned up on a device, it SHOULD issue a
HELLO.</t>
HELLO if it is to participate in L3DL sessions.</t>
<t>If a constrained destination address configured, see above, then
the HELLO need not be repeated once a session has been created by an
exchange of OPENs.</t>
<t>If a constrained Nearest Bridge destination address is configured
for a point-to-point interface, see above, then the HELLO SHOULD NOT
be repeated once a session has been created by an exchange of
OPENs.</t>
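A non-normative sketch of the HELLO addressing choice follows. The Nearest Bridge address is taken from the text above; the second multicast address is still to be assigned, so the sketch takes it as a caller-supplied parameter rather than inventing a value. The function and parameter names are hypothetical.

```python
NEAREST_BRIDGE = "01-80-C2-00-00-0E"  # from the list of destination addresses


def hello_destination(point_to_point: bool, multipoint_mac: str) -> str:
    """Pick the HELLO destination MAC for an interface.

    multipoint_mac stands in for the address that is still to be
    assigned; the caller must supply it.
    """
    return NEAREST_BRIDGE if point_to_point else multipoint_mac


def keep_sending_hello(destination: str, session_established: bool) -> bool:
    """Decide whether periodic HELLOs continue on this interface."""
    if destination == NEAREST_BRIDGE:
        # Constrained address: stop once OPENs have created a session.
        return not session_established
    # Address propagated by switches: new peers may appear at any time.
    return True
```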
<t>If the configured destination address is one that is propagated
by switches, the HELLO SHOULD be repeated at a configured interval,
@ -696,8 +700,8 @@ uint32_t sbox_checksum_32(const uint8_t *b, const size_t n)
separate logical link.</t>
<t>When a HELLO is received from a source MAC address with which
there is no established L3DL adjacency, the receiver SHOULD respond
with an OPEN PDU. The two devices establish an L3DL adjacency by
there is no established L3DL session, the receiver SHOULD respond
with an OPEN PDU. The two devices establish an L3DL session by
exchanging OPEN PDUs.</t>
<t>The Payload Length is zero as there is no payload.</t>
@ -711,7 +715,7 @@ uint32_t sbox_checksum_32(const uint8_t *b, const size_t n)
<t>Each device has learned the other's MAC Address from the HELLO
exchange, see <xref target="hello"/>. Therefore the OPEN and
subsequent PDUs are unicast, as opposed to the HELLO's multicast
subsequent PDUs MUST BE unicast, as opposed to the HELLO's multicast
frame.</t>
<!--
@ -754,9 +758,10 @@ q-->
<t>My LLEI is the sender's LLEI, see <xref target="llei"/>.</t>
<t>AttrCount is the number of attributes in the Attribute List.
Attributes are single octets whose semantics are user-defined.</t>
Attributes are single octets the semantics of which are
operator-defined.</t>
<t>A node may have zero or more user-defined attributes, e.g.
<t>A node may have zero or more operator-defined attributes, e.g.:
spine, leaf, backbone, route reflector, arabica, ...</t>
<t>Attribute syntax and semantics are local to an operator or
@ -767,19 +772,19 @@ q-->
target="tlv"/>.</t>
<t>Key Length is a 16-bit field denoting the length in octets of the
Key itself, not including the Auth Type or the Key Lengths. If
there is no Key, the Auth Type and key Length MUST both be zero.</t>
Key itself, not including the Auth Type or the Key Length. If there
is no Key, the Auth Type and key Length MUST both be zero.</t>
<t>The Key is specific to the operational environment. A failure to
authenticate is a failure to start the L3DL session, an ERROR PDU is
sent (Error Code 2), and HELLOs MUST be restarted.</t>
authenticate is a failure to start the L3DL session, an ERROR PDU
MUST BE sent (Error Code 2), and HELLOs MUST be restarted.</t>
<t>The Serial Number is that of the last received and processed
Encapsulation PDU. This allows a receiver sending an OPEN to tell
the sender that the receiver wants to resume a session and the
sender only needs to send data more recent than the Serial Number.
If this OPEN is not trying to restart a lost session, the Serial
Number MUST be set to zero.</t>
<t>The Serial Number is that of the last received and processed PDU.
This allows a receiver sending an OPEN to tell the sender that the
receiver wants to resume a session and the sender only needs to send
data more recent than the Serial Number. If this OPEN is not trying
to restart a lost session, the Serial Number MUST BE set to
zero.</t>
<t>The Signature fields are described in <xref target="tlv"/> and in
an asymmetric key environment serve as a proof of possession of the
@ -791,19 +796,29 @@ q-->
keep the session semantics alive. The timing and acceptable drop of
KEEPALIVE PDUs are discussed in <xref target="keepalive"/>.</t>
<t>If a sender of OPEN does not receive an ACK of the OPEN PDU Type,
then they MUST resend the same OPEN PDU, with the same Nonce.
Resending an unacknowledged OPEN PDU, like other ACKed PDUs, SHOULD
use exponential back-off, see <xref target="RFC1122"/>.</t>
<t>If a sender of OPEN does not receive an ACK of the OPEN PDU, then
they MUST resend the same OPEN PDU, with the same Nonce. Resending
an unacknowledged OPEN PDU, like other ACKed PDUs, SHOULD use
exponential back-off, see <xref target="RFC1122"/>.</t>
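The retransmission schedule for an unacknowledged OPEN might look like the following non-normative sketch. The initial delay, multiplier, ceiling, and attempt count are assumptions chosen for illustration; the draft does not specify them. Each retry would carry the same OPEN PDU with the same Nonce.

```python
def backoff_delays(initial=1.0, factor=2.0, ceiling=60.0, attempts=6):
    """Yield the wait, in seconds, before each resend of the same PDU.

    All four parameters are illustrative defaults, not protocol values.
    """
    delay = initial
    for _ in range(attempts):
        yield min(delay, ceiling)
        delay *= factor
```

For example, `list(backoff_delays(1.0, 2.0, 10.0, 5))` gives `[1.0, 2.0, 4.0, 8.0, 10.0]`.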
<t>If a properly authenticated OPEN arrives with a new Nonce from an
LLEI with which the receiving logical link endpoint believes it
already has an L3DL session (OPENs have already been exchanged), the
receiver MAY assume that the sending LLEI or entire device has been
reset. If the Serial Number in the OPEN is zero, then all
discovered encapsulation data SHOULD be withdrawn via the BGP-LS API
and the recipient MUST respond with a new OPEN. In this
circumstance encapsulations SHOULD NOT be kept.</t>
already has an L3DL session (OPENs have already been exchanged), and
the Serial Number in the OPEN is non-zero, the receiver SHOULD
establish a new session by sending an OPEN with the Serial Number of
the last data it received. Each party MUST resume sending
encapsulations etc. subsequent to the other party's Serial Number.
And each MUST retain all previously discovered encapsulation and
other data.</t>
<t>If a properly authenticated OPEN arrives with a new Nonce from an
LLEI with which the receiving logical link endpoint believes it
already has an L3DL session (OPENs have already been exchanged), and
the Serial Number in the OPEN is zero, then the receiver MUST assume
that the sending LLEI or entire device has been reset. All
previously discovered encapsulation data MUST NOT be kept and MUST
be withdrawn via the BGP-LS API and the recipient MUST respond with
a new OPEN.</t>
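The two cases above can be summarized in a non-normative decision function. The enum and argument names are hypothetical. Treating an OPEN that repeats the known Nonce as a retransmission is an assumption of this sketch, drawn from the resend rule, and is not stated in these two paragraphs.

```python
from enum import Enum


class OpenAction(Enum):
    ESTABLISH = "no prior session: proceed with normal OPEN exchange"
    DUPLICATE = "same Nonce: assumed to be a resent OPEN"
    RESUME = "new Nonce, non-zero Serial: keep data, resume after Serial"
    RESET = "new Nonce, zero Serial: withdraw data, answer with new OPEN"


def classify_open(have_session: bool, known_nonce, open_nonce,
                  open_serial: int) -> OpenAction:
    """Classify a properly authenticated OPEN from a given LLEI."""
    if not have_session:
        return OpenAction.ESTABLISH
    if open_nonce == known_nonce:
        return OpenAction.DUPLICATE  # assumption, see lead-in
    if open_serial != 0:
        return OpenAction.RESUME     # retain previously discovered data
    return OpenAction.RESET          # withdraw via the BGP-LS API
```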
</section>
@ -836,7 +851,7 @@ q-->
PDU, etc.</t>
<t>The ACKed PDU is the PDU Type of the PDU being acknowledged,
e.g., OPEN or one of the Encapsulations.</t>
e.g., OPEN, one of the Encapsulations, etc.</t>
<t>If there was an error processing the received PDU, then the EType
is non-zero. If the EType is zero, Error Code and Error Hint MUST
@ -848,12 +863,21 @@ q-->
error.</t>
<t>The decimal value of EType gives a strong hint how the receiver
sending the ACK believes things should proceed. The ETypes are
listed in <xref target="iana-error"/>. Someone stuck in the 1990s
might think of the error codes as 0x1zzz, 0x2zzz, etc. They might
be right. Or not.</t>
sending the ACK believes things should proceed:
<list style="empty">
<?rfc subcompact="yes"?>
<t>0 - No Error, Error Code and Error Hint MUST be zero</t>
<t>1 - Warning, something not too serious happened, continue</t>
<t>2 - Session should not be continued, try to restart</t>
<t>3 - Restart is hopeless, call the operator</t>
<t>4-15 - Reserved</t>
<?rfc subcompact="no"?>
</list></t>
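As a non-normative illustration, a receiver of an ACK might map the EType values listed above to a next step as follows. The function name and the returned strings are hypothetical, and representing Error Code and Error Hint as plain integers is an assumption of this sketch.

```python
from enum import IntEnum


class EType(IntEnum):
    NO_ERROR = 0
    WARNING = 1
    RESTART = 2
    HOPELESS = 3


def next_step(etype: int, error_code: int, error_hint: int) -> str:
    """Suggest how to proceed after an ACK, following the EType list."""
    if not 0 <= etype <= 15:
        raise ValueError("EType out of range")
    if etype == EType.NO_ERROR:
        if error_code != 0 or error_hint != 0:
            raise ValueError("Error Code and Error Hint must be zero")
        return "ok"
    if etype == EType.WARNING:
        return "continue"
    if etype == EType.RESTART:
        return "restart-session"
    if etype == EType.HOPELESS:
        return "alert-operator"
    return "reserved"  # values 4-15
```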
<t>The Error Code indicates the type of error.</t>
<t>The Error Codes, noting protocol failures listed in this document,
are listed in <xref target="iana-error"/>. Someone stuck in the
1990s might think of the concatenation of EType and Error Code as an
echo of 0x1zzz, 0x2zzz, etc. They might be right; or not.</t>
<t>The Error Hint is any additional data the sender of the error PDU
thinks will help the recipient or the debugger with the particular
@ -873,8 +897,7 @@ q-->
case of this ACK failure.</t>
<t>If the link is broken at layer 2, retransmission MAY BE retried
when the link comes back up if data have not changed in the
interim.</t>
when the link is restored.</t>
</section>
@ -887,11 +910,10 @@ q-->
session is considered established, and the devices SHOULD exchange
L3 interface encapsulations, L3 addresses, and L2.5 labels.</t>
<t>The Encapsulation types the peers exchange may be IPv4
Announcement (<xref target="ipv4"/>), IPv6 Announcement (<xref
target="ipv6"/>), MPLS IPv4 Announcement (<xref target="mpls4"/>),
MPLS IPv6 Announcement (<xref target="mpls6"/>), and/or possibly
others not defined here.</t>
<t>The Encapsulation types the peers exchange may be IPv4 (<xref
target="ipv4"/>), IPv6 (<xref target="ipv6"/>), MPLS IPv4 (<xref
target="mpls4"/>), MPLS IPv6 (<xref target="mpls6"/>), and/or
possibly others not defined here.</t>
<t>The sender of an Encapsulation PDU MUST NOT assume that the peer
is capable of the same Encapsulation Type. An ACK (<xref
@ -937,12 +959,12 @@ q-->
</artwork>
</figure>
<t>The 24-bit Count is the number of Encapsulations in the
Encapsulation list.</t>
<t>An Encapsulation PDU describes zero or more addresses of the
encapsulation type.</t>
<t>The 24-bit Count is the number of Encapsulations in the
Encapsulation list.</t>
<t>The Serial Number is a monotonically increasing 32-bit value
representing the sender's state in time. It may be an integer, a
timestamp, etc. On session restart (new OPEN), a receiver MAY
@ -950,7 +972,7 @@ q-->
send newer data.</t>
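A non-normative sketch of how a sender could use the peer's Serial Number on session restart follows. The log of `(serial, update)` pairs and the function name are hypothetical, and the sketch assumes serial numbers have not wrapped around their 32-bit range.

```python
def updates_to_resend(log, peer_serial: int):
    """Return the logged updates newer than the peer's Serial Number.

    log is an illustrative list of (serial, update) pairs in the order
    they were originally sent. A peer_serial of zero selects everything.
    """
    return [(serial, update) for serial, update in log if serial > peer_serial]
```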
<t>If a sender has multiple links on the same interface, separate
state: data, ACKs, etc. must be kept for each peer.</t>
state: data, ACKs, etc. must be kept for each peer session.</t>
<t>Over time, multiple Encapsulation PDUs may be sent for an
interface as configuration changes.</t>
@ -988,9 +1010,10 @@ q-->
</artwork>
</figure>
<t>An Encapsulation PDU of Type T may announce new and/or withdraw
old encapsulations of Type T. It indicates this with the Ann/With
Encapsulation Flag, Announce == 1, Withdraw == 0.</t>
<t>Each encapsulation in an Encapsulation PDU of Type T may
announce new and/or withdraw old encapsulations of Type T. It
indicates this with the Ann/With Encapsulation Flag, Announce ==
1, Withdraw == 0.</t>
<t>Each Encapsulation interface address in an Encapsulation PDU is
either a new encapsulation to be announced (Ann/With == 1) (yes, a la
@ -1006,20 +1029,18 @@ q-->
be marked as primary for a particular encapsulation type.</t>
<t>An Encapsulation interface address in an Encapsulation PDU MAY
be marked as a loopback, in which case the Loopback bit is
set.</t>
<t>Loopback addresses are generally not seen directly on an
external interface. One or more loopback addresses MAY be exposed
by configuration on one or more L3DL speaking external interfaces,
be marked as a loopback, in which case the Loopback bit is set.
Loopback addresses are generally not seen directly on an external
interface. One or more loopback addresses MAY be exposed by
configuration on one or more L3DL speaking external interfaces,
e.g. for iBGP peering. They SHOULD be marked as such, Loopback
Flag == 1.</t>
<t>Each Encapsulation interface address in an Encapsulation PDU is
that of the direct 'underlay' interface (Under/Over == 1), or an
'overlay' address (Under/Over == 0), likely that of a VM or
container guest bridged on to the interface with an underlay
address.</t>
container guest bridged or configured on to the interface already
having an underlay address.</t>
</section>
@ -1053,7 +1074,8 @@ q-->
</artwork>
</figure>
<t>The 24-bit Count is the number of IPv4 Encapsulations.</t>
<t>The 24-bit Count is the number of IPv4 Encapsulations being
announced and/or withdrawn.</t>
</section>
@ -1094,7 +1116,8 @@ q-->
</artwork>
</figure>
<t>The 24-bit Count is the number of IPv6 Encapsulations.</t>
<t>The 24-bit Count is the number of IPv6 Encapsulations being
announced and/or withdrawn.</t>
</section>
@ -1160,7 +1183,8 @@ q-->
</artwork>
</figure>
<t>The 24-bit Count is the number of MPLSv4 Encapsulations.</t>
<t>The 24-bit Count is the number of MPLSv4 Encapsulations being
announced and/or withdrawn.</t>
</section>
@ -1169,7 +1193,7 @@ q-->
<t>The MPLS IPv4 Encapsulation describes a logical link's ability
to exchange labeled IPv4 packets on one or more subnets. It does
so by stating the interface's addresses, the corresponding prefix
lengths, and the corresponding labels which will be accepted fpr
lengths, and the corresponding labels which will be accepted for
each address.</t>
<!--
protocol "PDU Type = 7:8,Payload Length:32,Count:24,Serial Number:32,Encaps Flags:8,MPLS Label List ...:16,IPv6 Address:128,Prefix Len:8,more ...:8,Sig Type:8,Signature Length:16,Signature ...:32"
@ -1203,14 +1227,9 @@ q-->
</artwork>
</figure>
<t>The 24-bit Count is the number of MPLSv6 Encapsulations.</t>
<t>The 24-bit Count is the number of MPLSv6 Encapsulations being
announced and/or withdrawn.</t>
<t>The MPLS IPv6 Encapsulation describes a logical link's ability
to exchange labeled IPv6 packets on one or more subnets. It does
so by stating the interface's addresses, the corresponding prefix
lengths, and the corresponding labels which will be accepted fpr
each address.</t>
</section>
</section>
@ -1256,26 +1275,6 @@ q-->
<section anchor="keepalive" title="KEEPALIVE - Layer 2 Liveness">
<t>L3DL devices SHOULD beacon frequent Layer 2 KEEPALIVE PDUs to
ensure session continuity. A receiver may choose to ignore
KEEPALIVE PDUs.</t>
<t>An operational deployment MUST BE configured whether to use
KEEPALIVEs or not, either globally, or down to per-link granularity.
Disagreement MAY result in repeated session break and
reestablishment.</t>
<t>KEEPALIVEs SHOULD be beaconed at a configured frequency. One per
second is the default. Layer 3 liveness, such as BFD, may be more
(or less) aggressive.</t>
<t>If a KEEPALIVE is not received from a peer with which a receiver
has an open session for a configurable time (default 30 seconds),
the link SHOULD BE presumed down. The devices MAY keep
configuration state and restore it without retransmission if no data
have changed. Otherwise, a new session SHOULD BE established and
new Encapsulation PDUs exchanged.</t>
<!--
protocol "PDU Type = 2:8,Payload Length = 0:32,Sig Type = 0:8,Signature Length = 0:16"
-->
@ -1292,6 +1291,31 @@ q-->
</artwork>
</figure>
<t>L3DL devices SHOULD beacon frequent Layer 2 KEEPALIVE PDUs to
ensure session continuity. A receiver may choose to ignore
KEEPALIVE PDUs.</t>
<t>An operational deployment MUST BE configured whether to use
KEEPALIVEs or not, either globally, or down to per-link granularity.
Disagreement MAY result in repeated session break and
reestablishment.</t>
<t>KEEPALIVEs SHOULD be beaconed at a configured frequency. One per
second is the default. Layer 3 liveness, such as BFD, may be more
(or less) aggressive.</t>
<t>When a sender transmits a PDU which is not a KEEPALIVE, the
sender SHOULD reset the KEEPALIVE timer. I.e. sending any PDU acts
as a keepalive. Once the last fragment has been sent, the
KEEPALIVE timer SHOULD BE restarted. Do not wait for the ACK.</t>
<t>If a KEEPALIVE or other PDUs have not been received from a peer
with which a receiver has an open session for a configurable time
(default 30 seconds), the link SHOULD BE presumed down. The devices
MAY keep configuration state and restore it without retransmission
if no data have changed. Otherwise, a new session SHOULD BE
established and new Encapsulation PDUs exchanged.</t>
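The timer behaviour described above can be sketched, non-normatively, with two timestamps per session. The class and method names are hypothetical; the one-second and 30-second values are the defaults given in the text, and the caller is assumed to supply the current time in seconds.

```python
class Liveness:
    """Tracks KEEPALIVE timing for one L3DL session."""

    def __init__(self, now: float, send_interval: float = 1.0,
                 dead_time: float = 30.0):
        self.send_interval = send_interval  # default: one per second
        self.dead_time = dead_time          # default: 30 seconds
        self.last_sent = now
        self.last_heard = now

    def on_pdu_sent(self, now: float) -> None:
        """Any PDU, once its last fragment is sent, restarts the timer."""
        self.last_sent = now

    def on_pdu_received(self, now: float) -> None:
        """A KEEPALIVE or any other PDU from the peer counts as liveness."""
        self.last_heard = now

    def keepalive_due(self, now: float) -> bool:
        return now - self.last_sent >= self.send_interval

    def peer_presumed_down(self, now: float) -> bool:
        return now - self.last_heard >= self.dead_time
```

Note that `on_pdu_sent` is called without waiting for an ACK, matching the text.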
</section>
<section anchor="l3liveness" title="Layers 2.5 and 3 Liveness">
@ -1303,7 +1327,7 @@ q-->
technique.</t>
<t>This protocol assumes that one or more Encapsulation addresses
will be used to ping, run BFD, or whatever the operator
may be used to ping, run BFD, or whatever the operator
configures.</t>
</section>
@ -1317,7 +1341,7 @@ q-->
LLEIs and Encapsulations on each logical link interface.</t>
<t>Full topology discovery is not appropriate at the L3DL layer, so
Dijkstra à la IS-IS etc. is assumed to be done by higher level
Dijkstra a la IS-IS etc. is assumed to be done by higher level
protocols such as BGP-SPF.</t>
<t>Therefore the LLEIs, link Encapsulations, and state changes are
@ -1370,24 +1394,15 @@ q-->
<section anchor="dhello" title="HELLO Discussion">
<!--
<t>There is the question of whether to allow an intermediate
switch to be transparent to discovery. We consider that an
interface on a device is a Layer 2 or a Layer 3 interface. In
theory it could be a Layer 3 interface with no encapsulation or
Layer 3 addressing currently configured.</t>
-->
<t>A device with multiple Layer 2 interfaces, traditionally called
a switch, may be used to forward frames and therefore packets from
multiple devices to one logical interface (LLEI), I, on an L3DL
speaking device. Interface I could discover a peer J across the
switch. Later, a prospective peer K could come up across the
switch. If I was not still sending and listening for HELLOs, the
potential peering with K could not be discovered. Therefore,
interfaces MUST continue to send HELLOs as long as they are turned
up.</t>
potential peering with K could not be discovered. Therefore,
multi-link interfaces MUST continue to send HELLOs as long as they
are turned up.</t>
</section>
@ -1444,15 +1459,15 @@ q-->
encapsulation, the implementation MAY mark it as primary by
default.</t>
<t>An implementation SHOULD allow optional configuration which
updates the local forwarding table with overlay and underlay data
both learned from L3DL peers and configured locally.</t>
<t>An implementation MAY allow optional configuration which updates
the local forwarding table with overlay and underlay data both
learned from L3DL peers and configured locally.</t>
</section>
<section anchor="security" title="Security Considerations">
<t>The protocol as it is MUST NOT be used outside a datacenter or
<t>The protocol as is MUST NOT be used outside a datacenter or
similarly closed environment due to lack of formal definition of the
authentication and authorization mechanism. Sufficient mechanisms
may be described in separate documents.</t>
@ -1588,12 +1603,13 @@ q-->
<section anchor="acks" title="Acknowledgments">
<t>The authors thank Cristel Pelsser for multiple reviews, Jeff Haas
for review and comments, Joe Clarke for a useful review, John
Scudder for deeply serious review and comments, Larry Kreeger for a
lot of layer 2 clue, Martijn Schmidt for his contribution, Neeraj
Malhotra for review, Russ Housley for checksum discussion and sBox,
and Steve Bellovin for checksum advice.</t>
<t>The authors thank Cristel Pelsser for multiple reviews, Harsha
Kovuru for comments during implementation, Jeff Haas for review and
comments, Joe Clarke for a useful review, John Scudder for deeply
serious review and comments, Larry Kreeger for a lot of layer 2
clue, Martijn Schmidt for his contribution, Neeraj Malhotra for
review, Russ Housley for checksum discussion and sBox, and Steve
Bellovin for checksum advice.</t>
</section>