reorganized and framed Races, Ordering, and Transactions
-20 published
This commit is contained in:
parent
28289e3b8f
commit
9461a09b73
1 changed files with 107 additions and 76 deletions
|
|
@ -1739,87 +1739,118 @@ Cache Router
|
|||
</t>
|
||||
</section>
|
||||
|
||||
<section anchor="races" title="ROA PDU Race Minimization">
|
||||
<t>
|
||||
When a cache is sending ROA (IPv4 or IPv6) PDUs to a router
|
||||
undesirable race conditions are possible:
|
||||
<list style="hanging">
|
||||
<t hangText="Make Before Break:">
|
||||
For some prefix P, an operator may create two or more ROAs
|
||||
with different ASes because they are in the process of
|
||||
changing what provider AS may announce P. This is a case of
|
||||
"make before break." If a cache is feeding a router and sends
|
||||
the one not yet in service a significant time before sending
|
||||
the one currently in service, then BGP data could be marked
|
||||
Invalid during the interval. To minimize that interval, the
|
||||
cache SHOULD semd all VRPs for the same prefix as close to
|
||||
sequentially as possible.
|
||||
<section anchor="rot" title="Races, Ordering, and Transactions">
|
||||
<t>
|
||||
<xref target="races"/> describes race conditions which could, if
|
||||
not mitigated, cause BGP states to be temporarily marked as
|
||||
Invalid by Route Origin Validation. Minimal mitigations are
|
||||
described.
|
||||
</t>
|
||||
<t>
|
||||
<xref target="order"/> describes an optional but RECOMMENDED
|
||||
ordering of all PDUs which allows mitigation of the race
|
||||
conditions described in <xref target="races"/>.
|
||||
</t>
|
||||
<t>
|
||||
<xref target="trans"/> describes how a router MAY process
|
||||
ordered PDUs, especially IPvX VRPs, into atomic transactions to
|
||||
be committed in a manner to mitigate the race conditions.
|
||||
</t>
|
||||
|
||||
<section anchor="races" title="ROA PDU Race Minimization">
|
||||
<t>
|
||||
When a cache is sending ROA (IPv4 or IPv6) VRPs to a router
|
||||
undesirable race conditions are possible:
|
||||
<list style="hanging">
|
||||
<t hangText="Make Before Break:">
|
||||
For prefix P, an operator may create two or more ROAs with
|
||||
different ASes because they are in the process of changing
|
||||
	      what provider AS may announce P.  This is known as "make
|
||||
before break." If a cache is feeding a router and sends
|
||||
the one not yet in service a significant time before
|
||||
sending the one currently in service, then BGP data could
|
||||
be marked Invalid during the interval. To minimize that
|
||||
interval, the cache SHOULD send all VRPs for the same
|
||||
prefix as close to sequentially as possible.
|
||||
</t>
|
||||
<t hangText="Longest Prefix Match:">
|
||||
If an operator has created a ROA for prefix P0, and another
|
||||
operator (often their customer) has created a ROA for P1 which
|
||||
is a sub-prefix covered by P0, a router which receives the VRP
|
||||
for P0 before the VRP for P1 might mark BGP for prefix P1 Invalid
|
||||
until the P1 announcement is processed. Therefore, the cache
|
||||
SHOULD announce the sub-prefix P1 before the covering prefix
|
||||
P0. Conversely, the cache SHOULD withdraw covering prefixes
|
||||
before covered sub-prefixes.
|
||||
</t>
|
||||
<t hangText="AS 0:">
|
||||
To minimize risk of inadvertent marking of BGP data as
|
||||
Invalid, an announcement VRP for prefix P which has an AS
|
||||
of 0, SHOULD be sent after all other VRPs for prefix P.
|
||||
Conversely, a withdrawal VRP for prefix P which has an AS
|
||||
of 0, SHOULD be sent before all other prefix PDUs for
|
||||
prefix P.
|
||||
</t>
|
||||
</list>
|
||||
</t>
|
||||
<t hangText="Longest Prefix Match:">
|
||||
If an operator has created a ROA for prefix P0, and another
|
||||
operator (often their customer) has created a ROA for P1 which
|
||||
is a sub-prefix covered by P0, a router which receives the ROA
|
||||
for P0 before that for P1 might mark BGP for prefix P1 Invalid
|
||||
until the P1 announcement is processed. Therefore, the cache
|
||||
SHOULD announce the sub-prefix P1 before the covering prefix
|
||||
P0. Conversely, the cache SHOULD withdraw covering prefixes
|
||||
before covered sub-prefixes.
|
||||
</section>
|
||||
|
||||
<section anchor="order" title="PDU Ordering">
|
||||
<t>
|
||||
A Version 2 Cache SHOULD, unless it requires major revision of
|
||||
existing code, order Payload PDUs it sends to routers.
|
||||
Ascending order is considered somewhat more efficient as
|
||||
routers are likely building trees. Iff ordering, with the
|
||||
exceptions in <xref target="races"/> above, ordering MUST be,
|
||||
as follows:
|
||||
</t>
|
||||
<t hangText="AS 0:">
|
||||
To minimize risk of inadvertent marking of BGP data as
|
||||
Invalid, an announcement VRP for prefix P which has an AS of
|
||||
0, SHOULD be sent after all other prefix PDUs for prefix P.
|
||||
Conversely, a withdrawal VRP for prefix P which has an AS of
|
||||
0, SHOULD be sent before all other prefix PDUs for prefix P.
|
||||
<list style="symbols">
|
||||
<t>
|
||||
PDUs are first ordered by PDU Type,
|
||||
</t>
|
||||
<t>
|
||||
IPv4 and IPv6 Prefix VRPs are ordered by: first IPvX Prefix,
|
||||
second Prefix Length, third Max Length, and fourth Autonomous
|
||||
	  System Number.  Treating announcements of VRPs with AS 0 as
|
||||
sorting last, and withdrawals as sorting first, fulfills the "AS
|
||||
0" requirement of <xref target="races"/>. Treating
|
||||
announcements of sub-prefixes as sorting first fulfills the
|
||||
"Longest Prefix Match" requirement of <xref target="races"/>.
|
||||
</t>
|
||||
<t>
|
||||
Router Key PDUs are ordered by AS Number and then Subject Public
|
||||
Key Info.
|
||||
</t>
|
||||
<t>
|
||||
	  ASPA PDUs are ordered by Customer AS.
|
||||
</t>
|
||||
</list>
|
||||
<t>
|
||||
Unless specifically configured for a particular cache, a
|
||||
router MUST NOT depend on payload PDU ordering.
|
||||
</t>
|
||||
</list>
|
||||
<t>
|
||||
In order to further mitigate such race conditions, a router MAY
|
||||
choose not to make effective the PDUs received in response to a
|
||||
request until the relevant End of Data PDU is received.
|
||||
</t>
|
||||
<t>
|
||||
However, a router MAY apply a time limit for how long it is
|
||||
willing to wait for the End of Data PDU.
|
||||
</t>
|
||||
</t>
|
||||
|
||||
</section>
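Reviewer sketch (not part of the draft): the IPvX VRP ordering and the "Longest Prefix Match" and "AS 0" exceptions described above can be expressed as two sort keys. This is a minimal illustration under stated assumptions. The `Vrp` type, `announce_key`, `withdraw_key`, and `order_vrps` are hypothetical names, and sorting announcements by last address (so sub-prefixes precede their covering prefix) is one possible reading of the text, not the draft's mandated algorithm. Placing the AS 0 flag ahead of Max Length in the key is likewise an assumption, and the relative order of withdrawals versus announcements is left open because the text does not state it.

```python
from dataclasses import dataclass
from ipaddress import IPv4Network, IPv6Network, ip_network
from typing import List, Tuple, Union

Network = Union[IPv4Network, IPv6Network]


@dataclass(frozen=True)
class Vrp:
    """Hypothetical in-memory form of an IPvX prefix PDU payload."""
    prefix: Network
    max_len: int
    asn: int
    announce: bool = True  # False means withdrawal


def announce_key(v: Vrp):
    # Assumption: sorting by last address, then longest prefix first,
    # guarantees every sub-prefix precedes its covering prefix.
    # AS 0 sorts last among VRPs for the same prefix.
    p = v.prefix
    return (p.version, int(p.broadcast_address), -p.prefixlen,
            v.asn == 0, v.max_len, v.asn)


def withdraw_key(v: Vrp):
    # Mirror image: covering prefixes first, and AS 0 first
    # among VRPs for the same prefix.
    p = v.prefix
    return (p.version, int(p.network_address), p.prefixlen,
            v.asn != 0, v.max_len, v.asn)


def order_vrps(vrps: List[Vrp]) -> Tuple[List[Vrp], List[Vrp]]:
    """Return (announcements, withdrawals), each ordered separately."""
    announcements = sorted((v for v in vrps if v.announce), key=announce_key)
    withdrawals = sorted((v for v in vrps if not v.announce), key=withdraw_key)
    return announcements, withdrawals


if __name__ == "__main__":
    sample = [
        Vrp(ip_network("10.0.0.0/8"), 8, 64500),
        Vrp(ip_network("10.1.0.0/16"), 16, 0),
        Vrp(ip_network("10.1.0.0/16"), 16, 64501),
    ]
    for v in order_vrps(sample)[0]:
        print(v.prefix, v.max_len, v.asn)
```

With this key, the announcement for 10.1.0.0/16 from AS 64501 is sent first, then the AS 0 VRP for the same prefix, and the covering 10.0.0.0/8 last.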
|
||||
|
||||
<section anchor="ordering" title="PDU Ordering">
|
||||
<t>
|
||||
A Version 2 Cache SHOULD, unless it requires major revision of
|
||||
existing code, order payload PDUs (IPvX, Router Key, ASPA) it
|
||||
sends to routers. Ascending order is considered somewhat more
|
||||
efficient as routers are likely building trees. Iff ordering,
|
||||
with the exceptions in <xref target="races"/> above, ordering MUST
|
||||
be, as follows:
|
||||
</t>
|
||||
<list style="symbols">
|
||||
<t>
|
||||
PDUs are first ordered by PDU Type,
|
||||
</t>
|
||||
<t>
|
||||
IPv4 and IPv6 Prefix VRPs are ordered by: first IPvX Prefix,
|
||||
second Prefix Length, third Max Length, and fourth Autonomous
|
||||
System Number. Treating announcements of VPUs with AS 0 as
|
||||
sorting last, and withdrawals as sorting first, fulfills the "AS
|
||||
0" requirement of <xref target="races"/>,
|
||||
</t>
|
||||
<t>
|
||||
Router Key PDUs are ordered by AS Number and then Subject Public
|
||||
Key Info,
|
||||
</t>
|
||||
<t>
|
||||
And ASPA PDUs ordered by Customer AS.
|
||||
</t>
|
||||
</list>
|
||||
<t>
|
||||
Routers MUST NOT depend on payload PDU ordering.
|
||||
</t>
|
||||
</section>
|
||||
|
||||
<section anchor="trans" title="Transaction-like Commit">
|
||||
<t>
|
||||
Iff a router has been configured to know that a particular
|
||||
cache's data are ordered per <xref target="order"/>, a router
|
||||
MAY wait to commit, i.e. make effective, IPvX VRPs only after
|
||||
all sub-prefixes of a received covering prefix are received.
|
||||
</t>
|
||||
<t>
|
||||
Another method a router MAY choose to mitigate the above race
|
||||
conditions is not to commit, i.e. make effective, the VRPs
|
||||
received in response to a request until the relevant End of
|
||||
Data PDU is received. During start or restart of a session,
|
||||
this approach may consume considerable memory. If using this
|
||||
approach, a router MUST apply a time limit for how long it is
|
||||
willing to wait for the End of Data PDU.
|
||||
</t>
|
||||
</section>
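Reviewer sketch (not part of the draft): the second approach in this section, holding VRPs until the relevant End of Data PDU arrives and bounding the wait with a time limit, might look roughly like the following. `PendingUpdate`, its methods, and the injectable `clock` parameter are hypothetical names chosen for illustration. What a router does once the limit expires is not specified in the text above, so the sketch only reports expiry.

```python
import time
from typing import Callable, Hashable, List, Set, Tuple


class PendingUpdate:
    """Stages VRP changes until End of Data, with a bounded wait."""

    def __init__(self, timeout_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._deadline = clock() + timeout_s
        self._staged: List[Tuple[Hashable, bool]] = []

    def stage(self, vrp: Hashable, announce: bool) -> None:
        # Buffer the change; nothing becomes effective yet.
        self._staged.append((vrp, announce))

    def expired(self) -> bool:
        # True once the router has waited longer than its time limit.
        return self._clock() > self._deadline

    def commit(self, table: Set[Hashable]) -> None:
        # Called on End of Data: apply all staged changes in arrival order.
        for vrp, announce in self._staged:
            if announce:
                table.add(vrp)
            else:
                table.discard(vrp)
        self._staged.clear()


if __name__ == "__main__":
    table = {("192.0.2.0/24", 24, 64500)}
    update = PendingUpdate(timeout_s=30.0)
    update.stage(("192.0.2.0/24", 24, 64501), announce=True)
    update.stage(("192.0.2.0/24", 24, 64500), announce=False)
    if not update.expired():
        update.commit(table)
    print(sorted(table))
```

Staged changes consume memory for the whole wait, which is the cost the section notes for session start or restart.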
|
||||
|
||||
</section>
|
||||
|
||||
|
||||
<!---
|
||||
<section anchor="Scenarios" title="Deployment Scenarios">
|
||||
<t>
|
||||