From b2a6cb798c3768a9f2b426c7a6527a8b269723b5 Mon Sep 17 00:00:00 2001 From: Randy Bush Date: Thu, 6 Feb 2020 16:47:36 -0800 Subject: [PATCH] xml from rfced's version of rfc 8210 --- draft-ymbk-r210bis.xml | 1851 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 1851 insertions(+) create mode 100644 draft-ymbk-r210bis.xml diff --git a/draft-ymbk-r210bis.xml b/draft-ymbk-r210bis.xml new file mode 100644 index 0000000..f7fed87 --- /dev/null +++ b/draft-ymbk-r210bis.xml @@ -0,0 +1,1851 @@ + + + + + + + + + + + + + + + + + + The Resource Public Key Infrastructure (RPKI) to Router + Protocol, Version 1 + + + + Internet Initiative Japan +
+ + 5147 Crystal Springs + Bainbridge Island + Washington + 98110 + United States of America + + randy@psg.com +
+
+ + + Dragon Research Labs +
+ sra@hactrn.net +
+
+ + + + + + In order to verifiably validate the origin Autonomous Systems + and Autonomous System Paths of BGP announcements, routers need + a simple but reliable mechanism to receive Resource Public Key + Infrastructure (RFC 6480) prefix origin data and router keys + from a trusted cache. This document describes a protocol to + deliver them. + + + This document describes version 1 of the RPKI-Router protocol. + RFC 6810 describes version 0. This document updates RFC 6810. + + + +
+ + + +
+ + In order to verifiably validate the origin Autonomous Systems + (ASes) and AS paths of BGP announcements, routers need a + simple but reliable mechanism to receive cryptographically + validated Resource Public Key Infrastructure (RPKI) + prefix origin data and router keys + from a trusted cache. This document describes a protocol to + deliver them. The design is intentionally constrained to be + usable on much of the current generation of ISP router + platforms. + + + This document updates . + + + describes the deployment structure, and + then presents an operational overview. + The binary payloads of the protocol are formally described in + , and the expected Protocol Data Unit + (PDU) sequences are described in . + The transport protocol options are described in + . details + how routers and caches are configured to connect and authenticate. + describes likely deployment + scenarios. The traditional security and IANA considerations end + the document. + + + The protocol is extensible in order to support new PDUs with + new semantics, if deployment experience indicates that they are + needed. PDUs are versioned should deployment experience call + for change. + + +
+ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", + "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", + "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document + are to be interpreted as described in BCP 14 + + when, + and only when, they appear in all capitals, as shown here. +
+ +
+ + This section summarizes the significant changes between + and the protocol described in this + document. + + + + + New Router Key PDU type () added. + + + Explicit timing parameters (, + ) added. + + + Protocol version number incremented from 0 (zero) to 1 (one). + + + Protocol version number negotiation + () added. + + + +
+ +
+ +
+ + The following terms are used with special meaning. + + + + The authoritative data of the RPKI are published in a + distributed set of servers at the IANA, Regional Internet + Registries (RIRs), National Internet Registries (NIRs), + and ISPs; see . + + + A cache is a coalesced copy + of the published Global RPKI data, periodically fetched or + refreshed, directly or indirectly, using the + rsync protocol or some + successor. Relying Party software is used to gather and + validate the distributed data of the RPKI into a cache. + Trusting this cache further is a matter between the + provider of the cache and a Relying Party. + + + "Serial Number" is a + 32&nbhy;bit strictly increasing unsigned integer which wraps + from 2^32-1 to 0. It denotes the logical version of a + cache. A cache increments the value when it successfully + updates its data from a parent cache or from primary RPKI + data. While a cache is receiving updates, new incoming + data and implicit deletes are associated with the new + serial but MUST NOT be sent until the fetch is complete. + A Serial Number is not commensurate between different + caches or different protocol versions, nor need it be + maintained across resets of the cache server. See + on DNS Serial Number Arithmetic + for too much detail on the topic. + + + When a cache server is started, it generates a Session ID + to uniquely identify the instance of the cache and + to bind it to the sequence of Serial Numbers that cache + instance will generate. This allows the router to restart a + failed session knowing that the Serial Number it is using is + commensurate with that of the cache. + + + A payload PDU is a protocol message which contains data for + use by the router, as opposed to a PDU which conveys the control + mechanisms of this protocol. Prefixes and Router Keys are + examples of payload PDUs. + + + +
+ +
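The wrap-around behaviour of the Serial Number can be illustrated with a short sketch. This is a minimal Python example of RFC 1982-style comparison for 32-bit values; the function name `serial_gt` is hypothetical and is not defined by the protocol.

```python
SERIAL_BITS = 32
MOD = 1 << SERIAL_BITS          # 2^32, the point at which the serial wraps
HALF = 1 << (SERIAL_BITS - 1)   # 2^31, half of the serial number space


def serial_gt(a: int, b: int) -> bool:
    """Return True if serial a is newer than serial b, allowing for wrap.

    A difference of exactly 2^31 is undefined in serial number arithmetic;
    this sketch reports False in both directions for that case.
    """
    return 0 < ((a - b) % MOD) < HALF
```

With this comparison, serial 0 counts as newer than serial 2^32-1, which is what a router needs when a cache's Serial Number wraps.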
+ + Deployment of the RPKI to reach routers has a three-level + structure as follows: + + + The authoritative data of the RPKI are published in a + distributed set of servers at the IANA, RIRs, NIRs, and + ISPs (see ). + + + Local caches are a local set of one or more collected and + verified caches of RPKI data. A Relying Party, e.g., router + or other client, MUST have a trust relationship with, and a + trusted transport channel to, any cache(s) it uses. + + + A router fetches data from a local cache using the protocol + described in this document. It is said to be a client of the + cache. There MAY be mechanisms for the router to assure + itself of the authenticity of the cache and to authenticate + itself to the cache (see ). + + + +
+ +
+ + A router establishes and keeps open a connection to one or more + caches with which it has client/server relationships. It is + configured with a semi-ordered list of caches and establishes a + connection to the most preferred cache, or set of caches, which + accept the connections. + + + The router MUST choose the most preferred, by configuration, + cache or set of caches so that the operator may control load + on their caches and the Global RPKI. + + + Periodically, the router sends to the cache the most recent + Serial Number for which it has received data from that + cache, i.e., the router's current Serial Number, in the form of a + Serial Query. When a router establishes a new session with a + cache or wishes to reset a current relationship, it sends a + Reset Query. + + + The cache responds to the Serial Query with all data changes + which took place since the given Serial Number. This may be the + null set, in which case the End of Data PDU () + is still sent. Note that the Serial Number comparison used to + determine "since the given Serial Number" MUST take wrap-around + into account; see . + + + When the router has received all data records from the cache, + it sets its current Serial Number to that of the Serial Number + in the received End of Data PDU. + + + When the cache updates its database, it sends a Notify PDU to + every currently connected router. This is a hint + that now would be a good time for the router to poll for an + update, but it is only a hint. The protocol requires the router + to poll for updates periodically in any case. + + + Strictly speaking, a router could track a cache simply by + asking for a complete data set every time it updates, but this + would be very inefficient. The Serial-Number-based + incremental update mechanism allows an efficient transfer of + just the data records which have changed since the last update. 
+ As with any update protocol based on incremental transfers, + the router must be prepared to fall back to a full transfer if + for any reason the cache is unable to provide the necessary + incremental data. Unlike some incremental transfer protocols, + this protocol requires the router to make an explicit request + to start the fallback process; this is deliberate, as the + cache has no way of knowing whether the router has also + established sessions with other caches that may be able to + provide better service. + + + As a cache server must evaluate certificates and ROAs (Route + Origin Authorizations; see ), + which are time dependent, servers' clocks MUST be correct to a + tolerance of approximately an hour. + +
+ +
+ + The exchanges between the cache and the router are sequences of + exchanges of the following PDUs according to the rules described + in . + + + Reserved fields (marked "zero" in PDU diagrams) MUST be zero + on transmission and MUST be ignored on receipt. + + +
+ + PDUs contain the following data elements: + + + An 8-bit unsigned integer, currently 1, denoting the + version of this protocol. + + + An 8-bit unsigned integer, denoting the type of the PDU, + e.g., IPv4 Prefix. + + + The Serial Number of the RPKI cache when this set of PDUs + was received from an upstream cache server or gathered from + the Global RPKI. A cache increments its Serial Number when + completing a rigorously validated update from a parent cache + or the Global RPKI. + + + A 16-bit unsigned integer. + When a cache server is started, it generates a Session + ID to identify the instance of the cache and to bind it + to the sequence of Serial Numbers that cache instance + will generate. This allows the router to restart a + failed session knowing that the Serial Number it is + using is commensurate with that of the cache. If, at + any time after the protocol version has been negotiated + (), either the router or the + cache finds that the value of the Session ID is not the + same as the other's, the party which detects the mismatch + MUST immediately terminate the session with an Error + Report PDU with code 0 ("Corrupt Data"), + and the router MUST flush all data learned from that cache. + + + Note that sessions are specific to a particular protocol + version. That is, if a cache server supports multiple + versions of this protocol, happens to use the same + Session ID value for multiple protocol versions, and + further happens to use the same Serial Number values for + two or more sessions using the same Session ID but + different Protocol Version values, the Serial Numbers + are not commensurate. The full test for whether Serial + Numbers are commensurate requires comparing Protocol + Version, Session ID, and Serial Number. 
To reduce the + risk of confusion, cache servers SHOULD NOT use the same + Session ID across multiple protocol versions, but even + if they do, routers MUST treat sessions with different + Protocol Version fields as separate sessions even if + they do happen to have the same Session ID. + + + Should a cache erroneously reuse a Session ID so that a + router does not realize that the session has changed (old + Session ID and new Session ID have the same numeric value), + the router may become confused as to the content of the cache. + The time it takes the router to discover that it is confused + will depend on whether the Serial Numbers are also reused. If + the Serial Numbers in the old and new sessions are different + enough, the cache will respond to the router's Serial Query + with a Cache Reset, which will solve the problem. If, + however, the Serial Numbers are close, the cache may respond + with a Cache Response, which may not be enough to bring the + router into sync. In such cases, it's likely but not + certain that the router will detect some discrepancy between + the state that the cache expects and its own state. For + example, the Cache Response may tell the router to drop a + record which the router does not hold or may tell the + router to add a record which the router already has. In + such cases, a router will detect the error and reset the + session. The one case in which the router may stay out of + sync is when nothing in the Cache Response contradicts any + data currently held by the router. + + + Using persistent storage for the Session ID or a + clock-based scheme for generating Session IDs should + avoid the risk of Session ID collisions. + + + The Session ID might be a pseudorandom value, a + strictly increasing value if the cache has reliable + storage, et cetera. A seconds-since-epoch timestamp + value such as the POSIX time() function makes a good + Session ID value. 
+ + + A 32-bit unsigned integer which has as its value the count + of the bytes in the entire PDU, including the 8 bytes of + header which includes the length field. + + + The lowest-order bit of the Flags field is 1 for an + announcement and 0 for a withdrawal. For a Prefix PDU + (IPv4 or IPv6), the flag indicates whether this PDU + announces a new right to announce the prefix or + withdraws a previously announced right; a withdraw + effectively deletes one previously announced Prefix PDU + with the exact same Prefix, Length, Max-Len, and + Autonomous System Number (ASN). Similarly, for a Router + Key PDU, the flag indicates whether this PDU announces a + new Router Key or deletes one previously announced + Router Key PDU with the exact same AS Number, + subjectKeyIdentifier, and subjectPublicKeyInfo. + + + The remaining bits in the Flags field are reserved for + future use. In protocol version 1, they MUST be zero on + transmission and MUST be ignored on receipt. + + + An 8-bit unsigned integer denoting the shortest prefix + allowed by the Prefix element. + + + An 8-bit unsigned integer denoting the longest prefix + allowed by the Prefix element. This MUST NOT be less + than the Prefix Length element. + + + The IPv4 or IPv6 prefix of the ROA. + + + A 32-bit unsigned integer representing an ASN allowed to + announce a prefix or associated with a router key. + + 20-octet + Subject Key Identifier (SKI) value of a router key, as + described in . + + A router key's + subjectPublicKeyInfo value, as described in + . This is the + full ASN.1 DER encoding of the subjectPublicKeyInfo, + including the ASN.1 tag and length values of the + subjectPublicKeyInfo SEQUENCE. + + + Interval between normal cache polls. + See . + + + Interval between cache poll retries after a failed cache poll. + See . + + + Interval during which data fetched from a cache remains + valid in the absence of a successful subsequent cache poll. + See . + + + +
+ +
+ + The cache notifies the router that the cache has new data. + + + The Session ID reassures the router that the Serial Numbers + are commensurate, i.e., the cache session has not been + changed. + + + Upon receipt of a Serial Notify PDU, the router MAY issue an + immediate Serial Query () or + Reset Query () without waiting for + the Refresh Interval timer (see ) + to expire. + + + Serial Notify is the only message that the cache can send + that is not in response to a message from the router. + + + If the router receives a Serial Notify PDU during the + initial startup period where the router and cache are still + negotiating to agree on a protocol version, the router + MUST simply ignore the Serial Notify PDU, even if the + Serial Notify PDU is for an unexpected protocol version. + See for details. + + +
+ +0 8 16 24 31 +.-------------------------------------------. +| Protocol | PDU | | +| Version | Type | Session ID | +| 1 | 0 | | ++-------------------------------------------+ +| | +| Length=12 | +| | ++-------------------------------------------+ +| | +| Serial Number | +| | +`-------------------------------------------' + +
+
+ +
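The Serial Notify layout in the diagram above can be sketched with Python's `struct` module. This is an illustration only: the helper names are hypothetical, and big-endian (network) byte order is an assumption not stated in this section.

```python
import struct

# Fields from the diagram: Protocol Version (8 bits), PDU Type (8 bits),
# Session ID (16 bits), Length (32 bits), Serial Number (32 bits).
# "!" selects network byte order, which is assumed here.
SERIAL_NOTIFY = struct.Struct("!BBHII")


def pack_serial_notify(session_id: int, serial: int, version: int = 1) -> bytes:
    """Build a Serial Notify PDU (PDU Type 0, Length 12)."""
    return SERIAL_NOTIFY.pack(version, 0, session_id, SERIAL_NOTIFY.size, serial)


def unpack_serial_notify(pdu: bytes):
    """Return (version, session_id, serial) from a Serial Notify PDU."""
    version, pdu_type, session_id, length, serial = SERIAL_NOTIFY.unpack(pdu)
    if pdu_type != 0 or length != 12:
        raise ValueError("not a Serial Notify PDU")
    return version, session_id, serial
```

The Serial Query PDU has the same shape with PDU Type 1, so the same struct layout would apply to it.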
+ + The router sends a Serial Query to ask the cache + for all announcements and withdrawals which have + occurred since the Serial Number specified in the Serial + Query. + + + The cache replies to this query with a Cache Response PDU + () if the cache has a + (possibly null) record of the changes since the Serial Number + specified by the router, followed by zero or more payload + PDUs and an End Of Data PDU (). + + + When replying to a Serial Query, the cache MUST return the + minimum set of changes needed to bring the router into sync + with the cache. That is, if a particular prefix or router + key underwent multiple changes between the Serial Number + specified by the router and the cache's current Serial + Number, the cache MUST merge those changes to present the + simplest possible view of those changes to the router. In + general, this means that, for any particular prefix or + router key, the data stream will include at most one + withdrawal followed by at most one announcement, and if all + of the changes cancel out, the data stream will not mention + the prefix or router key at all. + + + The rationale for this approach is that the entire purpose of + the RPKI&nbhy;Router protocol is to offload work from the router + to the cache, and it should therefore be the cache's job to + simplify the change set, thus reducing work for the router. + + + If the cache does not have the data needed to update the + router, perhaps because its records do not go back to the + Serial Number in the Serial Query, then it responds with a + Cache Reset PDU (). + + + The Session ID tells the cache what instance the router + expects to ensure that the Serial Numbers are commensurate, + i.e., the cache session has not been changed. + +
+ +0 8 16 24 31 +.-------------------------------------------. +| Protocol | PDU | | +| Version | Type | Session ID | +| 1 | 1 | | ++-------------------------------------------+ +| | +| Length=12 | +| | ++-------------------------------------------+ +| | +| Serial Number | +| | +`-------------------------------------------' + +
+
+ +
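The "minimum set of changes" rule can be sketched as a set difference. The example below assumes, hypothetically, that the cache can reconstruct the full record set as of the router's Serial Number and as of its own current Serial Number; real caches may store deltas instead. Records are modelled as plain tuples, and `minimal_delta` is an illustrative name.

```python
def minimal_delta(old: set, new: set) -> list:
    """Return (flag, record) pairs that bring a router from `old` to `new`.

    flag is 0 for a withdrawal and 1 for an announcement. Records present in
    both sets are omitted, so changes that cancel out are never sent.
    Withdrawals are listed before announcements.
    """
    withdrawals = [(0, record) for record in sorted(old - new)]
    announcements = [(1, record) for record in sorted(new - old)]
    return withdrawals + announcements
```

A record that was announced and then withdrawn between the two Serial Numbers appears in neither set, so it is never mentioned in the response.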
+ + The router tells the cache that it wants to + receive the total active, current, non-withdrawn database. + The cache responds with a Cache Response PDU + (), followed by zero or more + payload PDUs and an End of Data PDU (). + +
+ +0 8 16 24 31 +.-------------------------------------------. +| Protocol | PDU | | +| Version | Type | zero | +| 1 | 2 | | ++-------------------------------------------+ +| | +| Length=8 | +| | +`-------------------------------------------' + +
+
+ +
+ + The cache responds to queries with zero or more payload + PDUs. When replying to a Serial Query + (), the cache sends the set of + announcements and withdrawals that have occurred since the + Serial Number sent by the client router. When replying to a + Reset Query (), the cache sends + the set of all data records it has; in this case, the + withdraw/announce field in the payload PDUs MUST have the + value 1 (announce). + + + In response to a Reset Query, the new value of the Session ID + tells the router the instance of the cache session for future + confirmation. In response to a Serial Query, the Session ID + being the same reassures the router that the Serial Numbers + are commensurate, i.e., the cache session has not been changed. + +
+ +0 8 16 24 31 +.-------------------------------------------. +| Protocol | PDU | | +| Version | Type | Session ID | +| 1 | 3 | | ++-------------------------------------------+ +| | +| Length=8 | +| | +`-------------------------------------------' + +
+
+ +
+
+ +0 8 16 24 31 +.-------------------------------------------. +| Protocol | PDU | | +| Version | Type | zero | +| 1 | 4 | | ++-------------------------------------------+ +| | +| Length=20 | +| | ++-------------------------------------------+ +| | Prefix | Max | | +| Flags | Length | Length | zero | +| | 0..32 | 0..32 | | ++-------------------------------------------+ +| | +| IPv4 Prefix | +| | ++-------------------------------------------+ +| | +| Autonomous System Number | +| | +`-------------------------------------------' + +
+ + The lowest-order bit of the Flags field is 1 for an + announcement and 0 for a withdrawal. + + + In the RPKI, nothing prevents a signing certificate from + issuing two identical ROAs. In this case, there would be no + semantic difference between the objects, merely a process + redundancy. + + + In the RPKI, there is also an actual need for what might + appear to a router as identical IPvX PDUs. + This can occur when an upstream certificate is being reissued + or there is an address ownership transfer up the validation + chain. The ROA would be identical in the router sense, + i.e., have the same {Prefix, Len, Max-Len, ASN}, but it would + have a different validation path in the RPKI. This is + important to the RPKI but not to the router. + + + The cache server MUST ensure that it has told the router + client to have one and only one IPvX PDU for a unique {Prefix, + Len, Max-Len, ASN} at any one point in time. Should the + router client receive an IPvX PDU with a {Prefix, Len, + Max-Len, ASN} identical to one it already has active, it + SHOULD raise a Duplicate Announcement Received error. + +
+ +
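As an illustration of the IPv4 Prefix layout above, the following Python sketch packs one PDU. The function name is hypothetical, network byte order is assumed, and the check that Max Length is not less than Prefix Length follows the data element description earlier in the document.

```python
import ipaddress
import struct

# Version, Type, zero (16 bits), Length, Flags, Prefix Length, Max Length,
# one zero pad byte, 4-byte prefix, 32-bit ASN: 20 bytes in total.
IPV4_PREFIX = struct.Struct("!BBHIBBBx4sI")


def pack_ipv4_prefix(prefix: str, max_len: int, asn: int,
                     announce: bool = True) -> bytes:
    """Build an IPv4 Prefix PDU (PDU Type 4, Length 20)."""
    net = ipaddress.IPv4Network(prefix)
    if not net.prefixlen <= max_len <= 32:
        raise ValueError("Max Length must be between Prefix Length and 32")
    flags = 1 if announce else 0  # lowest-order bit: 1 announce, 0 withdraw
    return IPV4_PREFIX.pack(1, 4, 0, IPV4_PREFIX.size, flags,
                            net.prefixlen, max_len,
                            net.network_address.packed, asn)
```

A router could use the tuple (prefix, prefix length, max length, ASN) as a dictionary key to detect the duplicate announcements described above.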
+
+ +0 8 16 24 31 +.-------------------------------------------. +| Protocol | PDU | | +| Version | Type | zero | +| 1 | 6 | | ++-------------------------------------------+ +| | +| Length=32 | +| | ++-------------------------------------------+ +| | Prefix | Max | | +| Flags | Length | Length | zero | +| | 0..128 | 0..128 | | ++-------------------------------------------+ +| | ++--- ---+ +| | ++--- IPv6 Prefix ---+ +| | ++--- ---+ +| | ++-------------------------------------------+ +| | +| Autonomous System Number | +| | +`-------------------------------------------' + +
+ + Analogous to the IPv4 Prefix PDU, except that the Prefix field is 96 bits longer; there are no other differences. + +
+ +
+ + The cache tells the router it has no more data for the request. + + + The Session ID and Protocol Version MUST be the same as that of + the corresponding Cache Response which began the (possibly null) + sequence of payload PDUs. + +
+ +0 8 16 24 31 +.-------------------------------------------. +| Protocol | PDU | | +| Version | Type | Session ID | +| 1 | 7 | | ++-------------------------------------------+ +| | +| Length=24 | +| | ++-------------------------------------------+ +| | +| Serial Number | +| | ++-------------------------------------------+ +| | +| Refresh Interval | +| | ++-------------------------------------------+ +| | +| Retry Interval | +| | ++-------------------------------------------+ +| | +| Expire Interval | +| | +`-------------------------------------------' + +
+ + The Refresh Interval, Retry Interval, and Expire Interval + are all 32-bit elapsed times measured in seconds. They express + the timing parameters which the cache expects the router to + use in deciding when to send subsequent Serial Query or + Reset Query PDUs to the cache. + See for an explanation of the use + and the range of allowed values for these parameters. + +
+ +
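A router-side parse of the End of Data PDU shown above might look like the following Python sketch. The `EndOfData` type and `parse_end_of_data` function are hypothetical names, and network byte order is assumed.

```python
import struct
from typing import NamedTuple

# Version, Type, Session ID, Length, Serial Number, then the three
# 32-bit intervals in seconds: 24 bytes in total.
END_OF_DATA = struct.Struct("!BBHIIIII")


class EndOfData(NamedTuple):
    version: int
    session_id: int
    serial: int
    refresh: int
    retry: int
    expire: int


def parse_end_of_data(pdu: bytes) -> EndOfData:
    """Decode a version 1 End of Data PDU (PDU Type 7, Length 24)."""
    (version, pdu_type, session_id, length,
     serial, refresh, retry, expire) = END_OF_DATA.unpack(pdu)
    if pdu_type != 7 or length != END_OF_DATA.size:
        raise ValueError("not a version 1 End of Data PDU")
    return EndOfData(version, session_id, serial, refresh, retry, expire)
```

After a successful parse, the router would set its current Serial Number to `serial` and restart its timers from the three interval values.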
+ + The cache may respond to a Serial Query with a Cache Reset PDU, + informing the router that the cache cannot provide an incremental + update starting from the Serial Number specified by the router. + The router must then decide whether to issue a Reset Query or + switch to a different cache. + +
+ +0 8 16 24 31 +.-------------------------------------------. +| Protocol | PDU | | +| Version | Type | zero | +| 1 | 8 | | ++-------------------------------------------+ +| | +| Length=8 | +| | +`-------------------------------------------' + +
+
+ +
+
+ +0 8 16 24 31 +.-------------------------------------------. +| Protocol | PDU | | | +| Version | Type | Flags | zero | +| 1 | 9 | | | ++-------------------------------------------+ +| | +| Length | +| | ++-------------------------------------------+ +| | ++--- ---+ +| Subject Key Identifier | ++--- ---+ +| | ++--- ---+ +| (20 octets) | ++--- ---+ +| | ++-------------------------------------------+ +| | +| AS Number | +| | ++-------------------------------------------+ +| | +| Subject Public Key Info | +| | +`-------------------------------------------' + +
+ + The lowest-order bit of the Flags field is 1 for an + announcement and 0 for a withdrawal. + + + The cache server MUST ensure that it has told the router + client to have one and only one Router Key PDU for a unique + {SKI, ASN, Subject Public Key} at any one point in time. + Should the router client receive a Router Key PDU with a + {SKI, ASN, Subject Public Key} identical to one it already + has active, it SHOULD raise a Duplicate Announcement + Received error. + + + Note that a particular ASN may appear in multiple Router Key + PDUs with different Subject Public Key values, while a + particular Subject Public Key value may appear in multiple + Router Key PDUs with different ASNs. In the interest of + keeping the announcement and withdrawal semantics as simple + as possible for the router, this protocol makes no attempt + to compress either of these cases. + + + Also note that it is possible, albeit very unlikely, for + multiple distinct Subject Public Key values to hash to the + same SKI. For this reason, implementations MUST compare + Subject Public Key values as well as SKIs when detecting + duplicate PDUs. + +
+ +
+ + This PDU is used by either party to report an error to the + other. + + + Error reports are only sent as responses to other PDUs, not + to report errors in Error Report PDUs. + + + Error codes are described in . + + + If the error is generic (e.g., "Internal Error") and not + associated with the PDU to which it is responding, the + Erroneous PDU field MUST be empty and the Length of + Encapsulated PDU field MUST be zero. + + + An Error Report PDU MUST NOT be sent for an Error Report PDU. + If an erroneous Error Report PDU is received, the session + SHOULD be dropped. + + + If the error is associated with a PDU of excessive length, + i.e., too long to be any legal PDU other than another Error + Report, or a possibly corrupt length, the Erroneous PDU field + MAY be truncated. + + + The diagnostic text is optional; if not present, the Length of + Error Text field MUST be zero. If error text is present, it + MUST be a string in UTF-8 encoding (see ). + +
+ +0 8 16 24 31 +.-------------------------------------------. +| Protocol | PDU | | +| Version | Type | Error Code | +| 1 | 10 | | ++-------------------------------------------+ +| | +| Length | +| | ++-------------------------------------------+ +| | +| Length of Encapsulated PDU | +| | ++-------------------------------------------+ +| | +~ Erroneous PDU ~ +| | ++-------------------------------------------+ +| | +| Length of Error Text | +| | ++-------------------------------------------+ +| | +| Arbitrary Text | +| of | +~ Error Diagnostic Message ~ +| | +`-------------------------------------------' + +
+
+ +
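The variable-length Error Report layout can be sketched as follows. This Python example is illustrative: `pack_error_report` is a hypothetical helper, and network byte order is assumed. The Length field counts the whole PDU: 8 header bytes, the two 4-byte length fields, and both variable parts.

```python
import struct


def pack_error_report(code: int, erroneous_pdu: bytes = b"",
                      text: str = "", version: int = 1) -> bytes:
    """Build an Error Report PDU (PDU Type 10).

    Leave `erroneous_pdu` empty for generic errors and `text` empty when no
    diagnostic message is supplied; both length fields are then zero.
    """
    text_bytes = text.encode("utf-8")  # diagnostic text must be UTF-8
    length = 8 + 4 + len(erroneous_pdu) + 4 + len(text_bytes)
    return (struct.pack("!BBHI", version, 10, code, length)
            + struct.pack("!I", len(erroneous_pdu)) + erroneous_pdu
            + struct.pack("!I", len(text_bytes)) + text_bytes)
```

For example, `pack_error_report(4, text="bad version")` builds an "Unsupported Protocol Version" report with no encapsulated PDU.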
+ +
+ + Since the data the cache distributes via the RPKI-Router protocol + are retrieved from the Global RPKI system at intervals which + are only known to the cache, only the cache can really know + how frequently it makes sense for the router to poll the + cache, or how long the data are likely to remain valid (or, at + least, unchanged). For this reason, as well as to allow the + cache some control over the load placed on it by its client + routers, the End Of Data PDU includes three values that allow + the cache to communicate timing parameters to the router: + + + + + Refresh Interval: This parameter tells the router how long to wait before + next attempting to poll the cache and between subsequent + attempts, using a Serial Query or Reset Query PDU. The + router SHOULD NOT poll the cache sooner than indicated by + this parameter. Note that receipt of a Serial Notify PDU + overrides this interval and suggests that the router issue + an immediate query without waiting for the Refresh + Interval to expire. Countdown for this timer starts upon + receipt of the containing End Of Data PDU. + + Minimum allowed value: 1 second. + Maximum allowed value: 86400 seconds (1 day). + Recommended default: 3600 seconds (1 hour). + + + + Retry Interval: This parameter tells the router how long to wait before + retrying a failed Serial Query or Reset Query. The router + SHOULD NOT retry sooner than indicated by this parameter. + Note that a protocol version mismatch overrides this + interval: if the router needs to downgrade to a lower + protocol version number, it MAY send the first Serial + Query or Reset Query immediately. Countdown for this + timer starts upon failure of the query and restarts after + each subsequent failure until a query succeeds. + + Minimum allowed value: 1 second. + Maximum allowed value: 7200 seconds (2 hours). + Recommended default: 600 seconds (10 minutes). + + + + Expire Interval: This parameter tells the router how long it can continue + to use the current version of the data while unable to + perform a successful subsequent query. The router MUST + NOT retain the data past the time indicated by this + parameter. 
Countdown for this timer starts upon receipt + of the containing End Of Data PDU. + + Minimum allowed value: 600 seconds (10 minutes). + Maximum allowed value: 172800 seconds (2 days). + Recommended default: 7200 seconds (2 hours). + + + + + + If the router has never issued a successful query against a + particular cache, it SHOULD retry periodically using the default + Retry Interval, above. + + + Caches MUST set Expire Interval to a value larger than + either Refresh Interval or Retry Interval. + +
+ +
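The limits above lend themselves to a simple sanity check on the values carried in an End Of Data PDU. The Python sketch below reads the three values listed for each parameter as minimum, maximum, and recommended default, and reads the final requirement as "Expire Interval larger than both of the other two"; both readings are assumptions, and `check_intervals` is a hypothetical helper.

```python
# (minimum, maximum, recommended default), all in seconds.
LIMITS = {
    "refresh": (1, 86400, 3600),
    "retry": (1, 7200, 600),
    "expire": (600, 172800, 7200),
}


def check_intervals(refresh: int, retry: int, expire: int) -> list:
    """Return a list of problems; an empty list means the values look valid."""
    problems = []
    for name, value in (("refresh", refresh), ("retry", retry),
                        ("expire", expire)):
        low, high, _default = LIMITS[name]
        if not low <= value <= high:
            problems.append(f"{name}={value} outside {low}..{high}")
    if expire <= max(refresh, retry):
        problems.append("expire must exceed both refresh and retry")
    return problems
```

What a router does with out-of-range values is not covered by this sketch; it only reports them.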
+ + A router MUST start each transport connection by issuing either a + Reset Query or a Serial Query. This query will tell the cache + which version of this protocol the router implements. + + + If a cache which supports version 1 receives a query from a + router which specifies version 0, the cache MUST downgrade to + protocol version 0 or send a version + 1 Error Report PDU with Error Code 4 ("Unsupported Protocol + Version") and terminate the connection. + + + If a router which supports version 1 sends a query to a cache + which only supports version 0, one of two things will happen: + + + The cache may terminate the connection, perhaps with a + version 0 Error Report PDU. In this case, the router MAY + retry the connection using protocol version 0. + + + The cache may reply with a version 0 response. In this + case, the router MUST either downgrade to version 0 or + terminate the connection. + + + + + In any of the downgraded combinations above, the new features + of version 1 will not be available, and all PDUs will have 0 + in their version fields. + + + If either party receives a PDU containing an unrecognized + Protocol Version (neither 0 nor 1) during this negotiation, it + MUST either downgrade to a known version or terminate the + connection, with an Error Report PDU unless the received PDU + is itself an Error Report PDU. + + + The router MUST ignore any Serial Notify PDUs it might receive + from the cache during this initial startup period, regardless + of the Protocol Version field in the Serial Notify PDU. Since + Session ID and Serial Number values are specific to a + particular protocol version, the values in the notification + are not useful to the router. 
Even if these values were + meaningful, the only effect that processing the notification + would have would be to trigger exactly the same Reset Query or + Serial Query that the router has already sent as part of the + not-yet-complete version negotiation process, so there is + nothing to be gained by processing notifications until version + negotiation completes. + + + Caches SHOULD NOT send Serial Notify PDUs before version + negotiation completes. Routers, however, MUST handle + such notifications (by ignoring them) for backwards + compatibility with caches serving protocol version 0. + + + Once the cache and router have agreed upon a Protocol Version + via the negotiation process above, that version is stable for + the life of the session. See for a + discussion of the interaction between Protocol Version and + Session ID. + + + If either party receives a PDU for a different Protocol + Version once the above negotiation completes, that party MUST + drop the session; unless the PDU containing the unexpected + Protocol Version was itself an Error Report PDU, the party + dropping the session SHOULD send an Error Report with an error + code of 8 ("Unexpected Protocol Version"). + +
+ +
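The cache side of the negotiation rules above can be sketched as a small decision function. This Python example is one possible reading, not a definitive implementation: `cache_choose_version` is a hypothetical name, and preferring the highest supported version below the router's is a design choice the text permits but does not require.

```python
def cache_choose_version(router_version: int,
                         supported: frozenset = frozenset({0, 1})):
    """Pick the protocol version a cache answers with.

    Returns the version to speak, or None when the cache should send an
    Error Report with Error Code 4 ("Unsupported Protocol Version") and
    terminate the connection.
    """
    if router_version in supported:
        return router_version
    # Unrecognized or unsupported version: downgrade to a known one if any
    # supported version is lower than what the router asked for.
    lower = [v for v in supported if v < router_version]
    return max(lower) if lower else None
```

A cache supporting only version 1 returns None for a version 0 query, which corresponds to the "send an Error Report and terminate" branch described above.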
+ + + The sequences of PDU transmissions fall into four + conversations as follows: + + +
+
+ +Cache Router + ~ ~ + | <----- Reset Query -------- | R requests data (or Serial Query) + | | + | ----- Cache Response -----> | C confirms request + | ------- Payload PDU ------> | C sends zero or more + | ------- Payload PDU ------> | IPv4 Prefix, IPv6 Prefix, + | ------- Payload PDU ------> | or Router Key PDUs + | ------- End of Data ------> | C sends End of Data + | | and sends new serial + ~ ~ + +
+ + When a transport connection is first established, the router + MUST send either a Reset Query or a Serial Query. A Serial + Query would be appropriate if the router has significant + unexpired data from a broken session with the same cache and + remembers the Session ID of that session, in which case a + Serial Query containing the Session ID from the previous + session will allow the router to bring itself up to date + while ensuring that the Serial Numbers are commensurate and + that the router and cache are speaking compatible versions + of the protocol. In all other cases, the router lacks the + necessary data for fast resynchronization and therefore + MUST fall back to a Reset Query. + + + The Reset Query sequence is also used when the router + receives a Cache Reset, chooses a new cache, or fears that + it has otherwise lost its way. + + + See for details on version + negotiation. + + + To limit the length of time a cache must keep the data + necessary to generate incremental updates, a router MUST + send either a Serial Query or a Reset Query periodically. + This also acts as a keep-alive at the application layer. + See for details on the required + polling frequency. + +
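The choice between the two opening queries can be sketched as follows. This is a minimal example, not a reference encoder: the function name is hypothetical, and the byte layout (one-byte version, one-byte PDU type, two-byte Session ID or zero, four-byte length, then the four-byte Serial Number for a Serial Query) is assumed from the PDU format section of this document, which is not repeated here.

```python
import struct

PDU_SERIAL_QUERY = 1
PDU_RESET_QUERY = 2


def initial_query(version, session_id=None, serial=None):
    """Build the first PDU a router sends on a new transport connection.

    A Serial Query is only possible when the router still remembers both
    the Session ID and the Serial Number of a previous session with this
    cache; in every other case it falls back to a Reset Query.
    """
    if session_id is not None and serial is not None:
        # version, type, Session ID, length = 12, Serial Number
        return struct.pack("!BBHII", version, PDU_SERIAL_QUERY,
                           session_id, 12, serial)
    # version, type, zero, length = 8
    return struct.pack("!BBHI", version, PDU_RESET_QUERY, 0, 8)
```

For example, `initial_query(1)` yields an 8-byte Reset Query, while `initial_query(1, session_id=0x1234, serial=5)` yields a 12-byte Serial Query.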
+ +
+
+
+Cache                         Router
+  ~                             ~
+  | -------- Notify ----------> | (optional)
+  |                             |
+  | <----- Serial Query ------- | R requests data
+  |                             |
+  | ----- Cache Response -----> | C confirms request
+  | ------- Payload PDU ------> | C sends zero or more
+  | ------- Payload PDU ------> |   IPv4 Prefix, IPv6 Prefix,
+  | ------- Payload PDU ------> |   or Router Key PDUs
+  | ------- End of Data ------> | C sends End of Data
+  |                             |   and sends new serial
+  ~                             ~
+
+
+ + The cache server SHOULD send a Serial Notify PDU with its current + Serial Number when the cache's serial changes, with the + expectation that the router MAY then issue a Serial Query + earlier than it otherwise might. This is analogous to DNS + NOTIFY in . The cache MUST rate-limit + Serial Notifies to no more frequently than one per minute. + + + When the transport layer is up and either a timer has gone + off in the router or the cache has sent a Serial Notify PDU, the router + queries for new data by sending a Serial Query, and the cache + sends all data newer than the serial in the Serial Query. + + + To limit the length of time a cache must keep old withdrawals, + a router MUST send either a Serial Query or a Reset Query + periodically. See for details on the + required polling frequency. + +
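One way a cache might enforce the one-per-minute limit on Serial Notifies is a simple per-router timestamp check. The class name and the injectable clock are assumptions of this sketch; only the 60-second minimum interval comes from the text above.

```python
import time


class NotifyLimiter:
    """Allow at most one Serial Notify per min_interval seconds."""

    def __init__(self, min_interval=60.0, clock=time.monotonic):
        self._min_interval = min_interval
        self._clock = clock          # injectable so the limiter can be tested
        self._last_sent = None

    def allow(self):
        """Return True if a Serial Notify may be sent now, and record it."""
        now = self._clock()
        if (self._last_sent is not None
                and now - self._last_sent < self._min_interval):
            return False
        self._last_sent = now
        return True
```

A cache would keep one limiter per router session and skip the notification when `allow()` returns False; the router still learns of the change at its next periodic query.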
+ +
+
+
+Cache                         Router
+  ~                             ~
+  | <------ Serial Query ------ | R requests data
+  | ------- Cache Reset ------> | C cannot supply update
+  |                             |   from specified serial
+  | <------ Reset Query ------- | R requests new data
+  | ----- Cache Response -----> | C confirms request
+  | ------- Payload PDU ------> | C sends zero or more
+  | ------- Payload PDU ------> |   IPv4 Prefix, IPv6 Prefix,
+  | ------- Payload PDU ------> |   or Router Key PDUs
+  | ------- End of Data ------> | C sends End of Data
+  |                             |   and sends new serial
+  ~                             ~
+
+
+ + The cache may respond to a Serial Query with a Cache Reset, + informing the router that the cache cannot supply an + incremental update from the Serial Number specified by the + router. This might be because the cache has lost state, or + because the router has waited too long between polls and the + cache has cleaned up old data that it no longer believes it + needs, or because the cache has run out of storage space and + had to expire some old data early. Regardless of how this + state arose, the cache replies with a Cache Reset to tell + the router that it cannot honor the request. When a router + receives this, the router SHOULD attempt to connect to any + more-preferred caches in its cache list. If there are + no more-preferred caches, it MUST issue a Reset Query and + get an entire new load from the cache. + +
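The router's reaction to a Cache Reset can be sketched as a small selection step. The function name and the idea of passing in the preference values of caches currently believed reachable are assumptions of this example; it also assumes that a lower preference value means more preferred, as in the router's cache list.

```python
def on_cache_reset(current_preference, reachable_preferences):
    """Decide what to do after receiving a Cache Reset.

    Returns ("switch", preference) when a more-preferred cache can be
    tried, otherwise ("reset_query", current_preference), meaning the
    router issues a Reset Query to the cache it is already using.
    """
    better = [p for p in reachable_preferences if p < current_preference]
    if better:
        return ("switch", min(better))
    return ("reset_query", current_preference)
```

For instance, a router using a cache with preference 20 that can also reach caches at 10 and 30 would try the cache at 10; if only the cache at 30 were reachable, it would send a Reset Query to its current cache.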
+ +
+
+
+Cache                         Router
+  ~                             ~
+  | <------ Serial Query ------ | R requests data
+  | ---- Error Report PDU ----> | C No Data Available
+  ~                             ~
+
+Cache                         Router
+  ~                             ~
+  | <------ Reset Query ------- | R requests data
+  | ---- Error Report PDU ----> | C No Data Available
+  ~                             ~
+
+
+ + The cache may respond to either a Serial Query or a Reset + Query informing the router that the cache cannot supply any + update at all. The most likely cause is that the cache has + lost state, perhaps due to a restart, and has not yet + recovered. While it is possible that a cache might go into + such a state without dropping any of its active sessions, + a router is more likely to see this behavior when it + initially connects and issues a Reset Query while the cache + is still rebuilding its database. + + + When a router receives this kind of error, the router + SHOULD attempt to connect to any other caches in its cache + list, in preference order. If no other caches are + available, the router MUST issue periodic Reset Queries + until it gets a new usable load from the cache. + +
+ +
+ +
+ + The transport-layer session between a router and a cache + carries the binary PDUs in a persistent session. + + + To prevent cache spoofing and DoS attacks by illegitimate + routers, it is highly desirable that the router and the cache + be authenticated to each other. Integrity protection for + payloads is also desirable to protect against + monkey-in-the-middle (MITM) attacks. Unfortunately, there is + no protocol to do so on all currently used platforms. + Therefore, as of the writing of this document, there is no + mandatory-to-implement transport which provides authentication + and integrity protection. + + + To reduce exposure to dropped but non-terminated sessions, both + caches and routers SHOULD enable keep-alives when available in + the chosen transport protocol. + + + It is expected that, when the TCP Authentication Option + (TCP-AO) is available on all + platforms deployed by operators, it will become the + mandatory-to-implement transport. + + + Caches and routers MUST implement unprotected transport over + TCP using a port, rpki-rtr (323); see + . Operators SHOULD use procedural means, + e.g., access control lists (ACLs), to reduce the exposure to + authentication issues. + + + If unprotected TCP is the transport, the cache and routers MUST be + on the same trusted and controlled network. + + + If available to the operator, caches and routers MUST use one + of the following more protected protocols: + + + + + Caches and routers SHOULD use TCP-AO transport + over the rpki-rtr port. + + + Caches and routers MAY use Secure Shell version 2 (SSHv2) transport + using the normal SSH port. For an + example, see . + + + Caches and routers MAY use TCP MD5 transport + using the rpki-rtr port. Note that + TCP MD5 has been obsoleted by TCP-AO + . + + + Caches and routers MAY use TCP over IPsec transport + using the rpki-rtr port. + + + Caches and routers MAY use Transport Layer Security (TLS) transport + using port rpki-rtr-tls (324); see + . + + + +
+ + To run over SSH, the client router first establishes an SSH + transport connection using the SSHv2 transport protocol, and + the client and server exchange keys for message integrity and + encryption. The client then invokes the "ssh-userauth" + service to authenticate the application, as described in the + SSH authentication protocol . + Once the application has been successfully + authenticated, the client invokes the "ssh-connection" + service, also known as the SSH connection protocol. + + + After the ssh-connection service is established, the client + opens a channel of type "session", which results in an SSH + session. + + + Once the SSH session has been established, the application + invokes the application transport as an SSH subsystem called + "rpki-rtr". Subsystem support is a feature of SSHv2 and is not + included in SSHv1. Running this protocol as an SSH subsystem + avoids the need for the application to recognize shell prompts + or skip over extraneous information, such as a system message + that is sent at shell startup. + + + It is assumed that the router and cache have exchanged keys + out of band by some reasonably secured means. + + + Cache servers supporting SSH transport MUST accept RSA + authentication and SHOULD accept Elliptic Curve Digital + Signature Algorithm (ECDSA) authentication. User + authentication MUST be supported; host authentication MAY be + supported. Implementations MAY support password + authentication. Client routers SHOULD verify the public key + of the cache to avoid MITM attacks. + +
+ +
+ + Client routers using TLS transport MUST present client-side + certificates to authenticate themselves to the cache in + order to allow the cache to manage the load by rejecting + connections from unauthorized routers. In principle, any + type of certificate and Certification Authority (CA) may be + used; however, in general, cache operators will wish to + create their own small-scale CA and issue certificates to + each authorized router. This simplifies credential + rollover; any unrevoked, unexpired certificate from the + proper CA may be used. + + + Certificates used to authenticate client routers in this + protocol MUST include a subjectAltName extension + + containing one or more iPAddress identities; when + authenticating the router's certificate, the cache MUST check + the IP address of the TLS connection against these iPAddress + identities and SHOULD reject the connection if none of the + iPAddress identities match the connection. + + + Routers MUST also verify the cache's TLS server certificate, + using subjectAltName dNSName identities as described in + , to avoid MITM attacks. The rules + and guidelines defined in apply here, + with the following considerations: + + + + + Support for the DNS-ID identifier type (that is, the dNSName + identity in the subjectAltName extension) is REQUIRED in + rpki-rtr server and client implementations which use TLS. + Certification authorities which issue rpki-rtr server + certificates MUST support the DNS-ID identifier type, and + the DNS-ID identifier type MUST be present in rpki-rtr + server certificates. + + + DNS names in rpki-rtr server certificates SHOULD NOT + contain the wildcard character "*". + + + rpki-rtr implementations which use TLS MUST NOT use + Common Name (CN-ID) identifiers; a CN field may be present + in the server certificate's subject name but MUST NOT be + used for authentication within the rules described in + . 
+ + + The client router MUST set its "reference identifier" to + the DNS name of the rpki-rtr cache. + + + +
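On the router side, the certificate rules above map fairly directly onto a standard TLS client configuration. The following is a sketch using Python's `ssl` module; the function name and the file path parameters are hypothetical, and the sketch does not attempt to reject wildcard names, which would need an additional check on the peer certificate.

```python
import ssl


def make_router_tls_context(ca_file=None, cert_file=None, key_file=None):
    """Build a client-side TLS context for a router connecting to a cache.

    ca_file:   CA bundle used to verify the cache's server certificate
               (system defaults are used when None).
    cert_file: the router's client certificate, presented to the cache.
    key_file:  private key matching cert_file.
    """
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    ctx.check_hostname = True             # match the reference identifier
    ctx.verify_mode = ssl.CERT_REQUIRED   # unverified caches are rejected
    if ssl.HAS_NEVER_CHECK_COMMON_NAME:
        # Use only subjectAltName dNSName entries, never the CN field.
        ctx.hostname_checks_common_name = False
    if cert_file is not None:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx
```

The reference identifier is supplied when the socket is wrapped, for example `ctx.wrap_socket(sock, server_hostname="cache.example.net")`, where the host name shown is a placeholder for the configured DNS name of the cache.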
+ +
+ + If TCP MD5 is used, implementations MUST support key lengths + of at least 80 printable ASCII bytes, per Section 4.5 of + . Implementations MUST also support + hexadecimal sequences of at least 32 characters, i.e., + 128 bits. + + + Key rollover with TCP MD5 is problematic. Cache servers + SHOULD support . + +
+ +
+ + Implementations MUST support key lengths of at least 80 + printable ASCII bytes. Implementations MUST also support + hexadecimal sequences of at least 32 characters, i.e., 128 + bits. Message Authentication Code (MAC) lengths of at least + 96 bits MUST be supported, per Section 5.1 of + . + + + The cryptographic algorithms and associated parameters described in + MUST be supported. + +
+ +
+ +
+ + A cache has the public authentication data for each router it + is configured to support. + + + A router may be configured to peer with a selection of caches, + and a cache may be configured to support a selection of routers. + Each must have the name of, and authentication data for, each + peer. In addition, in a router, this list has a non-unique + preference value for each server. This + preference merely denotes proximity, not trust, preferred + belief, et cetera. The client router attempts to establish + a session with each potential serving cache in preference order + and then starts to load data from the most preferred cache to which + it can connect and authenticate. The router's list of caches has + the following elements: + + + An unsigned integer denoting the router's preference to + connect to that cache; the lower the value, the more preferred. + + + The IP address or fully qualified domain name of the cache. + + + Any credential (such as a public key) needed to + authenticate the cache's identity to the router. + + + Any credential (such as a private key or certificate) + needed to authenticate the router's identity to the cache. + + + + + Due to the distributed nature of the RPKI, caches simply + cannot be rigorously synchronous. A client may hold data from + multiple caches but MUST keep the data marked as to source, as + later updates MUST affect the correct data. + + + Just as there may be more than one covering ROA from a single + cache, there may be multiple covering ROAs from multiple caches. + The results are as described in + . + + + If data from multiple caches are held, implementations MUST NOT + distinguish between data sources when performing validation of + BGP announcements. + + + When a more-preferred cache becomes available, if resources + allow, it would be prudent for the client to start fetching + from that cache. 
+ + + The client SHOULD attempt to maintain at least one set of data, + regardless of whether it has chosen a different cache or + established a new connection to the previous cache. + + + A client MAY drop the data from a particular cache when it is + fully in sync with one or more other caches. + + + See for details on what to do when the + client is not able to refresh from a particular cache. + + + If a client loses connectivity to a cache it is using or + otherwise decides to switch to a new cache, it SHOULD retain the + data from the previous cache until it has a full set of data + from one or more other caches. Note that this may already be + true at the point of connection loss if the client has + connections to more than one cache. + +
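A router's cache list and per-cache data store might be modelled as below. All class, field, and method names are hypothetical; the sketch only illustrates the rules stated above: lower preference values are tried first, data stays tagged by source so that later updates touch the right set, and validation sees the union of all sets without regard to source.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    preference: int               # lower value = more preferred, not unique
    address: str                  # IP address or fully qualified domain name
    cache_credential: str = ""    # authenticates the cache to the router
    router_credential: str = ""   # authenticates the router to the cache


def connection_order(caches):
    """Return caches in the order a router should try them.

    sorted() is stable, so caches with equal preference keep their
    configured order.
    """
    return sorted(caches, key=lambda c: c.preference)


class RouterData:
    """Payload records held by a router, kept separately per source cache."""

    def __init__(self):
        self._by_cache = {}

    def announce(self, cache, record):
        self._by_cache.setdefault(cache, set()).add(record)

    def withdraw(self, cache, record):
        # A withdrawal only affects data learned from the same cache.
        self._by_cache.get(cache, set()).discard(record)

    def drop_cache(self, cache):
        self._by_cache.pop(cache, None)

    def all_records(self):
        """Union used for validation; the source is deliberately ignored."""
        return set().union(*self._by_cache.values())
```

Records are represented here as plain hashable tuples such as `("192.0.2.0", 24, 24, 64496)`; a real implementation would use whatever structure it derives from the Prefix and Router Key PDUs.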
+ +
+ + For illustration, we present three likely deployment + scenarios: + + + The small multihomed end site may wish to outsource the + RPKI cache to one or more of their upstream ISPs. They + would exchange authentication material with the ISP using + some out-of-band mechanism, and their router(s) would + connect to the cache(s) of one or more upstream ISPs. The + ISPs would likely deploy caches intended for customer use + separately from the caches with which their own BGP + speakers peer. + + + A larger multihomed end site might run one or more caches, + arranging them in a hierarchy of client caches, each fetching + from a serving cache which is closer to the Global RPKI. They + might configure fallback peerings to upstream ISP caches. + + + A large ISP would likely have one or more redundant caches + in each major point of presence (PoP), and these caches + would fetch from each other in an ISP-dependent topology + so as not to place undue load on the Global RPKI. + + + + + Experience with large DNS cache deployments has shown that + complex topologies are ill-advised, as it is easy to make errors + in the graph, e.g., failing to maintain a loop-free condition. + + + Of course, these are illustrations, and there are other possible + deployment strategies. It is expected that minimizing load on + the Global RPKI servers will be a major consideration. + + + To keep load on Global RPKI services from unnecessary peaks, it + is recommended that primary caches which load from the + distributed Global RPKI not do so all at the same time, e.g., on + the hour. Choose a random time, perhaps the ISP's AS number + modulo 60, and jitter the inter-fetch timing. + +
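The scheduling advice in the last paragraph can be sketched in a few lines. The function names, the hourly base interval, and the 10% jitter fraction are assumptions for illustration; only the "AS number modulo 60" offset and the idea of jittering the interval come from the text.

```python
import random


def fetch_minute(asn):
    """Minute past the hour at which a primary cache starts its fetch."""
    return asn % 60


def next_interval(base_seconds, jitter_fraction=0.1, rng=random):
    """Return the delay until the next fetch, jittered around base_seconds.

    The result lies within base_seconds * (1 +/- jitter_fraction).
    """
    return base_seconds * (1 + rng.uniform(-jitter_fraction, jitter_fraction))
```

With these definitions, a cache operated by AS 64496 would start at minute 56 of the hour, and `next_interval(3600)` would return a value between 3240 and 3960 seconds.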
+ +
+ + This section contains a preliminary list of error codes. The + authors expect additions to the list during development of + the initial implementations. There is an IANA registry where + valid error codes are listed; see . Errors + which are considered fatal MUST cause the session to be + dropped. + + + The receiver believes the received PDU to be corrupt in a + manner not specified by another error code. + + + The party reporting the error experienced some kind of + internal error unrelated to protocol operation (ran out of + memory, a coding assertion failed, et cetera). + + + The cache believes itself to be in good working order but + is unable to answer either a Serial Query or a Reset Query + because it has no useful data available at this time. This + is likely to be a temporary error and most likely indicates + that the cache has not yet completed pulling down an initial + current data set from the Global RPKI system after some kind + of event that invalidated whatever data it might have + previously held (reboot, network partition, et cetera). + + + The cache server believes the client's request to be + invalid. + + + The Protocol Version is not known by the receiver of the + PDU. + + + The PDU Type is not known by the receiver of the PDU. + + + The received PDU has Flag=0, but a matching record + ({Prefix, Len, Max-Len, ASN} tuple for an IPvX PDU or + {SKI, ASN, Subject Public Key} tuple for a Router Key PDU) + does not exist in the receiver's database. + + + The received PDU has Flag=1, but a matching record + ({Prefix, Len, Max-Len, ASN} tuple for an IPvX PDU or + {SKI, ASN, Subject Public Key} tuple for a Router Key PDU) + is already active in the router. + + + The received PDU has a Protocol Version field that differs + from the protocol version negotiated in + . + + + + +
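For implementers, the list above might be captured as an enumeration. The code labels are not visible in the descriptions as rendered here, so the names and the fatal/non-fatal split in this sketch are taken from the published RFC 8210 error code list (codes 4 and 8 are also named elsewhere in this document) and should be checked against the IANA registry; the helper function name is hypothetical.

```python
from enum import IntEnum


class ErrorCode(IntEnum):
    CORRUPT_DATA = 0
    INTERNAL_ERROR = 1
    NO_DATA_AVAILABLE = 2
    INVALID_REQUEST = 3
    UNSUPPORTED_PROTOCOL_VERSION = 4
    UNSUPPORTED_PDU_TYPE = 5
    WITHDRAWAL_OF_UNKNOWN_RECORD = 6
    DUPLICATE_ANNOUNCEMENT_RECEIVED = 7
    UNEXPECTED_PROTOCOL_VERSION = 8


# Assumption: only "No Data Available" is non-fatal, as it describes a
# temporary condition while the cache rebuilds its data set.
NON_FATAL = frozenset({ErrorCode.NO_DATA_AVAILABLE})


def must_drop_session(code):
    """Return True if an Error Report with this code ends the session.

    Raises ValueError for a code that is not in the enumeration.
    """
    return ErrorCode(code) not in NON_FATAL
```

A router receiving code 2 would keep the session and retry later or try another cache, while any other listed code leads to the session being dropped.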
+ +
+ + As this document describes a security protocol, many aspects of + security interest are described in the relevant sections. This + section points out issues which may not be obvious in other + sections. + + + In order for a collection of caches as described in + to guarantee a consistent view, + they need to be given consistent trust anchors to use in their + internal validation process. Distribution of a consistent + trust anchor is assumed to be out of band. + + + The router initiates a transport connection to a cache, which it + identifies by either IP address or fully qualified domain + name. Be aware that a DNS or address spoofing attack could + make the correct cache unreachable. No session would be + established, as the authorization keys would not match. + + + The RPKI relies on object, not server or transport, trust. + That is, the IANA root trust anchor is distributed to all + caches through some out-of-band means and can then be + used by each cache to validate certificates and ROAs all + the way down the tree. The inter-cache relationships are + based on this object security model; hence, the + inter-cache transport can be lightly protected. + + + However, this protocol document assumes that the routers cannot + do the validation cryptography. Hence, the last link, from + cache to router, is secured by server authentication and + transport-level security. This is dangerous, as server + authentication and transport have very different threat models + than object security. + + + So the strength of the trust relationship and the transport + between the router(s) and the cache(s) are critical. You're + betting your routing on this. + + + While we cannot say the cache must be on the same LAN, if + only due to the issue of an enterprise wanting to offload the + cache task to their upstream ISP(s), locality, trust, and + control are very critical issues here. 
The cache(s) really + SHOULD be as close, in the sense of controlled and protected + (against DDoS, MITM) transport, to the router(s) as possible. + It also SHOULD be topologically close so that a minimum of + validated routing data are needed to bootstrap a router's access + to a cache. + + + The identity of the cache server SHOULD be verified and + authenticated by the router client, and vice versa, before any + data are exchanged. + + + Transports which cannot provide the necessary authentication + and integrity (see ) must rely on + network design and operational controls to provide protection + against spoofing/corruption attacks. As pointed out in + , TCP-AO is the long-term plan. + Protocols which provide integrity and authenticity SHOULD be + used, and if they cannot, i.e., TCP is used as the transport, + the router and cache MUST be on the same trusted, controlled + network. + + + +
+ +
+ + This section only discusses updates required in the existing + IANA protocol registries to accommodate version 1 of this + protocol. See for IANA considerations + from the original (version 0) protocol. + + + All existing entries in the IANA "rpki-rtr-pdu" registry + remain valid for protocol version 0. All of the PDU types + allowed in protocol version 0 are also allowed in protocol + version 1, with the addition of the new Router Key PDU. To + reduce the likelihood of confusion, the PDU number used by the + Router Key PDU in protocol version 1 is hereby registered as + reserved (and unused) in protocol version 0. + + + The policy for adding to the registry is RFC Required per + ; the document must be either Standards Track or + Experimental. + + + The "rpki-rtr-pdu" registry has been updated as follows: + +
+
+   Protocol   PDU
+   Version    Type  Description
+   --------   ----  ---------------
+     0-1         0  Serial Notify
+     0-1         1  Serial Query
+     0-1         2  Reset Query
+     0-1         3  Cache Response
+     0-1         4  IPv4 Prefix
+     0-1         6  IPv6 Prefix
+     0-1         7  End of Data
+     0-1         8  Cache Reset
+      0          9  Reserved
+      1          9  Router Key
+     0-1        10  Error Report
+     0-1       255  Reserved
+
+
+ + All existing entries in the IANA "rpki-rtr-error" registry + remain valid for all protocol versions. Protocol version 1 + adds one new error code: + +
+
+   Error
+   Code    Description
+   -----   ---------------------------
+     8     Unexpected Protocol Version
+
+
+
+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + BGPsec Algorithms, Key Formats, and Signature Formats + + + + + + + + + + + + + + + + + + + + + + + + +
+ + The authors wish to thank + Nils Bars, + Steve Bellovin, + Tim Bruijnzeels, + Rex Fernando, + Richard Hansen, + Paul Hoffman, + Fabian Holler, + Russ Housley, + Pradosh Mohapatra, + Keyur Patel, + David Mandelberg, + Sandy Murphy, + Robert Raszuk, + Andreas Reuter, + Thomas C. Schmidt, + John Scudder, + Ruediger Volk, + Matthias Waehlisch, + and + David Ward. + Particular thanks go to Hannes Gredler for showing us the + dangers of unnecessary fields. + + + No doubt this list is incomplete. We apologize to any + contributor whose name we missed. + +
+ +
+ +