From 5deadcb09b653ffc481507c071a1985f48d951d1 Mon Sep 17 00:00:00 2001 From: Randy Bush Date: Mon, 5 Dec 2022 12:17:00 -0800 Subject: [PATCH] fetched from rfced version and hacked to compile --- draft-ymbk-9092update.xml | 1140 +++++++++++++++++++++++++++++++++++++ 1 file changed, 1140 insertions(+) create mode 100644 draft-ymbk-9092update.xml diff --git a/draft-ymbk-9092update.xml b/draft-ymbk-9092update.xml new file mode 100644 index 0000000..10c5452 --- /dev/null +++ b/draft-ymbk-9092update.xml @@ -0,0 +1,1140 @@ + + + + + + + + + + + + + + + Finding and Using Geofeed Data + + + + + IIJ & Arrcus +
+ + 5147 Crystal Springs + Bainbridge Island + Washington + 98110 + United States of America + + randy@psg.com +
+
+ + + NTT +
+ + Siriusdreef 70-72 + Hoofddorp + 2132 WT + Netherlands + + massimo@ntt.net +
+
+ + + Google +
+ + 1600 Amphitheatre Parkway + Mountain View + CA + 94043 + United States of America + + warren@kumari.net +
+
+ + + Vigil Security, LLC +
+ + 516 Dranesville Road + Herndon + VA + 20170 + United States of America + + housley@vigilsec.com +
+
+ + + +geolocation +geo-location +RPSL + + + + This document specifies how to augment the Routing Policy + Specification Language inetnum: class to refer specifically to + geofeed data comma-separated values (CSV) files and describes an + optional scheme that uses the Routing Public Key Infrastructure + to authenticate the geofeed data CSV files. + + + +
+ + +
+ Introduction + + Providers of Internet content and other services may wish to + customize those services based on the geographic location of the + user of the service. This is often done using the source IP + address used to contact the service. Also, infrastructure and + other services might wish to publish the locale of their + services. defines + geofeed, a syntax to associate geographic locales with IP + addresses, but it does not specify how to find the relevant + geofeed data given an IP address. + + + This document specifies how to augment the Routing Policy + Specification Language (RPSL) inetnum: class to refer specifically to + geofeed data CSV files and how to prudently use them. In all + places inetnum: is used, inet6num: should also be assumed . + + + The reader may find + and informative, and + certainly more verbose, descriptions of the inetnum: database + classes. + + + An optional utterly awesome but slightly complex means for + authenticating geofeed data is also defined. + +
+ Requirements Language + + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL + NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", + "MAY", and "OPTIONAL" in this document are to be interpreted as + described in BCP 14 when, and only when, they appear in all + capitals, as shown here. + + +
+
+
+ Geofeed Files + + Geofeed files are described in . They + provide a facility for an IP address resource "owner" to + associate those IP addresses to geographic locales. + + + + Content providers and other parties who wish to locate an IP address + to a geographic locale need to find the relevant geofeed data. In + , this document specifies how + to find the relevant geofeed + file given an IP address. + + + Geofeed data for large providers with significant horizontal + scale and high granularity can be quite large. The size of a + file can be even larger if an unsigned geofeed file combines + data for many prefixes, if dual IPv4/IPv6 spaces are represented, + etc. + + + Geofeed data do have privacy considerations (see ); this process makes bulk access + to those data easier. + + + This document also suggests an optional signature to strongly + authenticate the data in the geofeed files. + +
+
+ inetnum: Class + + + + The original RPSL specifications starting with , , and a trail of + subsequent documents were written by the RIPE community. The IETF + standardized RPSL in and . Since then, it has been modified and + extensively enhanced in the Regional Internet Registry (RIR) + community, mostly by RIPE . Currently, + change control effectively lies in the operator community. + + + + The RPSL, and and used by the + Regional Internet Registries (RIRs), specify the inetnum: + database class. Each of these objects describes an IP address + range and its attributes. The inetnum: objects form a hierarchy + ordered on the address space. + + + + Ideally, RPSL would be augmented to define a new RPSL geofeed: + attribute in the inetnum: class. Until such time, this document + defines the syntax of a Geofeed remarks: attribute, which contains an + HTTPS URL of a geofeed file. The format of the inetnum: geofeed + remarks: attribute MUST be as in this example, + "remarks: Geofeed ", where the token "Geofeed " MUST be + case sensitive, followed by a URL that will vary, but it + MUST refer only to a single geofeed file. + + + + + While we leave global agreement of RPSL modification to the relevant + parties, we specify that a proper geofeed: attribute in the inetnum: + class MUST be "geofeed:" and MUST be + followed by a single URL that will vary, but it MUST + refer only to a single geofeed file. + + + + Registries MAY, for the interim, provide a mix of the remarks: + attribute form and the geofeed: attribute form. + + + The URL uses HTTPS, so the WebPKI provides authentication, integrity, + and confidentiality for the fetched geofeed file. However, the WebPKI + can not provide authentication of IP address space assignment. In + contrast, the RPKI (see ) can + be used to authenticate IP space assignment; see optional + authentication in . 
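The two attribute forms described above can be recognized with a short parser. The following is a minimal sketch, not a normative grammar: the function name `geofeed_url` and the regular expressions are illustrative assumptions, and the whitespace tolerated after the attribute name is a guess at what registries emit. It honors the case-sensitive "Geofeed " token, requires a single URL, and rejects anything that is not HTTPS.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical patterns for the two forms named in the text.
# The "Geofeed " token is matched case sensitively, with one trailing space.
_REMARKS_FORM = re.compile(r"^remarks:\s*Geofeed (\S+)\s*$")
_GEOFEED_FORM = re.compile(r"^geofeed:\s*(\S+)\s*$")


def geofeed_url(attribute_line: str) -> Optional[str]:
    """Return the HTTPS geofeed URL carried by one attribute line, else None."""
    match = (_REMARKS_FORM.match(attribute_line)
             or _GEOFEED_FORM.match(attribute_line))
    if match is None:
        return None
    url = match.group(1)
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed URL (for example, an unbalanced IPv6 literal).
        return None
    if parts.scheme != "https" or not parts.netloc:
        return None
    return url
```

A consumer would apply this to every attribute line of an inetnum: object and collect the non-None results.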
+ + + + Until all producers of inetnum: objects, i.e., the RIRs, state that they + have migrated to supporting a geofeed: attribute, consumers + looking at inetnum: objects to find geofeed URLs MUST be able to + consume both the remarks: and geofeed: forms. + + + The migration not only implies that the RIRs support the geofeed: + attribute, but that all registrants have migrated any inetnum: objects + from remarks: to geofeed: attributes. + + + + Any particular inetnum: object MUST have, at most, one geofeed + reference, whether a remarks: or a proper geofeed: attribute + when it is implemented. If there is more than one, all are + ignored. + + + If a geofeed CSV file describes multiple disjoint ranges of IP + address space, there are likely to be geofeed references from + multiple inetnum: objects. Files with geofeed references from + multiple inetnum: objects are not compatible with the signing + procedure in . + + + When geofeed references are provided by multiple inetnum: + objects that have identical address ranges, then the geofeed + reference on the inetnum: with the most recent last-modified: + attribute SHOULD be preferred. + + + As inetnum: objects form a hierarchy, geofeed references SHOULD + be at the lowest applicable inetnum: object covering the + relevant address ranges in the referenced geofeed file. When + fetching, the most specific inetnum: object with a geofeed + reference MUST be used. + + + It is significant that geofeed data may have finer granularity + than the inetnum: that refers to them. For example, an INETNUM + object for an address range P could refer to a geofeed file in + which P has been subdivided into one or more longer prefixes. 
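The selection rules in this section (at most one reference per object, most specific object wins, most recent last-modified: breaks ties between identical ranges) can be sketched as follows. This is an illustration under stated assumptions: the `Inetnum` class and `select_geofeed` function are hypothetical names, the objects are assumed to be already parsed, last-modified: values are assumed to be ISO 8601 UTC strings that sort lexicographically, and an object whose references are all ignored is treated here as having no geofeed reference.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import List, Optional, Union

Address = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


@dataclass
class Inetnum:
    """Hypothetical, already-parsed inetnum:/inet6num: object."""
    first: Address
    last: Address
    geofeed_refs: List[str] = field(default_factory=list)
    last_modified: str = ""  # assumed ISO 8601 UTC, e.g. 2022-12-05T20:17:00Z

    def usable_ref(self) -> Optional[str]:
        # More than one reference on an object: all are ignored.
        return self.geofeed_refs[0] if len(self.geofeed_refs) == 1 else None

    def size(self) -> int:
        return int(self.last) - int(self.first)


def select_geofeed(objects: List[Inetnum], address: str) -> Optional[str]:
    """Pick the geofeed URL of the most specific covering object."""
    addr = ipaddress.ip_address(address)
    candidates = [
        obj for obj in objects
        if obj.usable_ref() is not None
        and obj.first.version == addr.version
        and obj.first <= addr <= obj.last
    ]
    if not candidates:
        return None
    smallest = min(obj.size() for obj in candidates)
    tied = [obj for obj in candidates if obj.size() == smallest]
    # Identical ranges: prefer the most recently modified object.
    return max(tied, key=lambda obj: obj.last_modified).usable_ref()
```

Because inetnum: objects form a hierarchy, the smallest covering range is taken to be the most specific one.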
+ + + Currently, the registry data published by ARIN are not the same RPSL as + that of the other registries (see for a survey of the WHOIS Tower of Babel); + therefore, when fetching from ARIN via FTP , WHOIS , + the Registration Data Access Protocol (RDAP) , etc., the "NetRange" attribute/key + MUST be treated as "inetnum", and the "Comment" + attribute MUST be treated as "remarks". + + +
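The ARIN mapping above amounts to renaming two keys before the record is handled like any other inetnum: object. A minimal sketch, assuming the record has already been parsed into (key, value) pairs; the actual record layout differs between FTP, WHOIS, and RDAP, and the names `ARIN_KEY_MAP` and `normalize_arin` are illustrative only.

```python
from typing import List, Tuple

# Mapping stated in the text: "NetRange" -> "inetnum", "Comment" -> "remarks".
ARIN_KEY_MAP = {"NetRange": "inetnum", "Comment": "remarks"}


def normalize_arin(attributes: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Rename ARIN keys so later processing sees RPSL-style attribute names."""
    return [(ARIN_KEY_MAP.get(key, key), value) for key, value in attributes]
```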
+
+ Authenticating Geofeed Data + + The question arises whether a particular geofeed data set is valid, i.e., is + authorized by the "owner" of the IP address space and is authoritative + in some sense. The inetnum: that points to the geofeed file provides some assurance. + Unfortunately, the RPSL in many repositories is weakly authenticated + at best. An approach where RPSL was signed per would be good, except it would have to be deployed + by all RPSL registries, and there is a fair number of them. + + + A single optional authenticator MAY be appended to a + geofeed file. It is a + digest of the main body of the file signed by the private key of the + relevant RPKI certificate for a covering address range. One needs a + format that bundles the relevant RPKI certificate with the signature + of the geofeed text. + + + The canonicalization procedure converts the data from their internal + character representation to the UTF-8 character encoding, and the <CRLF> sequence + MUST be used to denote the end of a line of text. A + blank line is represented solely by the <CRLF> sequence. For + robustness, any non-printable characters MUST NOT be + changed by canonicalization. Trailing blank lines MUST + NOT appear at the end of the file. That is, the file must not + end with multiple consecutive <CRLF> sequences. Any end-of-file + marker used by an operating system is not considered to be part of the + file content. When present, such end-of-file markers MUST + NOT be processed by the digital signature algorithm. + + + Should the authenticator be syntactically incorrect per the + above, the authenticator is invalid. + + + + + Borrowing detached signatures from , after file canonicalization, the Cryptographic + Message Syntax (CMS) would + be used to create a detached DER-encoded signature that is then padded + BASE64 encoded (as per ) and line wrapped to 72 or fewer + characters. 
The same digest algorithm MUST be used for + calculating the message digest on content being signed, which is the + geofeed file, and for calculating the message digest on the SignerInfo + SignedAttributes . The + message digest algorithm identifier MUST appear in both + the SignedData DigestAlgorithmIdentifiers and the SignerInfo + DigestAlgorithmIdentifier . + + + The address range of the signing certificate MUST cover all + prefixes in the geofeed file it signs. + + + An address range A "covers" address range B if the range of B is + identical to or a subset of A. "Address range" is used here because + inetnum: objects and RPKI certificates need not align on Classless + Inter-Domain Routing (CIDR) prefix + boundaries, while those of the CSV lines in a geofeed file do. + + + As the signer specifies the covered RPKI resources relevant to the + signature, the RPKI certificate covering the inetnum: object's address + range is included in the CMS + SignedData certificates field. + + + Identifying the private key associated with the certificate and + getting the department that controls the private key (which might be + trapped in a Hardware Security Module (HSM)) to sign the CMS blob is + left as an exercise for the implementor. On the other hand, verifying + the signature requires no complexity; the certificate, which can be + validated in the public RPKI, has the needed public key. + + The trust anchors for the RIRs are expected to already be + available to the party performing signature validation. + Validation of the CMS signature on the geofeed file + involves: +
  1. + Obtaining the signer's certificate from the CMS SignedData + CertificateSet . The certificate + SubjectKeyIdentifier extension + MUST match the SubjectKeyIdentifier in the CMS SignerInfo + SignerIdentifier . If the key + identifiers do not match, then validation MUST fail. + + Validation of the signer's certificate MUST ensure + that it is part of the current manifest and that the resources are covered by + the RPKI certificate. + + +
  2. + Constructing the certification path for the signer's certificate. + All of the needed certificates are expected to be readily + available in the RPKI repository. The certification path MUST + be valid according to the validation algorithm in and the additional checks specified in + associated with the IP Address + Delegation certificate extension and the Autonomous System + Identifier Delegation certificate extension. If certification + path validation is unsuccessful, then validation MUST fail. +
  3. + Validating the CMS SignedData as specified in using the public key from the validated + signer's certificate. If the signature validation is + unsuccessful, then validation MUST fail. +
  4. + Verifying that the IP Address Delegation certificate extension + covers all of the address ranges of + the geofeed file. If any of the address ranges is not + covered, then validation MUST fail. + +
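The final step above, together with the definition of "covers" given earlier in this section, can be sketched as follows. This is only the range arithmetic, not RPKI validation: the certificate's IP Address Delegation extension is assumed to have been decoded already into (first, last) address pairs with adjacent ranges merged, and the names `covers`, `geofeed_prefixes`, and `all_covered` are hypothetical.

```python
import csv
import ipaddress
from typing import Iterator, List, Tuple


def covers(a: Tuple, b: Tuple) -> bool:
    """Range A covers range B if B is identical to or a subset of A."""
    return a[0] <= b[0] and b[1] <= a[1]


def geofeed_prefixes(text: str) -> Iterator:
    """Yield the prefix of each data line, skipping blanks and "#" comments."""
    for row in csv.reader(text.splitlines()):
        if not row or row[0].lstrip().startswith("#"):
            continue
        yield ipaddress.ip_network(row[0].strip())


def all_covered(text: str, cert_ranges: List[Tuple]) -> bool:
    """True only if every geofeed prefix is covered by some certificate range."""
    for net in geofeed_prefixes(text):
        wanted = (net.network_address, net.broadcast_address)
        if not any(r[0].version == net.version and covers(r, wanted)
                   for r in cert_ranges):
            return False
    return True
```

A prefix that is not syntactically valid raises ValueError here; a validator would treat that as a failure as well.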
+ + All of these steps MUST be successful to consider the geofeed + file signature as valid. + + + The appendix MUST be hidden as a series of "#" comments at the + end of the geofeed file. The following is a cryptographically + incorrect, albeit simple, example. A correct and full example is + in . + + + + The signature does not cover the signature lines. + + + The bracketing "# RPKI Signature:" and "# End Signature:" + MUST be present following the model as shown. + Their IP address range MUST match that of the + inetnum: URL followed to the file. + + + describes + and provides code for a CMS profile for + a general purpose listing of checksums (a "checklist") for use with + the Resource Public Key Infrastructure (RPKI). It provides usable, + albeit complex, code to sign geofeed files. + + + describes + a CMS profile for a general purpose Resource Tagged Attestation (RTA) + based on the RPKI. While this is expected to become applicable in the + long run, for the purposes of this document, a self-signed root trust + anchor is used. + 
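The canonicalization rules and the "#" comment appendix described in this section can be sketched as follows. This does not create or verify a CMS signature; it only prepares the bytes to be signed and extracts the DER blob. Several details are assumptions: the function names, the choice to strip trailing blank lines rather than reject the file, the treatment of LF and CRLF (but not bare CR) as line ends, and the placement of "# " before each BASE64 line with the wrap width chosen so the whole line stays within 72 characters.

```python
import base64
import re
from typing import Optional, Tuple

SIG_BEGIN = "# RPKI Signature:"
SIG_END = "# End Signature:"


def canonicalize(text: str) -> bytes:
    """UTF-8 bytes with CRLF line ends and no trailing blank lines."""
    lines = re.split(r"\r?\n", text)
    while lines and lines[-1] == "":
        lines.pop()
    return "".join(line + "\r\n" for line in lines).encode("utf-8")


def make_appendix(der: bytes, address_range: str) -> str:
    """Wrap a detached DER signature as "#" comment lines (assumed layout)."""
    b64 = base64.b64encode(der).decode("ascii")
    body = ["# " + b64[i:i + 70] for i in range(0, len(b64), 70)]
    lines = [f"{SIG_BEGIN} {address_range}", *body, f"{SIG_END} {address_range}"]
    return "".join(line + "\r\n" for line in lines)


def split_signature(text: str) -> Tuple[bytes, Optional[str], Optional[bytes]]:
    """Return (bytes to verify, address range, DER signature)."""
    lines = re.split(r"\r?\n", text)
    begins = [i for i, l in enumerate(lines) if l.startswith(SIG_BEGIN)]
    if not begins:
        return canonicalize(text), None, None  # unsigned file
    ends = [i for i, l in enumerate(lines) if l.startswith(SIG_END)]
    if len(begins) != 1 or len(ends) != 1 or ends[0] < begins[0]:
        raise ValueError("malformed signature appendix")
    begin, end = begins[0], ends[0]
    range_begin = lines[begin][len(SIG_BEGIN):].strip()
    range_end = lines[end][len(SIG_END):].strip()
    if range_begin != range_end:
        raise ValueError("bracketing lines name different address ranges")
    b64 = "".join(l.lstrip("#").strip() for l in lines[begin + 1:end])
    der = base64.b64decode(b64, validate=True)  # raises on bad BASE64
    # The signature does not cover the signature lines themselves.
    return canonicalize("\n".join(lines[:begin])), range_begin, der
```

The returned address range still has to be compared with the range of the inetnum: object whose URL was followed, and the DER blob handed to a CMS library for the validation steps listed above.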
+
+ Operational Considerations + + + To create the needed inetnum: objects, an operator wishing to register + the location of their geofeed file needs to coordinate with their + Regional Internet Registry (RIR) or National Internet Registry (NIR) + and/or any provider Local Internet Registry (LIR) that has assigned + address ranges to them. RIRs/NIRs provide means for assignees to + create and maintain inetnum: objects. They also provide means of + assigning or sub-assigning IP address resources and allowing the + assignee to create WHOIS data, including inetnum: objects, thereby + referring to geofeed files. + + + The geofeed files MUST be published via and fetched using + HTTPS . + + + When using data from a geofeed file, one MUST ignore data + outside the referring inetnum: object's inetnum: attribute + address range. + + + If and only if the geofeed file is not signed per , then multiple inetnum: objects MAY + refer to the same geofeed file, and the consumer MUST + use only lines in the geofeed file where the prefix is covered by the + address range of the inetnum: object's URL it has followed. + + + If the geofeed file is signed, and the signer's certificate + changes, the signature in the geofeed file MUST be updated. + + + + It is good key hygiene to use a given key for only one purpose. + To dedicate a signing private key for signing a geofeed file, an + RPKI Certification Authority (CA) may issue a subordinate certificate exclusively for + the purpose shown in . + + + To minimize the load on RIR WHOIS services, the RIR's FTP services SHOULD be + used for large-scale access to gather geofeed URLs. This also + provides bulk access instead of fetching by brute-force search + through the IP space. + + + Currently, geolocation providers have bulk WHOIS data access at + all the RIRs. An anonymized version of such data is openly + available for all RIRs except ARIN, which requires an + authorization. 
However, for users without such authorization, + the same result can be achieved with extra RDAP effort. There is + open-source code to pass over such data across all RIRs, collect + all geofeed references, and process them . + + + + To prevent undue load on RPSL and geofeed servers, an entity fetching + geofeed data using these mechanisms MUST NOT do + frequent real-time lookups. suggests use of the HTTP Expires + header to signal when + geofeed data should be refetched. As the data change very + infrequently, in the absence of such an HTTP header signal, collectors + SHOULD NOT fetch more frequently than weekly. It would + be polite not to fetch at magic times such as midnight UTC, the first + of the month, etc., because too many others are likely to do the same. + 
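Two of the consumer-side rules in this section lend themselves to a short illustration: ignoring geofeed lines outside the referring inetnum: object's address range, and pacing refetches. The sketch below is an assumption-laden example, not a reference collector: `rows_within` and `next_fetch` are hypothetical names, the inetnum: range is assumed to be already parsed into first and last addresses, and the Expires value is taken at face value.

```python
import csv
import ipaddress
from datetime import datetime, timedelta
from email.utils import parsedate_to_datetime
from typing import List, Optional

WEEK = timedelta(days=7)  # "SHOULD NOT fetch more frequently than weekly"


def rows_within(text: str, first, last) -> List[List[str]]:
    """Keep only geofeed rows whose prefix lies inside [first, last]."""
    kept = []
    for row in csv.reader(text.splitlines()):
        if not row or row[0].lstrip().startswith("#"):
            continue
        net = ipaddress.ip_network(row[0].strip())
        if (net.version == first.version
                and first <= net.network_address
                and net.broadcast_address <= last):
            kept.append(row)
    return kept


def next_fetch(fetched_at: datetime, expires: Optional[str] = None) -> datetime:
    """Earliest time to refetch, given an optional HTTP Expires value."""
    if expires:
        try:
            return parsedate_to_datetime(expires)
        except (TypeError, ValueError):
            pass  # unparsable header: fall back to the weekly default
    return fetched_at + WEEK
```

A polite collector might also add a random offset to the result to avoid the magic times mentioned above.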
+
+ Privacy Considerations + + + + geofeed data may reveal the + approximate location of an IP address, which might in turn reveal the + approximate location of an individual user. Unfortunately, provides no privacy guidance on + avoiding or ameliorating possible damage due to this exposure of the + user. In publishing pointers to geofeed files as described in this + document, the operator should be aware of this exposure in geofeed + data and be cautious. All the privacy considerations of + apply to this document. + + + Where provided the ability + to publish location data, this document makes bulk access to those data + readily available. This is a goal, not an accident. + +
+
+ Security Considerations + + It is generally prudent for a consumer of geofeed data to also + use other sources to cross validate the data. All the security + considerations of apply here as well. + + + As mentioned in , many RPSL + repositories have weak, if any, authentication. This allows spoofing + of inetnum: objects pointing to malicious geofeed files. suggests an unfortunately complex + method for stronger authentication based on the RPKI. + + + + + For example, if an inetnum: for a wide address range (e.g., a + /16) points to an RPKI-signed geofeed file, a customer or + attacker could publish an unsigned equal or narrower (e.g., a + /24) inetnum: in a WHOIS registry that has weak authorization, + abusing the rule that the most-specific inetnum: object with a + geofeed reference MUST be used. + + + If signatures were mandatory, the above attack would be stymied, but + of course that is not happening anytime soon. + + + The RPSL providers have had to throttle fetching from their + servers due to too-frequent queries. Usually, they throttle by + the querying IP address or block. Similar defenses will likely + need to be deployed by geofeed file servers. + +
+
+ IANA Considerations + + IANA has registered object identifiers for one content + type in the "SMI Security for S/MIME CMS Content Type + (1.2.840.113549.1.9.16.1)" registry as follows: + + + + + + + + + + + + + + + + + +
Decimal  Description               References
47       id-ct-geofeedCSVwithCRLF  RFC 9092
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Representation Of IP Routing Policies In The RIPE Database + + RIPE NCC + + + + + + + + Representation Of IP Routing Policies In A Routing Registry + + RIPE NCC + + + + + + + + + RIPE Database Documentation + + RIPE NCC + + + + + + + + Description of the INETNUM Object + + RIPE NCC + + + + + + + + Description of the INET6NUM Object + + RIPE NCC + + + + + + + + geofeed-finder + + + + + +commit 5f557a4 + + + +
+ Example + + This appendix provides an example that includes a trust anchor, a CA + certificate subordinate to the trust anchor, an end-entity + certificate subordinate to the CA for signing the geofeed, and a + detached signature. + + + + The trust anchor is represented by a self-signed certificate. As + usual in the RPKI, the trust anchor has authority over all IPv4 + address blocks, all IPv6 address blocks, and all Autonomous System (AS) numbers. + + + + + The CA certificate is issued by the trust anchor. This + certificate grants authority over one IPv4 address block + (192.0.2.0/24) and two AS numbers (64496 and 64497). + + + The end-entity certificate is issued by the CA. This + certificate grants signature authority for one IPv4 address block + (192.0.2.0/24). Signature authority for AS numbers is not needed for + geofeed data signatures, so no AS numbers are included in the + certificate. + + + The end-entity certificate is displayed below in detail. For + brevity, the other two certificates are not. + + + +To allow reproduction of the signature results, the end-entity +private key is provided. For brevity, the other two private +keys are not. + + +Signing of "192.0.2.0/24,US,WA,Seattle," (terminated by CR and LF) yields the +following detached CMS signature. + +
+ +
+ Acknowledgments + + Thanks to for CMS and detached + signature clue, for the first + and substantial external review, and + who was too shy to agree to coauthorship. Additionally, we express + our gratitude to early implementors, including ; ; ; , who + provided running code; and . Also, + thanks to the following geolocation providers who are consuming geofeeds with this + described solution: (ipdata.co), + (ipinfo.io), and (bigdatacloud.com). For an amazing number + of helpful reviews, we thank , + , , (INTDIR), + , + (SECDIR), , , (GENART), , and . The + authors also thank , the + awesome document shepherd. + +
+ +
+