diff --git a/draft-ietf-opsawg-9092-update.xml b/draft-ietf-opsawg-9092-update.xml new file mode 100644 index 0000000..a4edd87 --- /dev/null +++ b/draft-ietf-opsawg-9092-update.xml @@ -0,0 +1,1320 @@ + + + + + + + + + + + + + + + Finding and Using Geofeed Data + + + IIJ & Arrcus +
+ + 5147 Crystal Springs + Bainbridge Island + Washington + 98110 + United States of America + + randy@psg.com +
+
+ + + NTT +
+ + Veemweg 23 + Barneveld + 3771 MT + Netherlands + + massimo@ntt.net +
+
+ + + Google +
+ + 1600 Amphitheatre Parkway + Mountain View + CA + 94043 + United States of America + + warren@kumari.net +
+
+ + + Vigil Security, LLC +
+ + 516 Dranesville Road + Herndon + VA + 20170 + United States of America + + housley@vigilsec.com +
+
+ + + +geolocation +geo-location +RPSL + + + + This document specifies how to augment the Routing Policy + Specification Language inetnum: class to refer specifically to + geofeed data files and describes an optional scheme that uses + the Resource Public Key Infrastructure to authenticate the + geofeed datafiles. + + + +
+ + +
+ Introduction + + Providers of Internet content and other services may wish to + customize those services based on the geographic location of the + user of the service. This is often done using the source IP + address used to contact the service, which may not point to a + user, see , Section 14 in particular. + Also, infrastructure and other services might wish to publish + the locale of their services. defines geofeed, a syntax to associate + geographic locales with IP addresses, but it does not specify + how to find the relevant geofeed data given an IP address. + + + This document specifies how to augment the Routing Policy + Specification Language (RPSL) inetnum: class to refer specifically to + geofeed data files and how to prudently use them. In all places + inetnum: is used, inet6num: should also be assumed . + + + The reader may find + and informative, and + certainly more verbose, descriptions of the inetnum: database + classes. + + + An optional utterly awesome but slightly complex means for + authenticating geofeed data is also defined in . + + + This document obsoletes . Changes from + include the following: +
    +
  • + RIPE has implemented the geofeed: attribute. +
  • + Allow, but discourage, an inetnum: to have both a geofeed + remarks: attribute and a geofeed: attribute. +
  • + Geofeed file only UTF-8 CSV. +
  • + Stress that authenticating geofeed data is optional. +
  • + IP Address Delegation extensions must not use "inherit". +
  • + If geofeed data are present, ignore geographic location + hints in other data. +
+ +
+ +
+ Requirements Language + + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL + NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", + "MAY", and "OPTIONAL" in this document are to be interpreted as + described in BCP 14 when, and only when, they appear in all + capitals, as shown here. + + +
+
+
+ Geofeed Files + + Geofeed files are described in . They provide a facility for an IP address + resource "owner" to associate those IP addresses to geographic + locales. + + + + Per , geofeed files consist of CSVs + (Comma Separated Values) in UTF-8 text format; not HTML, + richtext, or other formats. + + + + Content providers and other parties who wish to locate an IP + address to a geographic locale need to find the relevant geofeed + data. In , this + document specifies how to find the relevant geofeed file given an IP address. + + + Geofeed data for large providers with significant horizontal + scale and high granularity can be quite large. The size of a + file can be even larger if an unsigned geofeed file combines + data for many prefixes, if dual IPv4/IPv6 spaces are + represented, etc. + + + Geofeed data do have privacy considerations (see ); this process makes bulk + access to those data easier. + + + This document also suggests an optional signature to strongly + authenticate the data in the geofeed files. + +
+
+ inetnum: Class + + + The original RPSL specifications starting with , , and a trail of subsequent documents were + written by the RIPE community. The IETF standardized RPSL in + and . Since then, it has been + modified and extensively enhanced in the Regional Internet + Registry (RIR) community, mostly by RIPE . Currently, change control effectively lies + in the operator community. + + + + The RPSL, and and + used by the Regional + Internet Registries (RIRs), specify the inetnum: database class. + Each of these objects describes an IP address range and its + attributes. The inetnum: objects form a hierarchy ordered on + the address space. + + + + Ideally, RPSL would be augmented to define a new RPSL geofeed: + attribute in the inetnum: class. Absent implementation of the + geofeed: attribute in a particular RIR database, this document + defines the syntax of a Geofeed remarks: attribute, which + contains an HTTPS URL of a geofeed file. The format of the + inetnum: geofeed remarks: attribute MUST be as in this example, + "remarks: Geofeed ", where the token "Geofeed " MUST be case + sensitive, followed by a URL that will vary, but it MUST refer + only to a single geofeed file. + + + + + While we leave global agreement of RPSL modification to the + relevant parties, we specify that a proper geofeed: attribute in + the inetnum: class MUST be "geofeed:" and + MUST be followed by a single URL that will vary, + but it MUST refer only to a single geofeed file. + + + + The URL uses HTTPS, so the WebPKI provides authentication, + integrity, and confidentiality for the fetched geofeed file. + However, the WebPKI can not provide authentication of IP address + space assignment. In contrast, the RPKI (see ) can be used to authenticate + IP space assignment; see optional authentication in . 
+ + + + Until all producers of inetnum: objects, i.e., the RIRs, state + that they have migrated to supporting a geofeed: attribute, + consumers looking at inetnum: objects to find geofeed URLs + MUST be able to consume both the remarks: and + geofeed: forms. + + + + The migration not only implies that the RIRs support the + geofeed: attribute, but that all registrants have migrated any + inetnum: objects from remarks: to geofeed: attributes. + + + + Any particular inetnum: object SHOULD have, at + most, one geofeed reference, whether a remarks: or a proper + geofeed: attribute when it is implemented. If there is more + than one, the geofeed: attribute SHOULD be used. + + + For inetnum:s covering the same address range, or an inetnum: + with both remarks: and geofeed: attributes, a signed geofeed + file SHOULD be preferred over an unsigned file. + + + If a geofeed file describes multiple disjoint ranges of IP + address space, there are likely to be geofeed references from + multiple inetnum: objects. Files with geofeed references from + multiple inetnum: objects are not compatible with the signing + procedure in . + + + An unsigned, and only an unsigned, geofeed file MAY be + referenced by multiple inetnum:s and MAY contain prefixes from + more than one registry. + + + When geofeed references are provided by multiple inetnum: + objects that have identical address ranges, then the geofeed + reference on the inetnum: with the most recent last-modified: + attribute SHOULD be preferred. + + + As inetnum: objects form a hierarchy, geofeed references + SHOULD be at the lowest applicable inetnum: + object covering the relevant address ranges in the referenced + geofeed file. When fetching, the most specific inetnum: object + with a geofeed reference MUST be used. + + + It is significant that geofeed data may have finer granularity + than the inetnum: that refers to them. 
For example, an INETNUM + object for an address range P could refer to a geofeed file in + which P has been subdivided into one or more longer prefixes. + + +
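The rules above for locating a geofeed reference inside a single inetnum: object can be sketched as follows. This is a minimal illustration, not part of the specification: the function name `geofeed_url` and the regular expressions are assumptions, and the sketch simply applies the preference for a geofeed: attribute over a "remarks: Geofeed " line.

```python
import re

# Hypothetical patterns; the token "Geofeed " in remarks: is matched
# case-sensitively, and only HTTPS URLs are accepted.
GEOFEED_ATTR = re.compile(r"^geofeed:\s*(https://\S+)\s*$")
GEOFEED_REMARK = re.compile(r"^remarks:\s*Geofeed (https://\S+)\s*$")


def geofeed_url(inetnum_text):
    """Return the geofeed URL referenced by one inetnum: object, or None.

    If both forms are present, the geofeed: attribute is preferred.
    """
    attr_urls, remark_urls = [], []
    for line in inetnum_text.splitlines():
        match = GEOFEED_ATTR.match(line)
        if match:
            attr_urls.append(match.group(1))
            continue
        match = GEOFEED_REMARK.match(line)
        if match:
            remark_urls.append(match.group(1))
    if attr_urls:
        return attr_urls[0]
    if remark_urls:
        return remark_urls[0]
    return None
```

A consumer would call `geofeed_url()` once per inetnum: object harvested from the registry data and keep the object's address range alongside the returned URL.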
+ +
+ Fetching Geofeed Data + + + This document provides a guideline for how interested + parties should fetch and read geofeed files. + + + + Historically, before geofeed files, this was done in varied + ways, at the discretion of the implementer, often without + consistent authentication, where data were mostly imported from + email without formal authorisation or validation. + + + + To minimize the load on RIRs' WHOIS + services, the RIR's FTP services SHOULD + be used for large-scale access to gather geofeed URLs. This + uses efficient bulk access instead of fetching via brute-force + search through the IP space. + + + + When an inetnum: with a geofeed file reference is identified, + the file MUST be downloaded using HTTPS. + + + + When reading data from the geofeed file, one MUST ignore data + outside the referring inetnum: object's address range. This is + to avoid importing data about ranges not under the control of + the operator. If geofeed files are fetched, other location + information from the inetnum: MUST be ignored. + + + + Given an address range of interest, the most specific inetnum: + object with a geofeed reference MUST be used to fetch the + geofeed file. For example, if the fetching party finds + the following inetnum: objects: + + and the file geofeed_1 contains geolocation data about + 192.0.2.0/29, this MUST be discarded because 192.0.2.0/24 is + within the more specific inetnum: covering the address range and + that inetnum: has a geofeed reference. + + + + If an inetnum: object has both remarks: with geofeed data and + also has a geofeed: attribute, the geofeed: attribute SHOULD be + used and the remarks: ignored. + + + + Hints in inetnum:s such as country:, geoloc:, etc. tend to be + administrative, and not deployment specific. Consider large, + possibly global, providers with headquarters very far from most + of their deployments. 
Therefore, if geofeed data are specified, + either as a geofeed: attribute or in a geofeed remarks: + attribute, other geographic hints such as country:, geoloc:, DNS + geoloc RRsets, etc., for that address range MUST be ignored. + + + + There is open-source code to traverse the RPSL data across all + of the RIRs, collect all geofeed references, and process them + . It implements the steps above + and all of the Operational Considerations described in , including caching. It produces a single geofeed + file, merging all the geofeed files found. This open-source + code can be run daily by a cronjob, and the output file can be + directly used. + +
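The two selection rules in this section (use the most specific inetnum: with a geofeed reference, and ignore geofeed lines outside the referring inetnum:'s range) might be sketched as below. This is an illustration under stated assumptions, not the open-source implementation mentioned above: the helper names are hypothetical, only the IPv4 "first - last" range form is handled, and the address ranges and URLs in the usage note are made-up examples.

```python
import ipaddress


def parse_range(text):
    """Parse an inetnum: value of the form 'first - last' (IPv4 form only)."""
    first, last = (ipaddress.ip_address(part.strip()) for part in text.split("-"))
    return first, last


def covers(rng, prefix):
    """True if the address range rng is identical to or a superset of prefix."""
    net = ipaddress.ip_network(prefix)
    first, last = rng
    return (first.version == net.version
            and first <= net.network_address
            and net.broadcast_address <= last)


def most_specific(inetnums, prefix):
    """Pick the smallest covering inetnum: that carries a geofeed reference.

    inetnums is a list of (range_text, geofeed_url_or_None) pairs.
    Returns (range_text, url) or None.
    """
    best = None
    for range_text, url in inetnums:
        if url is None:
            continue
        rng = parse_range(range_text)
        if not covers(rng, prefix):
            continue
        size = int(rng[1]) - int(rng[0])
        if best is None or size < best[0]:
            best = (size, range_text, url)
    return None if best is None else (best[1], best[2])


def usable_lines(geofeed_text, range_text):
    """Keep only geofeed lines whose prefix lies inside the referring range."""
    rng = parse_range(range_text)
    kept = []
    for line in geofeed_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        prefix = line.split(",", 1)[0].strip()
        try:
            if covers(rng, prefix):
                kept.append(line)
        except ValueError:
            continue  # not a valid prefix; skip the line
    return kept
```

For instance, given hypothetical objects for `192.0.0.0 - 192.0.3.255` and `192.0.2.0 - 192.0.2.255` that both carry geofeed references, `most_specific()` selects the second one for `192.0.2.0/29`, so data about that prefix in the first object's file would be discarded.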
+ +
+ Authenticating Geofeed Data (Optional) + + The question arises whether a particular geofeed data set is valid, i.e., is + authorized by the "owner" of the IP address space and is + authoritative in some sense. The inetnum: that points to the + geofeed file provides + some assurance. Unfortunately, the RPSL in some repositories is + weakly authenticated at best. An approach where RPSL was signed + per would be good, + except it would have to be deployed by all RPSL registries, and + there is a fair number of them. + + + A single optional authenticator MAY be appended + to a geofeed file. It + is a digest of the main body of the file signed by the private + key of the relevant RPKI certificate for a covering address + range. One needs a format that bundles the relevant RPKI + certificate with the signature of the geofeed text. + + + The canonicalization procedure converts the data from their + internal character representation to the UTF-8 character encoding, and the + <CRLF> sequence MUST be used to denote the + end of a line of text. A blank line is represented solely by + the <CRLF> sequence. For robustness, any non-printable + characters MUST NOT be changed by + canonicalization. Trailing blank lines MUST NOT + appear at the end of the file. That is, the file must not end + with multiple consecutive <CRLF> sequences. Any + end-of-file marker used by an operating system is not considered + to be part of the file content. When present, such end-of-file + markers MUST NOT be processed by the digital + signature algorithm. + + + Should the authenticator be syntactically incorrect per the + above, the authenticator is invalid. + + + + Borrowing detached signatures from , after file canonicalization, the + Cryptographic Message Syntax (CMS) would be used to create a detached + DER-encoded signature that is then padded BASE64 encoded (as per + ) and line wrapped to 72 or fewer characters. 
+ The same digest algorithm MUST be used for + calculating the message digest on content being signed, which is + the geofeed file, and for calculating the message digest on the + SignerInfo SignedAttributes . The message digest algorithm identifier + MUST appear in both the SignedData + DigestAlgorithmIdentifiers and the SignerInfo + DigestAlgorithmIdentifier . + + + The address range of the signing certificate MUST + cover all prefixes in the geofeed file it signs. + + + An address range A "covers" address range B if the range of B is + identical to or a subset of A. "Address range" is used here + because inetnum: objects and RPKI certificates need not align on + Classless Inter-Domain Routing (CIDR) + prefix boundaries, while those of the lines in a geofeed file + do. + + + As the signer specifies the covered RPKI resources relevant to + the signature, the RPKI certificate covering the inetnum: + object's address range is included in the CMS SignedData certificates field. + + + Identifying the private key associated with the certificate and + getting the department that controls the private key (which + might be trapped in a Hardware Security Module (HSM)) to sign + the CMS blob is left as an exercise for the implementor. On the + other hand, verifying the signature requires no complexity; the + certificate, which can be validated in the public RPKI, has the + needed public key. + + The trust anchors for the RIRs are expected to already be + available to the party performing signature validation. + Validation of the CMS signature on the geofeed file + involves: +
  1. + + Obtaining the signer's certificate from the CMS SignedData + CertificateSet . The + certificate SubjectKeyIdentifier extension MUST match + the SubjectKeyIdentifier in the CMS SignerInfo + SignerIdentifier . + If the key identifiers do not match, then validation + MUST fail. + + Validation of the signer's certificate MUST + ensure that it is part of the current manifest and that the resources are covered + by the RPKI certificate. + +
  2. + Constructing the certification path for the signer's + certificate. All of the needed certificates are expected to + be readily available in the RPKI repository. The + certification path MUST be valid according to + the validation algorithm in and the additional checks specified in + associated with the + IP Address Delegation certificate extension and the Autonomous + System Identifier Delegation certificate extension. If + certification path validation is unsuccessful, then validation + MUST fail. +
  3. + Validating the CMS SignedData as specified in using the public key from + the validated signer's certificate. If the signature + validation is unsuccessful, then validation + MUST fail. +
  4. + Verifying that the IP Address Delegation certificate extension + covers all of the + address ranges of the geofeed file. If all of the address + ranges are not covered, then validation MUST + fail. +
+ + All of these steps MUST be successful to consider + the geofeed file signature as valid. + + + An IP Address Delegation extension using "inherit" would + complicate processing. The implementation would have to build + the certification path from the end-entity to the trust anchor, + then validate the path from the trust anchor to the end-entity, + and then the parameter would have to be remembered when the + validated public key was used to validate a signature on a CMS + object. Having to remember things from certification path + validation for use with CMS object processing is too hard. And, + the certificates do not get that much bigger by repeating the + information. + + + Therefore an extension using "inherit" MUST NOT be used. This + is consistent with other RPKI signed objects. + + + The appendix MUST be hidden as a series of "#" comments at the + end of the geofeed file. The following is a cryptographically + incorrect, albeit simple, example. A correct and full example is + in . + + + + The signature does not cover the signature lines. + + + The bracketing "# RPKI Signature:" and "# End Signature:" + MUST be present following the model as shown. + Their IP address range MUST match that of the + inetnum: URL followed to the file. 
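As a rough sketch of how a consumer might separate the "#"-commented authenticator from the geofeed body before CMS validation, consider the following. The function name `split_authenticator` is hypothetical, the layout (address range following each bracketing token, one BASE64 chunk per comment line) is assumed from the description above, and no cryptographic checks are performed here.

```python
import base64

BEGIN = "# RPKI Signature:"
END = "# End Signature:"


def split_authenticator(text):
    """Split a geofeed file into (body, signed_range, der_bytes).

    Returns (text, None, None) when no authenticator is present.
    Raises ValueError if the bracketing lines are malformed.
    """
    lines = text.splitlines(keepends=True)
    begin = next((i for i, l in enumerate(lines) if l.startswith(BEGIN)), None)
    if begin is None:
        return text, None, None
    end = next((i for i in range(begin + 1, len(lines))
                if lines[i].startswith(END)), None)
    if end is None:
        raise ValueError("missing '# End Signature:' line")
    begin_range = lines[begin][len(BEGIN):].strip()
    end_range = lines[end][len(END):].strip()
    if begin_range != end_range:
        raise ValueError("bracketing lines name different address ranges")
    # Strip the leading "#" and whitespace from each wrapped BASE64 line.
    b64 = "".join(l.lstrip("#").strip() for l in lines[begin + 1:end])
    der = base64.b64decode(b64, validate=True)
    # The signature lines themselves are not part of the signed content.
    body = "".join(lines[:begin])
    return body, begin_range, der
```

The returned range string would then be compared with the address range of the inetnum: whose URL led to the file, and the DER bytes handed to a CMS library for the validation steps listed earlier.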
+ + + describes + and provides code for a CMS profile for + a general purpose listing of checksums (a "checklist") for use with + the Resource Public Key Infrastructure (RPKI). It provides usable, + albeit complex, code to sign geofeed files. + + + describes + a CMS profile for a general purpose Resource Tagged Attestation (RTA) + based on the RPKI. While this is expected to become applicable in the + long run, for the purposes of this document, a self-signed root trust + anchor is used. + +
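The canonicalization procedure described earlier in this section could be sketched as follows. This is a minimal illustration only: the function name `canonicalize` is hypothetical, lone CR or LF handling beyond LF and CRLF line endings is not attempted, and the SHA-256 digest shown in the usage note is an assumed choice rather than something this text mandates.

```python
import hashlib


def canonicalize(text):
    """Return the canonical UTF-8 bytes of a geofeed body.

    Lines end with CRLF, trailing blank lines are dropped, and all other
    characters (including non-printable ones) are left untouched.
    """
    # Normalize CRLF to LF first so that both conventions split the same way.
    lines = text.replace("\r\n", "\n").split("\n")
    while lines and lines[-1] == "":
        lines.pop()
    return "".join(line + "\r\n" for line in lines).encode("utf-8")


def body_digest(text):
    """Hex digest of the canonical body (SHA-256 assumed for illustration)."""
    return hashlib.sha256(canonicalize(text)).hexdigest()
```

With this sketch, `canonicalize("192.0.2.0/24,US,WA,Seattle,\n")` yields the single line terminated by CR and LF, which is the form of content the detached signature is computed over.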
+
+ Operational Considerations + + + To create the needed inetnum: objects, an operator wishing to register + the location of their geofeed file needs to coordinate with their + Regional Internet Registry (RIR) or National Internet Registry (NIR) + and/or any provider Local Internet Registry (LIR) that has assigned + address ranges to them. RIRs/NIRs provide means for assignees to + create and maintain inetnum: objects. They also provide means of + assigning or sub-assigning IP address resources and allowing the + assignee to create WHOIS data, including inetnum: objects, thereby + referring to geofeed files. + + + The geofeed files MUST be published via and fetched using + HTTPS . + + + When using data from a geofeed file, one MUST ignore data + outside the referring inetnum: object's inetnum: attribute + address range. + + + If and only if the geofeed file is not signed per , then multiple inetnum: objects MAY + refer to the same geofeed file, and the consumer MUST + use only lines in the geofeed file where the prefix is covered by the + address range of the inetnum: object's URL it has followed. + + + If the geofeed file is signed, and the signer's certificate + changes, the signature in the geofeed file MUST + be updated. + + + + It is good key hygiene to use a given key for only one purpose. + To dedicate a signing private key for signing a geofeed file, an + RPKI Certification Authority (CA) may issue a subordinate + certificate exclusively for the purpose shown in . + + + + Harvesting and publishing aggregated geofeed data outside of + the RPSL model should be avoided as it can have the effect + that more specifics from one aggregatee could undesirably + affect the less specifics of a different aggregatee. The + validation model in Section handles this + issue within the RPSL model. + + + Currently, geolocation providers have bulk WHOIS data access at + all the RIRs. 
An anonymized version of such data is openly + available for all RIRs except ARIN, which requires an + authorization. However, for users without such authorization, + the same result can be achieved with extra RDAP effort. There is + open-source code to pass over such data across all RIRs, collect + all geofeed references, and process them . + + + + To prevent undue load on RPSL and geofeed servers, + an entity fetching geofeed data using these mechanisms MUST + NOT do frequent real-time lookups. suggests use of the HTTP Expires header to signal when geofeed data + should be refetched. As the data change very infrequently, in + the absence of such an HTTP Header signal, collectors + SHOULD NOT fetch more frequently than weekly. It + would be polite not to fetch at magic times such as midnight + UTC, the first of the month, etc., because too many others are + likely to do the same. + + +
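The refetch guidance above might translate into a scheduling helper along these lines. It is only a sketch: the name `next_fetch`, the six-hour jitter window, and the fallback to a weekly interval when the Expires value cannot be parsed are all assumptions made for illustration.

```python
import random
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

WEEK = timedelta(days=7)


def next_fetch(fetched_at, expires_header=None, rng=random):
    """Return the earliest time at which a geofeed file should be refetched.

    fetched_at must be a timezone-aware datetime. Without a usable Expires
    header, the file is not refetched more often than weekly.
    """
    earliest = fetched_at + WEEK
    if expires_header:
        try:
            expires = parsedate_to_datetime(expires_header)
            if expires.tzinfo is None:
                expires = expires.replace(tzinfo=timezone.utc)
            earliest = max(expires, fetched_at)
        except (TypeError, ValueError):
            pass  # unparsable header: keep the weekly default
    # Random offset (assumed policy) so collectors do not all fetch at
    # "magic" times such as midnight UTC.
    jitter = timedelta(seconds=rng.randrange(1, 6 * 3600))
    return earliest + jitter
```

A collector would store the returned timestamp next to its cached copy of the file and skip the URL until that time has passed.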
+ +
+ Privacy Considerations + + + geofeed data may reveal the + approximate location of an IP address, which might in turn reveal the + approximate location of an individual user. Unfortunately, provides no privacy guidance on + avoiding or ameliorating possible damage due to this exposure of the + user. In publishing pointers to geofeed files as described in this + document, the operator should be aware of this exposure in geofeed + data and be cautious. All the privacy considerations of + apply to this document. + + + Where provided the ability + to publish location data, this document makes bulk access to those data + readily available. This is a goal, not an accident. + +
+ +
+ Implementation Status + + + Currently, the geofeed: attribute in inetnum objects has + been implemented in the RIPE database. + + + + Registrants in databases which do not yet support the geofeed: + attribute are using the remarks:, or equivalent, attribute. + + + + Currently, the registry data published by ARIN are not the same + RPSL as that of the other registries (see for a survey of the WHOIS Tower of Babel); + therefore, when fetching from ARIN via FTP , WHOIS , the Registration Data + Access Protocol (RDAP) , etc., the "NetRange" attribute/key must be + treated as "inetnum", and the "Comment" attribute must be + treated as "remarks". + + +
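The ARIN attribute mapping described above could be applied with a small normalization step such as the one below. This is a hedged sketch: the function name `normalize_arin`, the case-insensitive key matching, and the assumption that records arrive as "Key: value" text lines are illustrative choices, not a description of ARIN's actual export formats.

```python
# Mapping stated in the text: "NetRange" is treated as "inetnum" and
# "Comment" as "remarks". Keys are matched case-insensitively (assumption).
ARIN_TO_RPSL = {"netrange": "inetnum", "comment": "remarks"}


def normalize_arin(record_text):
    """Rewrite ARIN-style keys so RPSL-oriented parsing code can be reused."""
    out = []
    for line in record_text.splitlines():
        key, sep, value = line.partition(":")
        mapped = ARIN_TO_RPSL.get(key.strip().lower())
        out.append(f"{mapped}:{value}" if sep and mapped else line)
    return "\n".join(out)
```

After normalization, the same code that extracts geofeed references from inetnum: objects in the other registries can be run over the ARIN records unchanged.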
+ +
+ Security Considerations + + It is generally prudent for a consumer of geofeed data to also + use other sources to cross-validate the data. All the security + considerations of + apply here as well. + + + The consumer of geofeed data SHOULD fetch and process the data + themselves. Importing datasets produced and/or processed by a + third-party places ill-advised trust in the third-party. + + + As mentioned in , some + RPSL repositories have weak, if any, authentication. This + allows spoofing of inetnum: objects pointing to malicious + geofeed files. suggests + an unfortunately complex method for stronger authentication + based on the RPKI. + + + For example, if an inetnum: for a wide address range (e.g., a + /16) points to an RPKI-signed geofeed file, a customer or + attacker could publish an unsigned equal or narrower (e.g., a + /24) inetnum: in a WHOIS registry that has weak authorization, + abusing the rule that the most-specific inetnum: object with a + geofeed reference MUST be used. + + + If signatures were mandatory, the above attack would be stymied, but + of course that is not happening anytime soon. + + + The RPSL providers have had to throttle fetching from their + servers due to too-frequent queries. Usually, they throttle by + the querying IP address or block. Similar defenses will likely + need to be deployed by geofeed file servers. + +
+
+ IANA Considerations + + There are no new actions needed by the IANA. + +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Representation Of IP Routing Policies In The RIPE Database + + RIPE NCC + + + + + + + + Representation Of IP Routing Policies In A Routing Registry + + RIPE NCC + + + + + + + + + RIPE Database Documentation + + RIPE NCC + + + + + + + + Description of the INETNUM Object + + RIPE NCC + + + + + + + + Description of the INET6NUM Object + + RIPE NCC + + + + + + + + geofeed-finder + + + + + +commit 5f557a4 + + + +
+ Example + + This appendix provides an example that includes a trust anchor, a CA + certificate subordinate to the trust anchor, an end-entity + certificate subordinate to the CA for signing the geofeed, and a + detached signature. + + + + The trust anchor is represented by a self-signed certificate. As + usual in the RPKI, the trust anchor has authority over all IPv4 + address blocks, all IPv6 address blocks, and all Autonomous System (AS) numbers. + + + + + The CA certificate is issued by the trust anchor. This + certificate grants authority over one IPv4 address block + (192.0.2.0/24) and two AS numbers (64496 and 64497). + + + The end-entity certificate is issued by the CA. This + certificate grants signature authority for one IPv4 address block + (192.0.2.0/24). Signature authority for AS numbers is not needed for + geofeed data signatures, so no AS numbers are included in the + certificate. + + + The end-entity certificate is displayed below in detail. For + brevity, the other two certificates are not. + + + +To allow reproduction of the signature results, the end-entity +private key is provided. For brevity, the other two private +keys are not. + + +Signing of "192.0.2.0/24,US,WA,Seattle," (terminated by CR and LF) yields the +following detached CMS signature. + +
+ +
+ Acknowledgments + + Thanks to for CMS and detached + signature clue, for the + first and substantial external review, and who was too shy to agree to + coauthorship. Additionally, we express our gratitude to early + implementors, including ; + ; ; , who also found an + ASN.1 'inherit' issue; and . + Also, thanks to the following geolocation providers who are + consuming geofeeds with this described solution: (ipdata.co), (ipinfo.io), and + (bigdatacloud.com). For an amazing number of helpful reviews, + we thank , , , (INTDIR), + , , (SECDIR), , , , (GENART), , , and . + +
+ +
+