A Minor Update to Finding and Using Geofeed Data
IIJ & Arrcus
5147 Crystal Springs
Bainbridge Island
Washington
98110
United States of America
randy@psg.com
NTT
Siriusdreef 70-72
Hoofddorp
2132 WT
Netherlands
massimo@ntt.net
Google
1600 Amphitheatre Parkway
Mountain View
CA
94043
United States of America
warren@kumari.net
Vigil Security, LLC
516 Dranesville Road
Herndon
VA
20170
United States of America
housley@vigilsec.com
geolocation
geo-location
RPSL
This document specifies how to augment the Routing Policy
Specification Language inetnum: class to refer specifically to
geofeed data files and describes an optional scheme that uses
the Resource Public Key Infrastructure (RPKI) to authenticate the
geofeed data files.
Introduction
Providers of Internet content and other services may wish to
customize those services based on the geographic location of the
user of the service. This is often done using the source IP
address used to contact the service. Also, infrastructure and
other services might wish to publish the locale of their
services. defines
geofeed, a syntax to associate geographic locales with IP
addresses, but it does not specify how to find the relevant
geofeed data given an IP address.
This document specifies how to augment the Routing Policy
Specification Language (RPSL) inetnum: class to refer specifically to
geofeed data files and how to prudently use them. In all places
inetnum: is used, inet6num: should also be assumed.
The reader may find
and informative, and
certainly more verbose, descriptions of the inetnum: database
classes.
An optional utterly awesome but slightly complex means for
authenticating geofeed data is also defined.
This document obsoletes . Changes from
include the following:
- It is no longer assumed that a geofeed file is a CSV, comma
  separated value list.
- RIPE has implemented the geofeed: attribute.
- Allow, but discourage, an inetnum: to have both a geofeed
  remarks: attribute and a geofeed: attribute.
- Stress that authenticating geofeed data is optional.
- IP Address Delegation extensions must not use "inherit".
Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED",
"MAY", and "OPTIONAL" in this document are to be interpreted as
described in BCP 14 when, and only when, they appear in all
capitals, as shown here.
Geofeed Files
Geofeed files are described in . They provide a facility for an IP address
resource "owner" to associate those IP addresses to geographic
locales.
Content providers and other parties who wish to locate an IP
address to a geographic locale need to find the relevant geofeed
data. In , this
document specifies how to find the relevant geofeed file given an IP address.
Geofeed data for large providers with significant horizontal
scale and high granularity can be quite large. The size of a
file can be even larger if an unsigned geofeed file combines
data for many prefixes, if dual IPv4/IPv6 spaces are
represented, etc.
Geofeed data do have privacy considerations (see ); this process makes bulk
access to those data easier.
This document also suggests an optional signature to strongly
authenticate the data in the geofeed files.
inetnum: Class
The original RPSL specifications starting with , , and a trail of subsequent documents were
written by the RIPE community. The IETF standardized RPSL in
and . Since then, it has been
modified and extensively enhanced in the Regional Internet
Registry (RIR) community, mostly by RIPE . Currently, change control effectively lies
in the operator community.
The RPSL, and and
used by the Regional
Internet Registries (RIRs), specify the inetnum: database class.
Each of these objects describes an IP address range and its
attributes. The inetnum: objects form a hierarchy ordered on
the address space.
Ideally, RPSL would be augmented to define a new RPSL geofeed:
attribute in the inetnum: class. Currently, this has been
implemented only in the RIPE Database. Until every registry
supports that attribute, this document defines the syntax of a
Geofeed remarks: attribute,
which contains an HTTPS URL of a geofeed file. The format of
the inetnum: geofeed remarks: attribute MUST be
as in this example, "remarks: Geofeed ", where the token
"Geofeed " MUST be case sensitive, followed by a
URL that will vary, but it MUST refer only to a
single geofeed file.
While we leave global agreement of RPSL modification to the
relevant parties, we specify that a proper geofeed: attribute in
the inetnum: class MUST be "geofeed:" and
MUST be followed by a single URL that will vary,
but it MUST refer only to a single geofeed file.
Registries MAY, for the interim, provide a mix of
the remarks: attribute form and the geofeed: attribute form.
The URL uses HTTPS, so the WebPKI provides authentication,
integrity, and confidentiality for the fetched geofeed file.
However, the WebPKI cannot provide authentication of IP address
space assignment. In contrast, the RPKI (see ) can be used to authenticate
IP space assignment; see optional authentication in .
Until all producers of inetnum: objects, i.e., the RIRs, state
that they have migrated to supporting a geofeed: attribute,
consumers looking at inetnum: objects to find geofeed URLs
MUST be able to consume both the remarks: and
geofeed: forms.
The migration not only implies that the RIRs support the
geofeed: attribute, but that all registrants have migrated any
inetnum: objects from remarks: to geofeed: attributes.
Any particular inetnum: object SHOULD have, at
most, one geofeed reference, whether a remarks: or a proper
geofeed: attribute when it is implemented. If there is more
than one, the geofeed: attribute SHOULD be used.
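The two reference forms and the preference between them can be sketched as follows. This is a minimal illustration, not a registry parser: the function name, the regular expressions, and the example.com URLs are assumptions made for the example, and real inetnum: objects may need more careful attribute handling.

```python
import re
from typing import Optional

# Assumed patterns for the two forms described above:
#   "remarks: Geofeed <url>"  (the token "Geofeed" is case sensitive)
#   "geofeed: <url>"
REMARKS_RE = re.compile(r"^remarks:\s+Geofeed\s+(https://\S+)\s*$")
GEOFEED_RE = re.compile(r"^geofeed:\s+(https://\S+)\s*$")


def extract_geofeed_url(inetnum_object: str) -> Optional[str]:
    """Return the geofeed URL referenced by one inetnum: object, if any.

    When both forms are present, the geofeed: attribute is preferred.
    """
    from_attribute = None
    from_remarks = None
    for line in inetnum_object.splitlines():
        match = GEOFEED_RE.match(line)
        if match and from_attribute is None:
            from_attribute = match.group(1)
            continue
        match = REMARKS_RE.match(line)
        if match and from_remarks is None:
            from_remarks = match.group(1)
    return from_attribute or from_remarks
```

A consumer would run this over each inetnum: object it has collected and then fetch the returned URL over HTTPS.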
For inetnum:s covering the same address range, or an inetnum:
with both remarks: and geofeed: attributes, a signed geofeed
file SHOULD be preferred over an unsigned file.
If a geofeed file describes multiple disjoint ranges of IP
address space, there are likely to be geofeed references from
multiple inetnum: objects. Files with geofeed references from
multiple inetnum: objects are not compatible with the signing
procedure in .
An unsigned, and only an unsigned, geofeed file MAY be
referenced by multiple inetnum:s and MAY contain prefixes from
more than one registry.
When geofeed references are provided by multiple inetnum:
objects that have identical address ranges, then the geofeed
reference on the inetnum: with the most recent last-modified:
attribute SHOULD be preferred.
As inetnum: objects form a hierarchy, geofeed references
SHOULD be at the lowest applicable inetnum:
object covering the relevant address ranges in the referenced
geofeed file. When fetching, the most specific inetnum: object
with a geofeed reference MUST be used.
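As an illustration of the most-specific rule, the sketch below picks, for a given address, the smallest covering inetnum: range that carries a geofeed reference. The Inetnum tuple, the function names, and the URLs in any usage are hypothetical; the "first - last" range syntax is assumed to match what the registry returns.

```python
import ipaddress
from typing import List, NamedTuple, Optional


class Inetnum(NamedTuple):
    first: object               # first address of the range
    last: object                # last address of the range
    geofeed_url: Optional[str]  # None when the object has no geofeed reference


def parse_range(text: str) -> tuple:
    """Parse an RPSL-style 'first - last' range into two address objects."""
    first, last = (ipaddress.ip_address(p.strip()) for p in text.split("-"))
    return first, last


def most_specific_geofeed(address: str, objects: List[Inetnum]) -> Optional[str]:
    """Return the geofeed URL of the smallest covering inetnum that has one."""
    ip = ipaddress.ip_address(address)
    candidates = [
        o for o in objects
        if o.geofeed_url is not None
        and o.first.version == ip.version   # never compare IPv4 with IPv6
        and o.first <= ip <= o.last
    ]
    if not candidates:
        return None
    # The range spanning the fewest addresses is the most specific one.
    best = min(candidates, key=lambda o: int(o.last) - int(o.first))
    return best.geofeed_url
```

Objects without a geofeed reference are skipped rather than treated as a stop condition, so a less specific object higher in the hierarchy can still supply the reference.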
It is significant that geofeed data may have finer granularity
than the inetnum: that refers to them. For example, an INETNUM
object for an address range P could refer to a geofeed file in
which P has been subdivided into one or more longer prefixes.
Currently, the registry data published by ARIN are not the same
RPSL as that of the other registries (see for a survey of the WHOIS Tower of Babel);
therefore, when fetching from ARIN via FTP , WHOIS , the Registration Data
Access Protocol (RDAP) , etc., the "NetRange" attribute/key
MUST be treated as "inetnum", and the "Comment"
attribute MUST be treated as "remarks".
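The ARIN key mapping can be applied as a small normalization pass before the usual inetnum: processing. The sketch below assumes the data has already been reduced to "Key: value" text lines; the function name and that input shape are assumptions, not part of any ARIN interface.

```python
from typing import List

# Mapping stated in the text: "NetRange" is treated as "inetnum" and
# "Comment" as "remarks". All other keys pass through unchanged.
ARIN_KEY_MAP = {"NetRange": "inetnum", "Comment": "remarks"}


def normalize_arin_record(lines: List[str]) -> List[str]:
    """Rewrite ARIN-style keys into their RPSL-style equivalents."""
    out = []
    for line in lines:
        key, sep, value = line.partition(":")
        if sep and key.strip() in ARIN_KEY_MAP:
            out.append(f"{ARIN_KEY_MAP[key.strip()]}:{value}")
        else:
            out.append(line)
    return out
```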
Authenticating Geofeed Data (Optional)
The question arises whether a particular geofeed data set is valid, i.e., is
authorized by the "owner" of the IP address space and is
authoritative in some sense. The inetnum: that points to the
geofeed file provides
some assurance. Unfortunately, the RPSL in many repositories is
weakly authenticated at best. An approach where RPSL was signed
per would be good,
except it would have to be deployed by all RPSL registries, and
there is a fair number of them.
A single optional authenticator MAY be appended
to a geofeed file. It
is a digest of the main body of the file signed by the private
key of the relevant RPKI certificate for a covering address
range. One needs a format that bundles the relevant RPKI
certificate with the signature of the geofeed text.
The canonicalization procedure converts the data from their
internal character representation to the UTF-8 character encoding, and the
<CRLF> sequence MUST be used to denote the
end of a line of text. A blank line is represented solely by
the <CRLF> sequence. For robustness, any non-printable
characters MUST NOT be changed by
canonicalization. Trailing blank lines MUST NOT
appear at the end of the file. That is, the file must not end
with multiple consecutive <CRLF> sequences. Any
end-of-file marker used by an operating system is not considered
to be part of the file content. When present, such end-of-file
markers MUST NOT be processed by the digital
signature algorithm.
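A minimal sketch of the canonicalization just described is shown below. It assumes the input is already a decoded text string with the authenticator comment lines removed, and that both LF and CRLF may appear as line terminators on input; the function name is hypothetical.

```python
def canonicalize(text: str) -> bytes:
    """Return the canonical bytes of a geofeed file body for digesting."""
    # Accept LF or CRLF on input (assumption); emit CRLF only.
    lines = text.replace("\r\n", "\n").split("\n")
    # Drop trailing blank lines so the result ends with exactly one CRLF.
    while lines and lines[-1] == "":
        lines.pop()
    if not lines:
        return b""
    # Interior blank lines become a bare CRLF; no other character is altered.
    return ("\r\n".join(lines) + "\r\n").encode("utf-8")
```

Because the signer and the verifier must digest identical bytes, both sides would apply the same routine before computing the message digest.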
Should the authenticator be syntactically incorrect per the
above, the authenticator is invalid.
Borrowing detached signatures from , after file canonicalization, the
Cryptographic Message Syntax (CMS) is used to create a detached
DER-encoded signature that is then padded BASE64 encoded (as per
) and line wrapped to 72 or fewer characters.
The same digest algorithm MUST be used for
calculating the message digest on content being signed, which is
the geofeed file, and for calculating the message digest on the
SignerInfo SignedAttributes . The message digest algorithm identifier
MUST appear in both the SignedData
DigestAlgorithmIdentifiers and the SignerInfo
DigestAlgorithmIdentifier .
The address range of the signing certificate MUST
cover all prefixes in the geofeed file it signs.
An address range A "covers" address range B if the range of B is
identical to or a subset of A. "Address range" is used here
because inetnum: objects and RPKI certificates need not align on
Classless Inter-Domain Routing (CIDR)
prefix boundaries, while those of the lines in a geofeed file
do.
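The "covers" relation can be written directly on (first, last) address pairs, which handles ranges that do not fall on CIDR boundaries. The helper names below are illustrative only.

```python
import ipaddress


def address_range(first: str, last: str) -> tuple:
    """Build a (first, last) pair from two address strings."""
    return ipaddress.ip_address(first), ipaddress.ip_address(last)


def prefix_range(prefix: str) -> tuple:
    """Convert a CIDR prefix into its (first, last) address pair."""
    net = ipaddress.ip_network(prefix, strict=True)
    return net.network_address, net.broadcast_address


def covers(range_a: tuple, range_b: tuple) -> bool:
    """True if range B is identical to or a subset of range A."""
    a_first, a_last = range_a
    b_first, b_last = range_b
    if a_first.version != b_first.version:
        return False  # an IPv4 range never covers an IPv6 range, or vice versa
    return a_first <= b_first and b_last <= a_last
```

For example, a certificate range of 192.0.2.0 through 192.0.2.191 covers 192.0.2.128/26 but not 192.0.2.192/26.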
As the signer specifies the covered RPKI resources relevant to
the signature, the RPKI certificate covering the inetnum:
object's address range is included in the CMS SignedData certificates field.
Identifying the private key associated with the certificate and
getting the department that controls the private key (which
might be trapped in a Hardware Security Module (HSM)) to sign
the CMS blob is left as an exercise for the implementor. On the
other hand, verifying the signature requires no complexity; the
certificate, which can be validated in the public RPKI, has the
needed public key.
The trust anchors for the RIRs are expected to already be
available to the party performing signature validation.
Validation of the CMS signature on the geofeed file
involves:
- Obtaining the signer's certificate from the CMS SignedData
  CertificateSet . The
  certificate SubjectKeyIdentifier extension MUST match
  the SubjectKeyIdentifier in the CMS SignerInfo
  SignerIdentifier .
  If the key identifiers do not match, then validation
  MUST fail.

  Validation of the signer's certificate MUST
  ensure that it is part of the current manifest and that the resources are covered
  by the RPKI certificate.
- Constructing the certification path for the signer's
  certificate. All of the needed certificates are expected to
  be readily available in the RPKI repository. The
  certification path MUST be valid according to
  the validation algorithm in and the additional checks specified in
  associated with the
  IP Address Delegation certificate extension and the Autonomous
  System Identifier Delegation certificate extension. If
  certification path validation is unsuccessful, then validation
  MUST fail.
- Validating the CMS SignedData as specified in using the public key from
  the validated signer's certificate. If the signature
  validation is unsuccessful, then validation
  MUST fail.
Verifying that the IP Address Delegation certificate extension
covers all of the
address ranges of the geofeed file. If all of the address
ranges are not covered, then validation MUST
fail.
All of these steps MUST be successful to consider
the geofeed file signature as valid.
As an IP Address Delegation extension using "inherit" would
complicate processing, it MUST NOT be used. This
is consistent with other RPKI signed objects.
The authenticator MUST be hidden as a series of "#" comments at the
end of the geofeed file. The following is a cryptographically
incorrect, albeit simple, example. A correct and full example is
in .
The signature does not cover the signature lines.
The bracketing "# RPKI Signature:" and "# End Signature:"
MUST be present following the model as shown.
Their IP address range MUST match that of the
inetnum: URL followed to the file.
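The bracketing and encoding rules can be illustrated with a pair of helpers that wrap a DER-encoded signature as comment lines and split a signed file back into its parts. This is a sketch under stated assumptions: the function names are hypothetical, the "# " prefix is assumed to count toward the 72-character limit, and no cryptographic checking is performed here.

```python
import base64
from typing import List, Tuple

BEGIN = "# RPKI Signature: "
END = "# End Signature: "


def format_authenticator(der_signature: bytes, address_range: str) -> str:
    """Wrap a DER-encoded CMS signature as '#' comment lines."""
    b64 = base64.b64encode(der_signature).decode("ascii")  # padded Base64
    # 70 characters plus the "# " prefix keeps every line at 72 or fewer.
    chunks = [b64[i:i + 70] for i in range(0, len(b64), 70)]
    lines = [BEGIN + address_range]
    lines += ["# " + chunk for chunk in chunks]
    lines.append(END + address_range)
    return "\r\n".join(lines) + "\r\n"


def split_signed_geofeed(text: str) -> Tuple[List[str], str, bytes]:
    """Return (body_lines, address_range, der_signature) or raise ValueError."""
    lines = text.replace("\r\n", "\n").split("\n")
    while lines and lines[-1] == "":
        lines.pop()
    starts = [i for i, line in enumerate(lines) if line.startswith(BEGIN)]
    if len(starts) != 1 or not lines[-1].startswith(END):
        raise ValueError("authenticator missing or malformed")
    start = starts[0]
    begin_range = lines[start][len(BEGIN):].strip()
    end_range = lines[-1][len(END):].strip()
    if begin_range != end_range:
        raise ValueError("bracketing address ranges differ")
    encoded = lines[start + 1:-1]
    if not all(line.startswith("# ") for line in encoded):
        raise ValueError("signature lines must be '#' comments")
    # binascii.Error raised on bad input is a subclass of ValueError.
    der = base64.b64decode("".join(line[2:] for line in encoded), validate=True)
    # The signature covers only the lines before the authenticator.
    return lines[:start], begin_range, der
```

The returned body lines are what would be canonicalized and digested; the returned address range is what would be compared against the inetnum: object whose URL led to the file.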
describes
and provides code for a CMS profile for
a general purpose listing of checksums (a "checklist") for use with
the Resource Public Key Infrastructure (RPKI). It provides usable,
albeit complex, code to sign geofeed files.
describes
a CMS profile for a general purpose Resource Tagged Attestation (RTA)
based on the RPKI. While this is expected to become applicable in the
long run, for the purposes of this document, a self-signed root trust
anchor is used.
Operational Considerations
To create the needed inetnum: objects, an operator wishing to register
the location of their geofeed file needs to coordinate with their
Regional Internet Registry (RIR) or National Internet Registry (NIR)
and/or any provider Local Internet Registry (LIR) that has assigned
address ranges to them. RIRs/NIRs provide means for assignees to
create and maintain inetnum: objects. They also provide means of
assigning or sub-assigning IP address resources and allowing the
assignee to create WHOIS data, including inetnum: objects, thereby
referring to geofeed files.
The geofeed files MUST be published via and fetched using
HTTPS .
When using data from a geofeed file, one MUST ignore data
outside the referring inetnum: object's inetnum: attribute
address range.
If and only if the geofeed file is not signed per , then multiple inetnum: objects MAY
refer to the same geofeed file, and the consumer MUST
use only lines in the geofeed file where the prefix is covered by the
address range of the inetnum: object's URL it has followed.
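A consumer can enforce this restriction by filtering the fetched file against the referring inetnum: range before using any entry. The sketch below assumes comma-separated lines with the prefix in the first field, as in the appendix example, and silently ignores lines whose prefix cannot be parsed; both choices, and the function name, are assumptions.

```python
import ipaddress
from typing import List


def usable_lines(geofeed_text: str, first: str, last: str) -> List[str]:
    """Keep only geofeed entries whose prefix lies inside [first, last]."""
    lo, hi = ipaddress.ip_address(first), ipaddress.ip_address(last)
    kept = []
    for line in geofeed_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments (including a signature)
        prefix_text = stripped.split(",", 1)[0].strip()
        try:
            net = ipaddress.ip_network(prefix_text, strict=True)
        except ValueError:
            continue  # unparseable prefix: ignore the line (assumption)
        if net.version != lo.version:
            continue
        if lo <= net.network_address and net.broadcast_address <= hi:
            kept.append(stripped)
    return kept
```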
If the geofeed file is signed, and the signer's certificate
changes, the signature in the geofeed file MUST be updated.
It is good key hygiene to use a given key for only one purpose.
To dedicate a signing private key for signing a geofeed file, an
RPKI Certification Authority (CA) may issue a subordinate certificate exclusively for
the purpose shown in .
To minimize the load on RIR WHOIS services, the RIR's FTP services SHOULD be
used for large-scale access to gather geofeed URLs. This also
provides bulk access instead of fetching by brute-force search
through the IP space.
Currently, geolocation providers have bulk WHOIS data access at
all the RIRs. An anonymized version of such data is openly
available for all RIRs except ARIN, which requires an
authorization. However, for users without such authorization,
the same result can be achieved with extra RDAP effort. There is
open-source code to pass over such data across all RIRs, collect
all geofeed references, and process them .
To prevent undue load on RPSL and geofeed servers, entities fetching
geofeed data using these mechanisms MUST NOT do
frequent real-time lookups. suggests use of the HTTP Expires
header to signal when
geofeed data should be refetched. As the data change very
infrequently, in the absence of such an HTTP header signal, collectors
SHOULD NOT fetch more frequently than weekly. It would
be polite not to fetch at magic times such as midnight UTC, the first
of the month, etc., because too many others are likely to do the same.
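One way a collector might schedule its next fetch is sketched below: honor a usable Expires header, otherwise wait at least a week, and add random jitter so that fetches do not cluster at "magic" times. The six-hour jitter window, the fallback to the weekly interval when the header is unparseable or already in the past, and the function name are all assumptions of this example.

```python
import random
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Optional

MIN_INTERVAL = timedelta(days=7)   # no Expires signal: at most weekly
MAX_JITTER_SECONDS = 6 * 60 * 60   # assumed jitter window of six hours


def next_fetch_time(now: datetime, expires_header: Optional[str]) -> datetime:
    """Return when a geofeed file should next be fetched.

    `now` must be a timezone-aware datetime.
    """
    target = now + MIN_INTERVAL
    if expires_header:
        try:
            expires = parsedate_to_datetime(expires_header)
            if expires.tzinfo is None:
                expires = expires.replace(tzinfo=timezone.utc)
            if expires > now:
                target = expires
        except (TypeError, ValueError):
            pass  # unparseable header: keep the weekly fallback
    # Random offset avoids everyone fetching at midnight UTC and similar times.
    return target + timedelta(seconds=random.uniform(0, MAX_JITTER_SECONDS))
```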
Privacy Considerations
geofeed data may reveal the
approximate location of an IP address, which might in turn reveal the
approximate location of an individual user. Unfortunately, provides no privacy guidance on
avoiding or ameliorating possible damage due to this exposure of the
user. In publishing pointers to geofeed files as described in this
document, the operator should be aware of this exposure in geofeed
data and be cautious. All the privacy considerations of
apply to this document.
Where provided the ability
to publish location data, this document makes bulk access to those data
readily available. This is a goal, not an accident.
Security Considerations
It is generally prudent for a consumer of geofeed data to also
use other sources to cross validate the data. All the security
considerations of apply here as well.
As mentioned in , many RPSL
repositories have weak, if any, authentication. This allows spoofing
of inetnum: objects pointing to malicious geofeed files. suggests an unfortunately complex
method for stronger authentication based on the RPKI.
For example, if an inetnum: for a wide address range (e.g., a
/16) points to an RPKI-signed geofeed file, a customer or
attacker could publish an unsigned equal or narrower (e.g., a
/24) inetnum: in a WHOIS registry that has weak authorization,
abusing the rule that the most-specific inetnum: object with a
geofeed reference MUST be used.
If signatures were mandatory, the above attack would be stymied, but
of course that is not happening anytime soon.
The RPSL providers have had to throttle fetching from their
servers due to too-frequent queries. Usually, they throttle by
the querying IP address or block. Similar defenses will likely
need to be deployed by geofeed file servers.
IANA Considerations
The IANA is requested to modify the References of the object
identifier for the content type in the "SMI Security for S/MIME
CMS Content Type (1.2.840.113549.1.9.16.1)" registry to the
following:
| Decimal | Description              | References |
| 47      | id-ct-geofeedCSVwithCRLF | RFC8805    |
Representation Of IP Routing Policies In The RIPE Database, RIPE NCC
Representation Of IP Routing Policies In A Routing Registry, RIPE NCC
RIPE Database Documentation, RIPE NCC
Description of the INETNUM Object, RIPE NCC
Description of the INET6NUM Object, RIPE NCC
geofeed-finder, commit 5f557a4
Example
This appendix provides an example that includes a trust anchor, a CA
certificate subordinate to the trust anchor, an end-entity
certificate subordinate to the CA for signing the geofeed, and a
detached signature.
The trust anchor is represented by a self-signed certificate. As
usual in the RPKI, the trust anchor has authority over all IPv4
address blocks, all IPv6 address blocks, and all Autonomous System (AS) numbers.
The CA certificate is issued by the trust anchor. This
certificate grants authority over one IPv4 address block
(192.0.2.0/24) and two AS numbers (64496 and 64497).
The end-entity certificate is issued by the CA. This
certificate grants signature authority for one IPv4 address block
(192.0.2.0/24). Signature authority for AS numbers is not needed for
geofeed data signatures, so no AS numbers are included in the
certificate.
The end-entity certificate is displayed below in detail. For
brevity, the other two certificates are not.
To allow reproduction of the signature results, the end-entity
private key is provided. For brevity, the other two private
keys are not.
Signing of "192.0.2.0/24,US,WA,Seattle," (terminated by CR and LF) yields the
following detached CMS signature.
Acknowledgments
Thanks to for CMS and detached
signature clue, for the
first and substantial external review, and who was too shy to agree to
coauthorship. Additionally, we express our gratitude to early
implementors, including ;
; ; , who provided
running code; and . Also,
thanks to the following geolocation providers who are consuming
geofeeds with this described solution: (ipdata.co), (ipinfo.io), and
(bigdatacloud.com). For an amazing number of helpful reviews,
we thank , , , (INTDIR),
, (SECDIR), , , (GENART), , , and .