URI Resolution¶
rnginline uses URI resolution to map relative paths like common/name.rng to absolute URIs, such as file:/a/b/proj/common/name.rng, which can be handled by a URL Handler.
Resolution 101¶
A quick crash course in URI resolution. Check out RFC 3986 if you want all the details.
URI Types¶
There are two types of URI we need to know about: absolute URIs and relative URIs. In simple terms, absolute URIs have a scheme, which is the part of a URI before the first colon. For example, in http://example.org the scheme is http.
Relative URIs don’t have a scheme. For example, file.txt, //example.org/foo and /tmp/file.txt are all relative URIs.
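The scheme test can be sketched with Python's standard library. This is only an illustration built on urllib.parse.urlsplit, not a description of how rnginline itself classifies URIs; the is_absolute helper is a hypothetical name introduced here.

```python
from urllib.parse import urlsplit


def is_absolute(uri):
    """Return True if the URI has a scheme (hypothetical helper)."""
    return urlsplit(uri).scheme != ""


# The four example URIs from the text above.
for candidate in ["http://example.org", "file.txt", "//example.org/foo", "/tmp/file.txt"]:
    kind = "absolute" if is_absolute(candidate) else "relative"
    print(f"{candidate}: {kind}")
```

Only the first URI is reported as absolute, because it is the only one with a scheme before the first colon.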
Resolving¶
URI resolution is the process of merging two URIs: a base URI, which is always absolute, and a reference URI, which can be absolute or relative.
There are two aspects of URI resolution we need to know about: resolving schemes and resolving paths.
Schemes¶
If we resolve a relative reference URI against an absolute base URI, the resulting URI has the scheme of the base URI, with other parts overridden by the reference URI:
>>> from rnginline import uri
>>> uri.resolve('file:', 'somefile.txt')
'file:somefile.txt'
Resolving an absolute reference URI results in the base being replaced by the reference:
>>> uri.resolve('file:somefile.txt', 'other:blah')
'other:blah'
Paths¶
The path component of a relative reference URI is resolved against the path component of the base URI:
>>> uri.resolve('file:/some/dir/', 'other/dir/file.txt')
'file:/some/dir/other/dir/file.txt'
If the reference URI’s path is absolute (starts with a /) then it replaces the base URI’s path:
>>> uri.resolve('file:/some/dir/', '/tmp/foo.txt')
'file:/tmp/foo.txt'
If the reference URI is absolute, its path replaces the base URI’s path, regardless of whether the reference URI’s path is absolute:
>>> uri.resolve('file:/some/dir/', 'file:file.txt')
'file:file.txt'
Also, note that trailing slashes are significant for path resolution. Without a trailing slash, the base’s final path segment is replaced when resolving paths:
>>> uri.resolve('file:/some/dir', 'other/dir/file.txt')
'file:/some/other/dir/file.txt'
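The scheme and path rules above follow the reference resolution algorithm in RFC 3986, section 5.2. The following is a minimal sketch of that algorithm, written independently of rnginline's own implementation. All function names here (split_uri, remove_dot_segments, recompose, resolve_sketch) are hypothetical, and the sketch does no validation of its inputs.

```python
import re

# URI-splitting regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(
    r"^(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(?:\?([^#]*))?(?:#(.*))?$"
)


def split_uri(uri):
    """Split into (scheme, authority, path, query, fragment); absent parts are None."""
    return URI_RE.match(uri).groups()


def remove_dot_segments(path):
    """Collapse "." and ".." path segments (RFC 3986, section 5.2.4)."""
    output = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]
        elif path == "/.":
            path = "/"
        elif path.startswith("/../") or path == "/..":
            # Step up one level: drop the last segment already emitted.
            path = "/" + path[4:]
            if output:
                output.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment, with its leading "/", to the output.
            end = path.find("/", 1)
            end = len(path) if end == -1 else end
            output.append(path[:end])
            path = path[end:]
    return "".join(output)


def recompose(scheme, authority, path, query, fragment):
    """Join the components back into a URI string (RFC 3986, section 5.3)."""
    result = ""
    if scheme is not None:
        result += scheme + ":"
    if authority is not None:
        result += "//" + authority
    result += path
    if query is not None:
        result += "?" + query
    if fragment is not None:
        result += "#" + fragment
    return result


def resolve_sketch(base, reference):
    """Resolve reference against an absolute base (RFC 3986, section 5.2.2)."""
    b_scheme, b_auth, b_path, b_query, _ = split_uri(base)
    scheme, auth, path, query, fragment = split_uri(reference)
    if scheme is None:
        # Relative reference: inherit the scheme from the base.
        scheme = b_scheme
        if auth is None:
            auth = b_auth
            if path == "":
                # Empty path: keep the base path, and its query unless overridden.
                query = b_query if query is None else query
                return recompose(scheme, auth, b_path, query, fragment)
            if not path.startswith("/"):
                if b_auth is not None and b_path == "":
                    path = "/" + path
                else:
                    # Replace everything after the base path's last "/".
                    path = b_path[: b_path.rfind("/") + 1] + path
    return recompose(scheme, auth, remove_dot_segments(path), query, fragment)


print(resolve_sketch("file:/some/dir/", "other/dir/file.txt"))
print(resolve_sketch("file:/some/dir", "other/dir/file.txt"))
```

The slice up to the last "/" of the base path is what makes trailing slashes significant: with file:/some/dir/ nothing is cut off, while with file:/some/dir the final segment dir is dropped before the reference path is appended.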
URI Resolution in rnginline¶
When the Inliner sees a URI like common/name.rng, it needs to resolve it to an absolute URI, such as file:/a/b/proj/common/name.rng, in order to determine the location it points to, and which URL Handler can fetch the URL.
To do this, rnginline uses a hierarchy of URIs:
1. The default base URI: the catch-all, top-most URI. This has to be absolute. By default it’s file:<current-dir> but can be set to anything.
2. The current document’s base URI: the location the current document was loaded from.
3. Any xml:base attributes on, or on ancestors of, an <inline>/<externalRef> XML element.
4. The URI value of the href attribute of the <inline>/<externalRef> element.
The absolute URI of an href attribute is found by resolving 2 against 1, then 3 against the result of that, then 4 against the result of that.
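The chain of resolutions can be sketched as a left fold over the four URIs. This sketch uses the standard library's urllib.parse.urljoin as a stand-in for rnginline's resolver, and http: URLs because urljoin only resolves relative references against schemes it knows about. Every URI value below is made up for illustration.

```python
from functools import reduce
from urllib.parse import urljoin

# All four values are hypothetical examples.
default_base = "http://example.org/proj/"  # 1. default base URI (absolute)
document_uri = "schemas/main.rng"          # 2. where the document was loaded from
xml_base = "common/"                       # 3. an xml:base attribute value
href = "name.rng"                          # 4. the href attribute value

# Resolve 2 against 1, then 3 against that result, then 4 against that result.
absolute_href = reduce(urljoin, [default_base, document_uri, xml_base, href])
print(absolute_href)
```

Each step only needs the absolute result of the previous step as its base, which is why an absolute URI at any level of the hierarchy overrides everything above it.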
Because the default base URI is file:<current-dir>, relative paths/URIs passed to inline() get resolved to file: URLs, which are handled by the filesystem handler without having to specify the file: scheme on the input to inline().
Similarly, following what we’ve learnt above, if an input’s base URI is pydata://my.pkg/some/dir/a.rng and an href attribute in a.rng contains the value b.rng, it will be resolved to the absolute URL pydata://my.pkg/some/dir/b.rng, and therefore handled by the pydata handler, not the filesystem handler.
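Why the resolved scheme matters can be illustrated with a small dispatch table. This is a sketch under the assumption that a handler is chosen from the scheme of the absolute URL. The HANDLERS table, its labels and the pick_handler function are hypothetical and are not rnginline's API.

```python
from urllib.parse import urlsplit

# Hypothetical scheme-to-handler table; the labels are illustrative only.
HANDLERS = {
    "file": "filesystem handler",
    "pydata": "pydata handler",
}


def pick_handler(absolute_url):
    """Return the label of the handler registered for the URL's scheme."""
    scheme = urlsplit(absolute_url).scheme
    try:
        return HANDLERS[scheme]
    except KeyError:
        raise ValueError(f"no handler for scheme {scheme!r}") from None


print(pick_handler("pydata://my.pkg/some/dir/b.rng"))
print(pick_handler("file:/a/b/proj/common/name.rng"))
```

Because b.rng inherits the pydata scheme from its base URI during resolution, it reaches the pydata entry rather than the file entry.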