For users of legacy client (tuf/client/) this is purely a security fix
release with no API or functionality changes. For ngclient and Metadata
API, some API changes are included.
All users are advised to upgrade.
Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>
_fileinfo_has_changed() and _update_fileinfo() have been unused internal
methods since 2016. Remove them.
Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>
The original commit 051b8229 handled the loading and saving metadata
cases but the legacy client actually checks for the files existence
in various other places:
* _update_versioninfo() never reads the file but operates differently
depending on whether the file exists or not
* _move_current_to_previous() that copies files around
* MultiRepoUpdater initialization: this only handle root.json so
is still correct
* _update_fileinfo() which is dead code
Fix the first two of these cases.
If an attacker manages to create arbitrary rolenames they could trick
the client into writing metadata files into unexpected locations:
To avoid directory traversal and writing files into unexpected
locations, encode the rolename before using it as filename.
If a client has delegated targets metadata with rolenames that have
percent-encoded characters in them, these metadata will now not be
found in local metadata cache and must be re-downloaded.
Note that this does not mean using rolenames that get encoded is
advisable (as forming the download URLs still has issues with them),
this just means the client will not do unsafe writes when it encounters
rolenames like this.
Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>
If an attacker manages to create arbitrary rolenames they could trick
the client into writing metadata files into unexpected locations:
To avoid directory traversal and writing files into unexpected
locations, encode the rolename before using it as filename.
If a client has delegated targets metadata with rolenames that have
percent-encoded characters in them, these metadata will now not be
found in local metadata cache and must be re-downloaded.
Note that this does not mean using rolenames that get encoded is
advisable (as forming the download URLs still has issues with them),
this just means the client will not do unsafe writes when it encounters
rolenames like this.
Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>
urljoin considers the second URL to override the base URL if the second
one contains e.g. hostname: this could lead to ngclient downloading
from the wrong host entirely. Doing that would not compromise the
security of the system as the metadata would still need to be verified,
but would definitely be unexpected and a bug.
Note that we're still not encoding the rolename, it's just inserted into
the URL as is.
Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>
If you do the following steps:
1. call Updater.refresh() and load, verify and cache all metadata files
2. modify timestamp snapshot meta information:
(One or more of hashes or length for snapshot changes here)
3. call Updater.refresh() again
4. root and timestamp will be updated to their latest versions
5. local snapshot will be loaded, but hashes/length will be different
than the ones in timestamp.snapshot_meta and that will prevent loading
6. remote snapshot is loaded and verification starts
then when executing step 6 the rollback checks will not be done because
the old snapshot was not loaded on step 5.
In order to resolve this issue, we are introducing the idea of trusted and
untrusted snapshot.
Trusted snapshot is the locally available cached version. This version has
been verified at least once meaning hashes and length were already checked
against timestamp.snapshot_meta hashes and length.
That's why we can allow loading a trusted snapshot version even if there is a
mismatch between the current timestamp.snapshot_meta hashes/length and
hashes/length inside the trusted snapshot.
Untrusted snapshot is the one downloaded from the web. It hasn't been verified
before and that's why we mandate that timestamp.snapshot_meta hashes and length
should match the hashes and legth calculated on this untrusted version of
snapshot.
As the TrustedMetadataSet doesn't have information which snapshot is trusted or
not, so possibly the best solution is to add a new argument "trusted"
to update_snapshot.
Even though this is ugly as the rest of the update functions doesn't
have such an argument, it seems the best solution as it seems to work
in all cases:
- when loading a local snapshot, we know the data has at some point been
trusted (signatures have been checked): it doesn't need to match hashes
now
- if there is no local snapshot and we're updating from remote, the
remote data must match meta hashes in timestamp
- if there is a local snapshot and we're updating from remote, the remote
data must match meta hashes in timestamp
Lastly, I want to point out that hash checks for metadata files are not
essential to TUF security guarantees: they are just an additional layer of
security that allows us to avoid even parsing json that could be malicious -
we already know the malicious metadata would be stopped at metadata
verification after the parsing.
Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
The purpose of this config was to ensure blocking
inside the download loop and releasing CPU resources.
To our best knowledge the network stack currently used
in RequestsFetcher will always block inside the loop
and the issue cannot be reproduced.
'chunk_size' and 'socket_timeout' are currently the
settings provided by RequestsFetcher to tweak
CPU usage and download granularity.
Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
The definition of consistent targets in the spec is ambiguous:
"consistent target files should be written to non-volatile storage
as digest.filename.ext"
Additionally, the specification describes consistent targets when the
client builds the download URL as follows:
"The filename is of the form HASH.FILENAME.EXT".
The issue is about how we interpreted those quotes.
The legacy updater has decided this means a target path "a/b" will
translate to a download url path "a/{HASH}.b".
The ngclient however translates the target path "a/b" to a download url
path "{HASH}.a/b".
We decided we want to follow the same approach taken from the legacy
updater and thus change how we construct the consistent targets.
Additionally, we want to make sure we test for cases when the TARGETPATH
is an empty string or points to a directory.
Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
Doing so is not always safe and has various other issues
(like target paths "a/../b" and "b" ending up as the same
local path).
Instead URL-encode the target path to make it a plain filename. This
removes any opportunity for path trickery and removes the need to create
the required sub directories (which we were not doing currently, leading
to failed downloads). URL-encoding encodes much more than we really need
but doing so should not hurt: the important thing is that it encodes
all path separators.
Return the actual filepath as return value. I would like to modify the
arguments so caller could decide the filename if they want to. But I
won't do it now because updated_targets() (the caching mechanism)
relies on filenames being chosen by TUF. The plan is to make it
possible for caller to choose the filename though.
This is clearly a "filesystem API break" for anyone depending on the
actual target file names, and does not make sense if we do not plan to
go forward with other updated_targets()/download_target() changes
listed in #1580.
This is part of bigger plan in #1580Fixes#1571
Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>
This may have been required by a linter at some point, but isn't
anymore: Not annotating makes the documentation look better.
Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>
Situation before
* constructor args are not documented
* object attributes are documented
* sphinx cannot show object attribute type annotations
* attribute docs take a lot of vertical space
Now:
* constructor args are documented
* sphinx can show annotated types of constructor args
* class docstring now explains the attributes are the same as
constructor args (and attributes are not explicitly documented)
Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>
RequestsFetcher now handles connect/read timeout
errors on session.get() as it does when reading data.
Adding a test which uses unittest.mock to patch the
session.get method instead of simulating the timeout
with a server.
Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
This is an API change to the exceptions thrown in Root.add_key()
and Root.remove_key().
The reason for that change is that in my opinion the correct exceptions
in these cases should be "ValueError" instead of "KeyError" as
the problems are in the given values - role doesn't exist or
key is not used by a particular role.
Additionally, document the thrown exceptions in "Root.add_key" and
add a test which invokes that exception.
Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
Root class has the functionality to add and remove keys for delegated
metadata (add_key()/remove_key()) but the other delegator Targets does
not.
It should provide the same/similar functionality.
Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
The spec does not say anything about role name uniqueness in a
delegations object, but I believe we cannot safely allow multiple roles
with the same role name in the roles array of a delegations object.
If we did then the roles could have different keyids, and then we would
end up in a situation where metadata may be both a valid delegation
and an invalid delegation at the same time, depending on how the role
gets chosen and that does not seem like the intention of the design.
There is an issue open in the specification with number 167 about
that issue.
Regardless of the Metadata API, I think we should enforce role name
uniqueness.
I chose to change the data structure containing roles to
OrderedDict, where keys are role names and values are DelegatedRole
instances.
This made sense to me as role names are the unique identifier of a role
and their order is important to the way they are traversed afterward.
Note: we can't use OrderedDict as type annotation until we drop support
for Python 3.6:
https://docs.python.org/3/library/typing.html#typing.OrderedDict
That's why I used quotes around "OrderedDict" annotation, because I
can't import it.
Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
In Timestamp, the only valid "meta" value is the dictionary representing
meta information for the snapshot file. This makes the API unnecessarily
complicated and requires validation that only information about snapshot
is available inside "meta".
Together with the python-tuf maintainers, we decided that snapshot meta
information will not be represented by a "meta" dictionary but instead
by a MetaFile instance and with this it will diverge from the
specification.
Additionally, to prevent confusion, I will rename the "meta" attribute
to "snapshot_meta" as this attribute will be related only to meta
information about snapshot.
This decision is coherent with ADR9 and the rationale
behind it is to provide easier, safer, and direct access to the
snapshot meta information.
Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
Simplify the code by removing some of the uses of "global" by using
alternativies to the assignements.
Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
New pylint warnings appeared with code "W0602: Using global for X but
no assignment is done (global-variable-not-assigned)" where X is a
global variable.
In the lines pointed by the warnings, "global" was used inside a
function scope in order to enable assignment of the global variable to
a new value in the function.
The reason for these warnings was that pylint noticed we are using
"global" for the variable X inside a function scope without actually
assigning it to a new value.
Overall that was true for the reported cases and I removed the use of
"global", but there were only two cases where pylint reported a
false-positive. Then we were using "global", but pylint didn't
understand there were assignments.
In those cases, I disabled the warnings.
Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
Clarify the purpose of metadata API and that it's a low-level API
and as such it doesn't use concepts like "repository" or
"trusted collection of metadata" and don't implement the repository
logic or client updater workflow.
Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
This is a repository tooling use case but also helpful when testing.
It could be useful when we need to update the targets object.
Signed-off-by: Velichka Atanasova <avelichka@vmware.com>
The rollback checks themselves work, but they create a situation
where Updater does not realize that it needs to download e.g. a new
snapshot because the local snapshot is valid as _intermediate_ snapshot
(that can be used for rollback protection but nothing else), but is not
valid as final snapshot.
Raise in the end of update_snapshot and update_timestamp if the files
are not valid final metadata: this way the intermediate metadata does
get loaded but Updater also knows it is not the final metadata.
This modifies the existing tests but does not yet test the situation
described in the first paragraph.
Fixes#1563
Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>
The v0.18.0 release was made with the changes from #1566, resulting in
a release with sources which don't match the git tag. Rectify this with
a brown bag point release.
Signed-off-by: Joshua Lock <jlock@vmware.com>
The pylint warning W0703:broad-except was raised only
when six was used and python 2 was still supported.
The warning is no longer raised, the exceptions are
handled/raised correctly and the disabling can be removed.
Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
For targetpath: we don't want to support corner cases such as
file paths starting with separator.
Why this case should be threated specially than any other case where
you have multiple "/" for example "foo//bar/tar.gz"?
For pathpattern: it's recommended that the separator in the pathpattern
should be "/":
see https://theupdateframework.github.io/specification/latest/#targetpath
I believe it could lead to issues for a client implementation if it
supports arbitrary separators - every implementation needs to choose one
and stick with it.
Then, if we decide that "/" is our separator using lstrip on "os.sep" is
wrong, because the os separator from the server could be different that
the one used in the client.
Because of the above arguments, it makes sense to just remove
lstrip on os separators.
Additionally, document that the target_filepath and the DelegatedRole
paths are expected to be in their canonical forms and only "/" is
supported as target path separator.
Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
in the public API that we only support "/" as a
separator and don't handle corner cases such as leading separators
in either pathpattern or target_filepath.
If a file has a bigger size than expected, RequestFetcher.fetch
downloads it up to max_length without any errors. Only the
consecutive hash check raises exception. A better behaviour
in such case would be to raise a DownloadLengthMismatchError.
For this reason this commit:
- removes max_length from the fetch() definition,
- modifies download_file to check the downloaded length after
each chunk and raise an error accordingly.
Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Add an additional private method for loading the initial
trusted root metadata. The public method update_root() is
now used only externally for updating the intiial root.
The 'root' property is used only after its initialization
in the constructor and is not longer optional which makes
mypy happy.
This split results in cleaner code and the ability to annotate
the 'root' property as non-optional at the cost of some code
duplication.
Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
By explicitly denoting the expected type of Metadata.signed
we help mypy understand our intentions and correctly figure
out types. This is entirely a typing feature and has no
runtime effect.
Modify the return type of Metadata.from_dict to match the
other factory methods (from_*).
Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
SlowRetrievalError is raised from RequestsFetcher where
average_download_speed is not calculated.
Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>