This commit performs restructuring on the recently added metadata
class model architecture, which shall be part of a new simple TUF
API.
The key change is that the Metadata class is now used as container
for inner TUF metadata (Root, Timestamp, Snapshot, Targets) instead
of serving as base class for these, that means we use 'composition'
instead of 'inheritance'. Still, in order to aggregate common
attributes of the inner Metadata (expires, version, spec_version),
we use a new baseclass 'Signed', which also corresponds to the
signed field of the outer metadata container.
Based on prior observations in TUF's sister project in-toto, this
architecture seems to more closely represent the metadata model as
it is defined in the specification (see in-toto/in-toto#98 and
in-toto/in-toto#142 for related discussions).
Note that the proposed changes require us to now access some
attributes/methods via the signed attribute of a Metadata object
and not directly on the Metadata object, but it would be possible
to add short-cuts. (see todo notes in doc header).
Further changes include:
- Add minimal doc header with TODO notes
- Make attributes that correspond to fields in TUF JSON metadata
public again. There doesn't seem to be a good reason to protect
them with leading underscores and use setters/getters instead, it
just adds more code.
- Generally try to reduce code.
- Remove keyring and consistent_snapshot attributes from metadata
class. As discussed in #1060 they are a better fit for extra
management code (also see #660)
- Remove sslib schema checks (see TODO notes about validation in
doc header)
- Drop usage of build_dict_conforming_to_schema, it seems a lot
simpler and more explicit to just code this here.
- ... same goes for make_metadata_fileinfo
- Adapt tests accordingly
TODO: Document!!!
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
* classmethod for init RAMKey from file
* private class variables
* more typing for methods
* better names for arguments
Signed-off-by: Trishank Karthik Kuppusamy <trishank.kuppusamy@datadoghq.com>
dateutil provides an interface which is much easier to reason about for
users, i.e. it provides an interface for year deltas which automatically
handles leap years. Add some tests to try and ensure that, even though it
uses standard library functionality, the metadata API can accept
dateutil.relativedelta and do the right thing.
Signed-off-by: Joshua Lock <jlock@vmware.com>
All of the functionality we need is available from the standard library
which reduces our dependency footprint. Having minimal dependencies is
especially important for update clients which often have to vendor their
dependencies.
However, dateutil.relativedelta is richer than timedelta and helps to
provide a clearer API. For example, with relativedelta it's possible
to specify a delta in years *and* dateutil will do the right thing for
leap years.
Signed-off-by: Joshua Lock <jlock@vmware.com>
We don't capture timezone information in the metadata, therefore we should
not capture it in the interfaces. Ensure we remove timezone information
from any datetime objects when they are assigned to the expiration
property of a Metadata object.
Signed-off-by: Joshua Lock <jlock@vmware.com>
This was erroneously absent in PR 1024, which added support for abstract
files and directories. Resolve by adding a storage_backend argument to
generate_timestamp_metadata() and using it so that the fileinfo (hashes
and length) for the snapshot file can be generated for a snapshot
metadata file on any supported storage.
Signed-off-by: Joshua Lock <jlock@vmware.com>
Add a class implementing StorageBackendInterface for testhing which
mutates filenames on put()/get(), such that trying to read the expected
file paths for TUF metadata from the local filesystem doesn't find the
files.
Use this class when creating a repository and writing metadata to test
abstract files and directories support for metadata writing.
Signed-off-by: Joshua Lock <jlock@vmware.com>
Utilise the abstract files and directories support to enable generating
targets metadata for files which aren't necessarily locally accessible,
rather than requiring that metadata for non-local files be provided via
existing fileinfo structures.
Signed-off-by: Joshua Lock <jlock@vmware.com>
The specification lists four fundamental roles: root, targets, snapshot
and timestamp. Loading a repository where those roles are not present
should not be supported, therefore convert debug messages on the absence
of metadata files for these fundamental roles into a RepositoryError
exception.
Signed-off-by: Joshua Lock <jlock@vmware.com>
Switch to using the new abstract files and directories support in
securesystemslib by taking an object which implements
securesystemslib.storage.StorageBackendInterface in the Repository
constructor, passed in by tuf.repository_tool.create_new_repository() and
tuf.repository_tool.load_repository()
The Updater class in tuf.client.updater does not specify a storage backend
and instead allows the functions in securesystemslib to perform the
default action of instantiating a LocalFilesystemBackend, that is the
updater does not currently support abstract filesystem backends and always
defaults to using local storage.
Finally we drop support for tuf.settings.CONSISTENT_METHOD as it's not as
clear how different copying modes should work when the details of the
underlying storage are abstracted away.
Signed-off-by: Joshua Lock <jlock@vmware.com>
Support for compressed files was removed in tuf v0.10.x leaving behind
some vestiges like the test logic in test_repository_lib, which is
duplicated below and carries a redundant comment, and setting compression
on in generate_project_data.py
Signed-off-by: Joshua Lock <jlock@vmware.com>
After a discussion with Joshua Lock, we agreed that for
Windows users it would be good to provide the option to use
SimpleHTTPRequestHandler, but still leave a warning about it,
knowing that this caused an error before.
See: 7dbb30ae10
Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
Passing a pipe to the subprocess, but not reading from it
conceals helpful error messages.
As the code redirects all of the stderr from the subprocess
to nowhere, the error output of the process is never read.
If we remove the PIPEs from the tests we should see some
error messages on the console/logger that can
help us understand what went wrong.
On another hand, when we stop passing stderr=subprocess.PIPE arg
to the subprocess.Popen function call there are a lot of
HTTP messages together with the helpful error messages.
One decision is to make QuietHTTPRequestHandler
the default. That way we receive the helpful error messages
without the HTTP messages.
Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
Fixes issue: https://github.com/theupdateframework/tuf/issues/1010
When running the tests this error appears
"[Errno 111] Connection refused". After some digging Lukas
found another error "No module named tests.simple_server"
which is the root case for error 111.
With the help of Joshua Lock, we found out that the simple_server.py
was not being found because subprocess.Popen was being passed
a cwd kwarg which moved the current working directory
away from tuf/tests.
Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
Use a hard-coded unix separator ('/') so that an
exception is also raised for paths starting with '/'
when executing on Windows systems.
Update test_check_path to explicitly test invalid paths
starting with Windows style separator.
Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Improve the coding style in TUTORIAL in the case
where absolute path to a file is needed to perform file system
access and at the same time is rejected by Targets methods.
Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
- add a test for _check_path() method of Targets class.
- update all tests calling _check_path() respectively
- update test_tutorial
Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
delegate_hashed_bins() has a number_of_bins parameter which defaults to
1024. add_target_to_bin() and remove_target_from_bin() both have a
number_of_bins parameter with no default. This means that in the
(somewhat) unlikely case that someone is using the default
number_of_bins when creating hashed bins they will need know what that
default value is and pass it to add_target_to_bin() and
remove_target_from_bin().
In order to be consistent and simpler to use define the default number
of bins as a module level constant and use it as the default value for
the number_of_bins argument for each of:
* delegate_hashed_bins()
* add_target_to_bin()
* remove_target_from_bin()
Signed-off-by: Joshua Lock <jlock@vmware.com>
Add some additional checks to test_add_target_to_bin to ensure the code
to add a target passing a fileinfo is tested.
Signed-off-by: Joshua Lock <jlock@vmware.com>
When testing delegate_hashed_bins to ensure that hash_path_prefixes
map to the generated name of the bin, also check to ensure that at least
one of the delegations contains one or more path_hash_prefixes.
Signed-off-by: Joshua Lock <jlock@vmware.com>
Test the newly added functionality to:
* add a target to the repository without access to the target file on disk
* write targets metadata without access to target files on disk, by using
the existing fileinfo data from the roledb
Signed-off-by: Joshua Lock <jlock@vmware.com>
Merge the logger calls reporting information about the hashed bin
delegations into a single logger.info() call to ensure the messages
will be grouped together even when integrated into a logging system
with multiple parallel sources.
Signed-off-by: Joshua Lock <jlock@vmware.com>
Add an additional optional parameter to add_target() and
add_target_to_bin() which is a fileinfo object matching
tuf.formats.FILEINFO_OBJECT
This parameter and the custom parameter are mutually exclusive and
thus cannot be passed at the same time.
Signed-off-by: Joshua Lock <jlock@vmware.com>
The file isn't strictly needed on-disk at the time add_target() and
add_targets() are called and this duplicates the check for the file's
presence in write[_all]()
By removing this check we allow extra versatility in adding targets.
Signed-off-by: Joshua Lock <jlock@vmware.com>
Vastly simplify the implementation, using the _get_hash() and
_find_bin_for_hash() helpers added in earlier commits.
Furthermore, enable passing of the custom parameter to
add_target_to_bin() to better match add_target()
Signed-off-by: Joshua Lock <jlock@vmware.com>
Add a helper function to determine the name of a bin that a hashed
targetfile will be delegated to.
Based sketches by Lukas Puehringer in issues #994 & #995
Signed-off-by: Joshua Lock <jlock@vmware.com>
As we are adding and removing items from the hashed bins and checking
for their presence/absence it's simplest if we being with the hashed
bins initially empty.
If we pass a list of targets when we call delgate_hashed_bins() the
delegated roles have an initial set of targets delegated to them,
which complicates testing of adding then removing a target to a
delegated bin.
Signed-off-by: Joshua Lock <jlock@vmware.com>
Add test to ensure delegated bin names are consistent with the hash
prefixes that are delegated to the role.
This is an implicit assumption of the current implementation, the
testing of which will enable us to modify the code with greater
confidence.
Signed-off-by: Joshua Lock <jlock@vmware.com>
One of the created target files has its file permissions encoded in the
targets metadata via the custom attribute of the add_target() function.
On Linux-based OS the umask value of the environment the script is run
in can result in different octal permissions for the created file, i.e.
on Fedora the default umask is 0002 (default permissions 664) whereas
on Debian/Ubuntu the default umask is 0022 (default permissions 644).
Explicitly chown 'file1' to octal permissions 644 so that the generated
data has the same custom attributes for targets regardless of which
Linux host they are generated on.
Signed-off-by: Joshua Lock <jlock@vmware.com>
* Fix the path referenced in the Purpose
* Change add_target() calls to pass file paths relative to targets dir
Signed-off-by: Joshua Lock <jlock@vmware.com>
Replace hard-coded logger names with __name__. For the most part this just uses
the standard conventions to create the same logger hierarchy as existed before.
The only real difference is that loggers created for printing during tests are
no longer part of the 'tuf' hierarchy.
Signed-off-by: Joshua Lock <jlock@vmware.com>
tests/simple_server.py was copied to tuf/scripts/ to "make testing
easier" (cf84d3f51f), although with
the current test setup the original (and recently patched to fix an
Windows/Py2 test issue) test simple_server.py can be used just as
well.
This commit:
- removes tuf/scripts/simple_server.py
Note: that version slightly differed from the original test
server, probably due to demands by the linter that is only executed
on the tuf core code and not on the tests. However, for the testing
purposes of simple_server.py these changes (i.e., `SystemRandom()`,
`if __name__ =='__main__':`) are not necessary.
- updates the tests that used tuf.scripts.simple_server to instead
use tests.simple_server,
- updates setup.py to not install the simple_server module as
script, when installing tuf, as it is only a testing script and
not meant for end-user usage.
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Since #885 the tests in TestUpdater and TestKeyRevocation fail on
Appveyor Python 2.7 builds. After some live debugging, it turns out
that the tests fail due to the extra amount of http requests to
the simple http server (see tests/simple_server.py) that were
added in #885.
The simple server runs in a subprocess and is re-used for the
entire TestCase. After a certain amount of requests it becomes
unresponsive. Note that neither the subprocess exits (ps -W), nor
does the port get closed (netstat -a). It just doesn't serve the
request, making it time out and fail the test.
The following script can be used to reproduce the issue (run in
tests directory):
```python
import subprocess
import requests
import random
counter = 0
port = random.randint(30000, 45000)
command = ['python', 'simple_server.py', str(port)]
server_process = subprocess.Popen(command, stderr=subprocess.PIPE)
url = 'http://localhost:'+str(port) + '/'
sess = requests.Session()
try:
while True:
sess.get(url, timeout=3)
counter +=1
finally:
print(counter)
server_process.kill()
```
It fails repeatedly on the 69th request, but only if
`stderr=subprocess.PIPE` is passed to Popen. Given that for each
request the simple server writes about ~60 characters to stderr,
e.g. ...
```
127.0.0.1 - - [24/Feb/2020 12:01:23] "GET / HTTP/1.1" 200 -
```
... it looks a lot like a full pipe buffer of size 4096. Note that the
`bufsize` argument to Popen does not change anything.
As a simple work around we silence the test server on
Windows/Python2 to not fill the buffer.
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Prior to this commit metadadata signature verification as provided
by `tuf.sig.verify()` and used e.g. in `tuf.client.updater` counted
multiple signatures with identical authorized keyids each
separately towards the threshold. This behavior practically
subverts the signature thresholds check.
This commit fixes the issue by counting identical authorized keyids
only once towards the threshold.
The commit further clarifies the behavior of the relevant functions
in the `sig` module, i.e. `get_signature_status` and `verify` in
their respective docstrings. And adds tests for those functions and
also for the client updater.
---
NOTE: With this commit signatures with different authorized keyids
still each count separately towards the threshold, even if the
keyids identify the same key. If this behavior is not desired, I
propose the following fix instead. It verifies uniqueness of keys
(and not keyids):
```
diff --git a/tuf/sig.py b/tuf/sig.py
index ae9bae15..5392e596 100755
--- a/tuf/sig.py
+++ b/tuf/sig.py
@@ -303,7 +303,14 @@ def verify(signable, role, repository_name='default', threshold=None,
if threshold is None or threshold <= 0: #pragma: no cover
raise securesystemslib.exceptions.Error("Invalid threshold: " + repr(threshold))
- return len(good_sigs) >= threshold
+ # Different keyids might point to the same key
+ # To be safe, check against unique public key values
+ unique_good_sig_keys = set()
+ for keyid in good_sigs:
+ key = tuf.keydb.get_key(keyid, repository_name)
+ unique_good_sig_keys.add(key["keyval"]["public"])
+
+ return len(unique_good_sig_keys) >= threshold
```
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>