This commit better separates the Metadata class model from the
Metadata wireline format, by tailoring the constructors
towards class-based parameters and adding an additional
factory classmethod that creates Metadata objects based on the
wireline json/dictionary metadata representation. (pythonic
way of constructor overloading).
This 'from_dict' factory method recurses into the 'from_dict'
methods of each contained complex field/attribute that is also
represented by a class. Currently 'signed' is the only such
attribute.
This commit further:
- Changes optional constructor keyword arguments to mandatory
positional arguments: Reduces code and simplifies usage by
restricting it. For now, users are unlikely to call
constructor directly anyway, but the 'from_dict' factory (or
its 'from_json_file' wrapper) instead.
- Removes Signed.__expiration (datetime) vs. Signed.expires
(datestring) dichotomy: Keeping only one representation of the
same attribute in memory makes the interface simpler and less
ambiguous. We choose the datetime object, because it is more
convenient to modify. Transformation from and to the string
format required by the tuf wireline format is performed in the
corresponding metadata de/serialization methods, i.e.
('to_dict' and 'from_dict').
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Consistenly rename de/serialization interface methods, using
a 'from_' and 'to_' prefix.
read_from_json -> from_json_file
write_to_json -> to_json_file
as_json -> to_json
as_dict -> to_dict
signed_bytes -> to_canonical_bytes
The latter is also changed from a property to a method for
consistency with the other serialization methods.
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Remove metadata factory on Signed class, for the sake of API
simplicity/non-ambiguity, i.e. it's enough to have one
way of loading any Metadata, that is:
Metadata.read_from_json
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Add simple methods to create or verify signatures of the
canonical_signed property of a Metadata object.
See corresponding docstrings for behavior and design
considerations.
The commit also adds tests and updates the test setup to load
some test keys into memory.
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Add simple as_json Metadata method and use it instead of repository
lib's internal _get_written_metadata function in write_to_json.
This commit further adds code documentation and the possibility to
write compact json by excluding whitespace to write_to_json, and
also removes a call to the sign method from write_to_json.
The commit also adds tests.
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Add generic read from json class method that returns a Metadata
object with a signed field that contains the appropriate Signed
subclass, based on the signed._type field of the read metadata.
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
The new metadata module uses constructs that are only available
on Python >= 3.6 (typing, f-format strings, etc.).
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Add metadata module with container classes for TUF role metadata, including
methods to read/serialize/write from and to JSON, perform TUF-compliant
metadata updates, and create and verify signatures.
The 'Metadata' class provides a container for inner TUF metadata objects (Root,
Timestamp, Snapshot, Targets) (i.e. OOP composition)
The 'Signed' class provides a base class to aggregate common attributes (i.e.
version, expires, spec_version) of the inner metadata classes. (i.e. OOP
inheritance). The name of the class also aligns with the 'signed' field of
the outer metadata container.
Based on prior observations in TUF's sister project in-toto, this architecture
seems to well represent the metadata model as it is defined in the
specification (see in-toto/in-toto#98 and in-toto/in-toto#142 for related
discussions).
This commits also adds tests.
**TODO: See doc header TODO list**
**Additional design considerations**
(also in regards to prior sketches of this module)
- Aims at simplicity, brevity and recognizability of the wireline metadata
format.
- All attributes that correspond to fields in TUF JSON metadata are public.
There doesn't seem to be a good reason to protect them with leading
underscores and use setters/getters instead, it just adds more code, and
impedes recognizability of the wireline metadata format.
- Although, it might be convenient to have short-cuts on the Metadata class
that point to methods and attributes that are common to all subclasses of
the contained Signed class (e.g. Metadata.version instead of
Metadata.signed.version, etc.), this also conflicts with goal of
recognizability of the wireline metadata. Thus we won't add such short-cuts
for now. See:
https://github.com/theupdateframework/tuf/pull/1060#discussion_r452906629
- Signing keys and a 'consistent_snapshot' boolean are not on the targets
metadata class. They are a better fit for management code. See:
https://github.com/theupdateframework/tuf/pull/1060#issuecomment-660056376,
and #660.
- Does not use sslib schema checks (see TODO notes about validation in
doc header)
- Does not use existing tuf utils, such as make_metadata_fileinfo,
build_dict_conforming_to_schema, if it is easy and more explicit to
just re-implement the desired behavior on the metadata classes.
- All datetime's are treated as UTC. Since timezone info is not captured in
the wireline metadata format it should not be captured in the internal
representation either.
- Does not use 3rd-party dateutil package, in order to minimize dependency
footprint, which is especially important for update clients which often have
to vendor their dependencies.
However, compatibility between the more advanced dateutil.relativedelta (e.g
handles leap years automatically) and timedelta is tested.
- Uses PEP8 indentation (4 space) and Google-style doc string instead of
sslab-style. See
https://github.com/secure-systems-lab/code-style-guidelines/issues/20
- Does not support Python =< 3.5
Co-authored-by: Trishank Karthik Kuppusamy <trishank.kuppusamy@datadoghq.com>
Co-authored-by: Joshua Lock <jlock@vmware.com>
Co-authored-by: Teodora Sechkova <tsechkova@vmware.com>
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
This was erroneously absent in PR 1024, which added support for abstract
files and directories. Resolve by adding a storage_backend argument to
generate_timestamp_metadata() and using it so that the fileinfo (hashes
and length) for the snapshot file can be generated for a snapshot
metadata file on any supported storage.
Signed-off-by: Joshua Lock <jlock@vmware.com>
Add a class implementing StorageBackendInterface for testhing which
mutates filenames on put()/get(), such that trying to read the expected
file paths for TUF metadata from the local filesystem doesn't find the
files.
Use this class when creating a repository and writing metadata to test
abstract files and directories support for metadata writing.
Signed-off-by: Joshua Lock <jlock@vmware.com>
Utilise the abstract files and directories support to enable generating
targets metadata for files which aren't necessarily locally accessible,
rather than requiring that metadata for non-local files be provided via
existing fileinfo structures.
Signed-off-by: Joshua Lock <jlock@vmware.com>
The specification lists four fundamental roles: root, targets, snapshot
and timestamp. Loading a repository where those roles are not present
should not be supported, therefore convert debug messages on the absence
of metadata files for these fundamental roles into a RepositoryError
exception.
Signed-off-by: Joshua Lock <jlock@vmware.com>
Switch to using the new abstract files and directories support in
securesystemslib by taking an object which implements
securesystemslib.storage.StorageBackendInterface in the Repository
constructor, passed in by tuf.repository_tool.create_new_repository() and
tuf.repository_tool.load_repository()
The Updater class in tuf.client.updater does not specify a storage backend
and instead allows the functions in securesystemslib to perform the
default action of instantiating a LocalFilesystemBackend, that is the
updater does not currently support abstract filesystem backends and always
defaults to using local storage.
Finally we drop support for tuf.settings.CONSISTENT_METHOD as it's not as
clear how different copying modes should work when the details of the
underlying storage are abstracted away.
Signed-off-by: Joshua Lock <jlock@vmware.com>
Support for compressed files was removed in tuf v0.10.x leaving behind
some vestiges like the test logic in test_repository_lib, which is
duplicated below and carries a redundant comment, and setting compression
on in generate_project_data.py
Signed-off-by: Joshua Lock <jlock@vmware.com>
After a discussion with Joshua Lock, we agreed that for
Windows users it would be good to provide the option to use
SimpleHTTPRequestHandler, but still leave a warning about it,
knowing that this caused an error before.
See: 7dbb30ae10
Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
Passing a pipe to the subprocess, but not reading from it
conceals helpful error messages.
As the code redirects all of the stderr from the subprocess
to nowhere, the error output of the process is never read.
If we remove the PIPEs from the tests we should see some
error messages on the console/logger that can
help us understand what went wrong.
On another hand, when we stop passing stderr=subprocess.PIPE arg
to the subprocess.Popen function call there are a lot of
HTTP messages together with the helpful error messages.
One decision is to make QuietHTTPRequestHandler
the default. That way we receive the helpful error messages
without the HTTP messages.
Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
Fixes issue: https://github.com/theupdateframework/tuf/issues/1010
When running the tests this error appears
"[Errno 111] Connection refused". After some digging Lukas
found another error "No module named tests.simple_server"
which is the root case for error 111.
With the help of Joshua Lock, we found out that the simple_server.py
was not being found because subprocess.Popen was being passed
a cwd kwarg which moved the current working directory
away from tuf/tests.
Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
Use a hard-coded unix separator ('/') so that an
exception is also raised for paths starting with '/'
when executing on Windows systems.
Update test_check_path to explicitly test invalid paths
starting with Windows style separator.
Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Improve the coding style in TUTORIAL in the case
where absolute path to a file is needed to perform file system
access and at the same time is rejected by Targets methods.
Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
- add a test for _check_path() method of Targets class.
- update all tests calling _check_path() respectively
- update test_tutorial
Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
delegate_hashed_bins() has a number_of_bins parameter which defaults to
1024. add_target_to_bin() and remove_target_from_bin() both have a
number_of_bins parameter with no default. This means that in the
(somewhat) unlikely case that someone is using the default
number_of_bins when creating hashed bins they will need know what that
default value is and pass it to add_target_to_bin() and
remove_target_from_bin().
In order to be consistent and simpler to use define the default number
of bins as a module level constant and use it as the default value for
the number_of_bins argument for each of:
* delegate_hashed_bins()
* add_target_to_bin()
* remove_target_from_bin()
Signed-off-by: Joshua Lock <jlock@vmware.com>
Add some additional checks to test_add_target_to_bin to ensure the code
to add a target passing a fileinfo is tested.
Signed-off-by: Joshua Lock <jlock@vmware.com>
When testing delegate_hashed_bins to ensure that hash_path_prefixes
map to the generated name of the bin, also check to ensure that at least
one of the delegations contains one or more path_hash_prefixes.
Signed-off-by: Joshua Lock <jlock@vmware.com>
Test the newly added functionality to:
* add a target to the repository without access to the target file on disk
* write targets metadata without access to target files on disk, by using
the existing fileinfo data from the roledb
Signed-off-by: Joshua Lock <jlock@vmware.com>
Merge the logger calls reporting information about the hashed bin
delegations into a single logger.info() call to ensure the messages
will be grouped together even when integrated into a logging system
with multiple parallel sources.
Signed-off-by: Joshua Lock <jlock@vmware.com>
Add an additional optional parameter to add_target() and
add_target_to_bin() which is a fileinfo object matching
tuf.formats.FILEINFO_OBJECT
This parameter and the custom parameter are mutually exclusive and
thus cannot be passed at the same time.
Signed-off-by: Joshua Lock <jlock@vmware.com>
The file isn't strictly needed on-disk at the time add_target() and
add_targets() are called and this duplicates the check for the file's
presence in write[_all]()
By removing this check we allow extra versatility in adding targets.
Signed-off-by: Joshua Lock <jlock@vmware.com>
Vastly simplify the implementation, using the _get_hash() and
_find_bin_for_hash() helpers added in earlier commits.
Furthermore, enable passing of the custom parameter to
add_target_to_bin() to better match add_target()
Signed-off-by: Joshua Lock <jlock@vmware.com>
Add a helper function to determine the name of a bin that a hashed
targetfile will be delegated to.
Based sketches by Lukas Puehringer in issues #994 & #995
Signed-off-by: Joshua Lock <jlock@vmware.com>
As we are adding and removing items from the hashed bins and checking
for their presence/absence it's simplest if we being with the hashed
bins initially empty.
If we pass a list of targets when we call delgate_hashed_bins() the
delegated roles have an initial set of targets delegated to them,
which complicates testing of adding then removing a target to a
delegated bin.
Signed-off-by: Joshua Lock <jlock@vmware.com>
Add test to ensure delegated bin names are consistent with the hash
prefixes that are delegated to the role.
This is an implicit assumption of the current implementation, the
testing of which will enable us to modify the code with greater
confidence.
Signed-off-by: Joshua Lock <jlock@vmware.com>
One of the created target files has its file permissions encoded in the
targets metadata via the custom attribute of the add_target() function.
On Linux-based OS the umask value of the environment the script is run
in can result in different octal permissions for the created file, i.e.
on Fedora the default umask is 0002 (default permissions 664) whereas
on Debian/Ubuntu the default umask is 0022 (default permissions 644).
Explicitly chown 'file1' to octal permissions 644 so that the generated
data has the same custom attributes for targets regardless of which
Linux host they are generated on.
Signed-off-by: Joshua Lock <jlock@vmware.com>
* Fix the path referenced in the Purpose
* Change add_target() calls to pass file paths relative to targets dir
Signed-off-by: Joshua Lock <jlock@vmware.com>
Replace hard-coded logger names with __name__. For the most part this just uses
the standard conventions to create the same logger hierarchy as existed before.
The only real difference is that loggers created for printing during tests are
no longer part of the 'tuf' hierarchy.
Signed-off-by: Joshua Lock <jlock@vmware.com>
tests/simple_server.py was copied to tuf/scripts/ to "make testing
easier" (cf84d3f51f), although with
the current test setup the original (and recently patched to fix an
Windows/Py2 test issue) test simple_server.py can be used just as
well.
This commit:
- removes tuf/scripts/simple_server.py
Note: that version slightly differed from the original test
server, probably due to demands by the linter that is only executed
on the tuf core code and not on the tests. However, for the testing
purposes of simple_server.py these changes (i.e., `SystemRandom()`,
`if __name__ =='__main__':`) are not necessary.
- updates the tests that used tuf.scripts.simple_server to instead
use tests.simple_server,
- updates setup.py to not install the simple_server module as
script, when installing tuf, as it is only a testing script and
not meant for end-user usage.
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Since #885 the tests in TestUpdater and TestKeyRevocation fail on
Appveyor Python 2.7 builds. After some live debugging, it turns out
that the tests fail due to the extra amount of http requests to
the simple http server (see tests/simple_server.py) that were
added in #885.
The simple server runs in a subprocess and is re-used for the
entire TestCase. After a certain amount of requests it becomes
unresponsive. Note that neither the subprocess exits (ps -W), nor
does the port get closed (netstat -a). It just doesn't serve the
request, making it time out and fail the test.
The following script can be used to reproduce the issue (run in
tests directory):
```python
import subprocess
import requests
import random
counter = 0
port = random.randint(30000, 45000)
command = ['python', 'simple_server.py', str(port)]
server_process = subprocess.Popen(command, stderr=subprocess.PIPE)
url = 'http://localhost:'+str(port) + '/'
sess = requests.Session()
try:
while True:
sess.get(url, timeout=3)
counter +=1
finally:
print(counter)
server_process.kill()
```
It fails repeatedly on the 69th request, but only if
`stderr=subprocess.PIPE` is passed to Popen. Given that for each
request the simple server writes about ~60 characters to stderr,
e.g. ...
```
127.0.0.1 - - [24/Feb/2020 12:01:23] "GET / HTTP/1.1" 200 -
```
... it looks a lot like a full pipe buffer of size 4096. Note that the
`bufsize` argument to Popen does not change anything.
As a simple work around we silence the test server on
Windows/Python2 to not fill the buffer.
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>