8.1. Node#

In its simplest form, a node retrieves a task from the central server through an API call, runs the task, and returns the results to the central server.

The node application runs four threads:

Main thread: Checks the task queue and runs the next task if one is available.
Listening thread: Listens for incoming websocket messages. Among other functionality, it adds new tasks to the task queue.
Speaking thread: Waits for tasks to finish. When they do, it returns the results to the central server.
Proxy server thread: Algorithm containers are isolated from the internet for security reasons. The local proxy server provides an interface to the central server for algorithm containers to create subtasks and retrieve their results.
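
The division of labour between the main, listening and speaking threads can be sketched with the standard library. This is a minimal illustration under stated assumptions, not the vantage6 code: the names `task_queue`, `result_queue`, `sent_results`, `listening_worker`, `speaking_worker`, `run_task` and `run_node` are hypothetical, the websocket is replaced by a plain list of incoming task ids, and the proxy server thread is left out.

```python
import queue
import threading

task_queue = queue.Queue()    # filled by the listening thread
result_queue = queue.Queue()  # filled by the main thread
sent_results = []             # stands in for the central server
STOP = object()               # sentinel that shuts a worker down


def listening_worker(incoming_task_ids):
    """Stand-in for the websocket listener: enqueue every new task."""
    for task_id in incoming_task_ids:
        task_queue.put(task_id)
    task_queue.put(STOP)


def run_task(task_id):
    """Hypothetical placeholder for running an algorithm container."""
    return {"task_id": task_id, "result": task_id * 2}


def speaking_worker():
    """Wait for finished tasks and 'send' their results to the server."""
    while True:
        result = result_queue.get()  # blocks until a result is available
        if result is STOP:
            break
        sent_results.append(result)


def main_loop():
    """Main thread: take the next task from the queue and run it."""
    while True:
        task_id = task_queue.get()  # blocks until a task is available
        if task_id is STOP:
            result_queue.put(STOP)
            break
        result_queue.put(run_task(task_id))


def run_node(incoming_task_ids):
    """Start the worker threads, process all tasks, and return what was sent."""
    listener = threading.Thread(target=listening_worker, args=(incoming_task_ids,))
    speaker = threading.Thread(target=speaking_worker)
    listener.start()
    speaker.start()
    main_loop()
    listener.join()
    speaker.join()
    return sent_results
```

Because both queues block on `get()`, no thread has to poll: the main thread sleeps until the listener delivers a task, and the speaker sleeps until a result is ready.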

The node connects to the server using a websocket connection, which is mainly used for sharing status updates. Because the server pushes these updates, the node does not need to poll to see whether new tasks are available.

Below you will find the structure of the classes and functions that comprise the node. A few that we would like to highlight:

Node: the main class in a vantage6 node.
NodeContext and DockerNodeContext: classes that handle the node configuration. The latter inherits from the former and adds some properties for when the node runs in a docker container.
DockerManager: Manages the docker containers and networks of the vantage6 node.
DockerTaskManager: Starts a docker container that runs an algorithm and manages its lifecycle.
VPNManager: Sets up the VPN connection (if it is configured) and manages it.
vnode-local commands: commands to run non-dockerized (development) instances of your nodes.

8.1.1. vantage6.node.Node#

class Node(ctx)#

Authenticates to the central server, sets up encryption and a websocket connection, retrieves tasks that were posted while the node was offline, prepares the datasets for use, and finally sets up a local proxy server.

Parameters:: ctx (NodeContext | DockerNodeContext) – Application context object.

__listening_worker()#

Listen for incoming (websocket) messages from the server.

Runs in a separate thread. Received events are handled by the appropriate action handler.

Return type:: None

__proxy_server_worker()#

Proxy algorithm container communication.

A proxy for communication between algorithms and central server.

Return type:: None

__speaking_worker()#

Sending messages to central server.

Routine that runs in a separate thread and sends results to the server when they become available.

Return type:: None

__start_task(task_incl_run)#

Start the docker image and notify the server that the task has been started.

Parameters:: task_incl_run (dict) – A dictionary with information required to run the algorithm
Return type:: None

authenticate()#

Authenticate with the server using the API key from the configuration file. If the server rejects the request for any reason other than a wrong API key, several retry attempts are made.

Return type:: None
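
The retry behaviour described for authenticate() could look roughly like the sketch below. It is an illustration only: the exception classes `WrongApiKey` and `ServerUnavailable`, the function `authenticate_with_retry`, and the parameters `max_attempts` and `delay_seconds` are hypothetical names, and the actual number of attempts and waiting time used by the node are not stated here.

```python
import time


class WrongApiKey(Exception):
    """Hypothetical error: the server rejected the API key itself."""


class ServerUnavailable(Exception):
    """Hypothetical error: any other reason the request failed."""


def authenticate_with_retry(authenticate, max_attempts=3, delay_seconds=0.0):
    """Call ``authenticate`` until it succeeds or the attempts run out.

    A wrong API key is not retried, because repeating the request
    cannot fix it. Every other failure is retried after a short pause.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return authenticate()
        except WrongApiKey:
            raise  # retrying will not help
        except ServerUnavailable:
            if attempt == max_attempts:
                raise  # give up after the last attempt
            time.sleep(delay_seconds)
```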

connect_to_socket()#

Create long-lasting websocket connection with the server. The connection is used to receive status updates, such as new tasks.

Return type:: None

get_task_and_add_to_queue(task_id)#

Fetches (open) task with task_id from the server. The task_id is delivered by the websocket-connection.

Parameters:: task_id (int) – Task identifier
Return type:: None

initialize()#

Initialization of the node

Return type:: None

kill_containers(kill_info)#

Kill containers on instruction from socket event

Parameters:: kill_info (dict) – Dictionary received over websocket with instructions for which tasks to kill
Returns:: List of dictionaries with information on killed task (keys: run_id, task_id and parent_id)
Return type:: list[dict]

private_key_filename()#

Get the path to the private key.

Return type:: Path

run_forever()#

Keep checking queue for incoming tasks (and execute them).

Return type:: None

setup_encryption()#

Set up encryption if the node is part of an encrypted collaboration

Return type:: None

setup_squid_proxy(isolated_network_mgr)#

Initiates a Squid proxy if configured in the config.yml

Expects the configuration in the following format:

whitelist:
    domains:
        - domain1
        - domain2
    ips:
        - ip1
        - ip2
    ports:
        - port1
        - port2

Parameters:: isolated_network_mgr (NetworkManager) – Network manager for isolated network
Returns:: Squid proxy instance
Return type:: Squid
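
As a small illustration of the configuration format above, the sketch below reads the whitelist section from an already parsed configuration dictionary. The helper name `read_whitelist` is hypothetical, and treating missing keys as empty lists is an assumption of this sketch rather than documented node behaviour.

```python
def read_whitelist(config):
    """Return the whitelisted domains, IPs and ports from a node config.

    ``config`` is the parsed configuration as a dictionary. Missing
    keys are treated as empty lists (an assumption of this sketch).
    """
    whitelist = config.get("whitelist") or {}
    return {
        "domains": list(whitelist.get("domains") or []),
        "ips": list(whitelist.get("ips") or []),
        "ports": list(whitelist.get("ports") or []),
    }


# Example with placeholder values in the documented layout
example_config = {
    "whitelist": {
        "domains": ["domain1", "domain2"],
        "ips": ["ip1", "ip2"],
        "ports": ["port1", "port2"],
    }
}
print(read_whitelist(example_config)["domains"])
```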

setup_ssh_tunnels(isolated_network_mgr)#

Create SSH tunnels when they are defined in the configuration file. For each tunnel a new container is created. The image to use can be specified in the configuration file as ssh-tunnel in the images section; otherwise the default image is used.

Parameters:: isolated_network_mgr (NetworkManager) – Manager for the isolated network
Return type:: list[SSHTunnel]

setup_vpn_connection(isolated_network_mgr, ctx)#

Set up a container that has a VPN connection

Parameters:

isolated_network_mgr (NetworkManager) – Manager for the isolated Docker network
ctx (DockerNodeContext | NodeContext) – Context object for the node

Returns:

Manages the VPN connection

Return type:

VPNManager

share_node_details()#

Share part of the node’s configuration with the server.

This helps the other parties in a collaboration to see e.g. which algorithms they are allowed to run on this node.

Return type:: None

sync_task_queue_with_server()#

Get all unprocessed tasks from the server for this node.

Return type:: None

class DockerNodeContext(*args, **kwargs)#

Node context for the dockerized version of the node.

static instance_folders(instance_type, instance_name, system_folders)#: Log, data and config folders are always mounted. The node manager should take care of this.

set_folders(instance_type, instance_name, system_folders)#: In case of the dockerized version we do not want to use user specified directories within the container.

8.1.2. vantage6.node.docker.docker_base#

class DockerBaseManager(isolated_network_mgr, docker_client=None)#

Base class for docker-using classes. Contains simple methods that are used by multiple derived classes

get_isolated_netw_ip(container)#

Get address of a container in the isolated network

Parameters:: container (Container) – Docker container whose IP address should be obtained
Returns:: IP address of a container in isolated network
Return type:: str
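
One way such a lookup can work is to read the network settings that Docker reports for a container. The sketch below assumes the dictionary layout of `docker inspect`, which docker-py exposes as `Container.attrs`; the function name `ip_in_network` and the network name `isolated_net` are hypothetical, and this is not necessarily how get_isolated_netw_ip() is written.

```python
def ip_in_network(container_attrs, network_name):
    """Return the container's IP address in ``network_name``.

    ``container_attrs`` has the shape of ``docker inspect`` output,
    which docker-py exposes as ``Container.attrs``.
    """
    networks = container_attrs["NetworkSettings"]["Networks"]
    return networks[network_name]["IPAddress"]


# Abbreviated example of the relevant part of the inspect output
sample_attrs = {
    "NetworkSettings": {
        "Networks": {
            "bridge": {"IPAddress": "172.17.0.2"},
            "isolated_net": {"IPAddress": "172.18.0.3"},
        }
    }
}
print(ip_in_network(sample_attrs, "isolated_net"))
```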

8.1.3. vantage6.node.docker.docker_manager#

class DockerManager(ctx, isolated_network_mgr, vpn_manager, tasks_dir, client, proxy=None)#

Bases: DockerBaseManager

Wrapper for the docker-py module.

This class manages tasks related to Docker, such as logging in to docker registries, managing input/output files, logs etc. Results can be retrieved through get_result() which returns the first available algorithm result.

cleanup()#

Stop all active tasks and delete the isolated network

Note: the temporary docker volumes are kept as they may still be used by a parent container

Return type:: None

cleanup_tasks()#

Stop all active tasks

Returns:: List of information on tasks that have been killed
Return type:: list[KilledRun]

create_volume(volume_name)#

Create a temporary volume for a single run.

A single run can consist of multiple algorithm containers. It is important to note that all algorithm containers having the same job_id have access to this volume.

Parameters:: volume_name (str) – Name of the volume to be created
Return type:: None

get_column_names(label, type_)#

Get column names from a node database

Parameters:

label (str) – Label of the database
type_ (str) – Type of the database

Returns:

List of column names

Return type:

list[str]

get_result()#

Returns the oldest (FIFO) finished docker container.

This is a blocking method until a finished container shows up. Once the container is obtained and the results are read, the container is removed from the docker environment.

Returns:: result of the docker image
Return type:: Result
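
The blocking, first-in-first-out behaviour described above can be illustrated with a simple polling loop. This is a sketch under assumptions: `FakeTask`, `get_first_finished` and `poll_interval` are hypothetical names, and a real task would check the container state instead of counting polls.

```python
import time


class FakeTask:
    """Hypothetical stand-in for a running algorithm container."""

    def __init__(self, run_id, polls_until_done):
        self.run_id = run_id
        self._polls_left = polls_until_done

    def is_finished(self):
        # Pretend the container finishes after a fixed number of checks
        self._polls_left -= 1
        return self._polls_left <= 0


def get_first_finished(active_tasks, poll_interval=1.0):
    """Block until a task is finished, then remove and return it.

    Tasks are checked in the order they were started, so when several
    have finished, the oldest one is returned first.
    """
    while True:
        for task in active_tasks:
            if task.is_finished():
                active_tasks.remove(task)
                return task
        time.sleep(poll_interval)
```

Note that this sketch never returns if `active_tasks` stays empty, which mirrors the blocking nature of the call: the caller is expected to run it in its own thread.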

is_docker_image_allowed(docker_image_name, task_info)#

Checks the docker image name against a list of regular expressions defined in the configuration file. If no expressions are defined, all docker images are accepted.

Parameters:

docker_image_name (str) – uri to the docker image
task_info (dict) – Dictionary with information about the task

Returns:

Whether docker image is allowed or not

Return type:

bool
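
The allow-list check can be sketched with the standard `re` module. The function name `is_image_allowed`, the parameter `allowed_patterns` and the registry `registry.example.org` are hypothetical, and the use of `re.match` (which anchors at the start of the string) is an assumption; the task information that the real method also receives is ignored here.

```python
import re


def is_image_allowed(image_name, allowed_patterns):
    """Check an image name against a list of regular expressions.

    An empty (or missing) list means no restriction is configured, so
    every image is accepted.
    """
    if not allowed_patterns:
        return True
    return any(re.match(pattern, image_name) for pattern in allowed_patterns)


patterns = [r"^registry\.example\.org/algorithms/.*"]
print(is_image_allowed("registry.example.org/algorithms/average", patterns))
print(is_image_allowed("docker.io/someone/unknown", patterns))
```

Escaping the dots in the pattern matters: an unescaped `.` matches any character, which would make the allow-list broader than intended.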

is_running(run_id)#

Check if a container is already running for <run_id>.

Parameters:: run_id (int) – run_id of the algorithm container to be found
Returns:: Whether or not algorithm container is running already
Return type:: bool

kill_selected_tasks(org_id, kill_list=None)#

Kill tasks specified by a kill list, if they are currently running on this node

Parameters:

org_id (int) – The organization id of this node
kill_list (list[ToBeKilled]) – A list of info about tasks that should be killed.

Returns:

List with information on killed tasks

Return type:

list[KilledRun]

kill_tasks(org_id, kill_list=None)#

Kill tasks currently running on this node.

Parameters:

org_id (int) – The organization id of this node
kill_list (list[ToBeKilled] (optional)) – A list of info on tasks that should be killed. If the list is not specified, all running algorithm containers will be killed.

Returns:

List of dictionaries with information on killed tasks

Return type:

list[KilledRun]
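
The difference between passing a kill list and omitting it can be shown with a simplified model. In this sketch, `RunningTask` and `kill_running_tasks` are hypothetical names, the kill list is reduced to plain run ids instead of ToBeKilled objects, and setting a flag stands in for stopping a container.

```python
from dataclasses import dataclass


@dataclass
class RunningTask:
    """Hypothetical record of an algorithm container on this node."""

    run_id: int
    task_id: int
    killed: bool = False


def kill_running_tasks(running, kill_list=None):
    """Kill the listed run ids, or every running task if no list is given.

    Returns the run ids that were actually killed. Run ids in
    ``kill_list`` that are not running on this node are ignored.
    """
    if kill_list is None:
        targets = list(running)
    else:
        wanted = set(kill_list)
        targets = [task for task in running if task.run_id in wanted]
    for task in targets:
        task.killed = True  # stands in for stopping the container
    return [task.run_id for task in targets]
```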

link_container_to_network(container_name, config_alias)#

Link a docker container to the isolated docker network

Parameters:

container_name (str) – Name of the docker container to be linked to the network
config_alias (str) – Alias of the docker container defined in the config file

Return type:

None

login_to_registries(registries=[])#

Parameters:: registries (list) – list of registries to login to
Return type:: None

run(run_id, task_info, image, docker_input, tmp_vol_name, token, databases_to_use)#

Checks if docker task is running. If not, creates DockerTaskManager to run the task

Parameters:

run_id (int) – Server run identifier
task_info (dict) – Dictionary with task information
image (str) – Docker image name
docker_input (bytes) – Input that can be read by docker container
tmp_vol_name (str) – Name of temporary docker volume assigned to the algorithm
token (str) – Bearer token that the container can use
databases_to_use (list[str]) – Labels of the databases to use

Returns:

Returns a tuple with the status of the task and a description of each port on the VPN client that forwards traffic to the algorithm container (None if VPN is not set up).

Return type:

TaskStatus, list[dict] | None

class Result(run_id: int, task_id: int, logs: str, data: str, status: str, parent_id: int | None)#

Data class to store the result of the docker image.

Variables:

run_id (int) – ID of the current algorithm run
logs (str) – Logs attached to current algorithm run
data (str) – Output data of the algorithm
status (str) – Status of the algorithm run

8.1.4. vantage6.node.docker.task_manager#

class DockerTaskManager(image, docker_client, vpn_manager, node_name, run_id, task_info, tasks_dir, isolated_network_mgr, databases, docker_volume_name, alpine_image=None, proxy=None, device_requests=None)#

Bases: DockerBaseManager

Manager for running a vantage6 algorithm container within docker.

Ensures that the environment is properly set up (docker volumes, directories, environment variables, etc.). Then runs the algorithm as a docker container. Finally, it monitors the container state and can return its results when the algorithm has finished.

cleanup()#

Cleanup the containers generated for this task

Return type:: None

get_results()#

Read results output file of the algorithm container

Returns:: Results of the algorithm container
Return type:: bytes

is_finished()#

Checks if algorithm container is finished

Returns:: True if algorithm container is finished
Return type:: bool

pull(local_exists)#

Pull the latest docker image.

Parameters:: local_exists (bool) – Whether the image already exists locally
Raises:: PermanentAlgorithmStartFail – If the image could not be pulled and does not exist locally
Return type:: None

report_status()#

Checks if algorithm has exited successfully. If not, it prints an error message

Returns:: logs – Log messages of the algorithm container
Return type:: str

run(docker_input, tmp_vol_name, token, algorithm_env, databases_to_use)#

Runs the docker-image in detached mode.

It attaches all mounts (input, output and datafile) to the docker image and supplies some environment variables.

Parameters:

docker_input (bytes) – Input that can be read by docker container
tmp_vol_name (str) – Name of temporary docker volume assigned to the algorithm
token (str) – Bearer token that the container can use
algorithm_env (dict) – Dictionary with additional environment variables to set
databases_to_use (list[str]) – List of labels of databases to use in the task

Returns:

Description of each port on the VPN client that forwards traffic to the algo container. None if VPN is not set up.

Return type:

list[dict] | None

8.1.5. vantage6.node.docker.vpn_manager#

class VPNManager(isolated_network_mgr, node_name, node_client, vpn_volume_name, vpn_subnet, alpine_image=None, vpn_client_image=None, network_config_image=None)#

Bases: DockerBaseManager

Setup a VPN client in a Docker container and configure the network so that the VPN container can forward traffic to and from algorithm containers.

connect_vpn()#

Start VPN client container and configure network to allow algorithm-to-algorithm communication

Return type:: None

exit_vpn(cleanup_host_rules=True)#

Gracefully shutdown the VPN and clean up

Parameters:: cleanup_host_rules (bool, optional) – Whether or not to clear host configuration rules. Should be True if they have been created at the time this function runs.
Return type:: None

forward_vpn_traffic(helper_container, algo_image_name)#

Setup rules so that traffic is properly forwarded between the VPN container and the algorithm container (and its helper container)

Parameters:

helper_container (Container) – Helper algorithm container
algo_image_name (str) – Name of algorithm image that is run

Returns:

Description of each port on the VPN client that forwards traffic to the algo container. None if VPN is not set up.

Return type:

list[dict] | None

get_vpn_ip()#

Get VPN IP address in VPN server namespace

Returns:: IP address assigned to VPN client container by VPN server
Return type:: str

has_connection()#

Return True if VPN connection is active

Returns:: True if VPN connection is active, False otherwise
Return type:: bool

static is_isolated_interface(ip_interface, vpn_ip_isolated_netw)#

Return True if a network interface is the isolated network interface. Identify this based on the IP address of the VPN client in the isolated network

Parameters:

ip_interface (dict) – IP interface obtained by executing the ip --json addr command
vpn_ip_isolated_netw (str) – IP address of VPN container in isolated network

Returns:

True if this is the interface describing the isolated network

Return type:

bool
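
To make the matching rule concrete, the sketch below checks whether an interface from `ip --json addr` carries a given IP address. The sample output is abbreviated (real output contains more fields per interface), the function name `interface_has_ip` and the addresses are hypothetical, and the real static method may differ in detail.

```python
import json

# Abbreviated sample of `ip --json addr` output
SAMPLE_OUTPUT = """
[
  {"ifname": "lo",
   "addr_info": [{"family": "inet", "local": "127.0.0.1", "prefixlen": 8}]},
  {"ifname": "eth0",
   "addr_info": [{"family": "inet", "local": "172.18.0.3", "prefixlen": 16}]}
]
"""


def interface_has_ip(ip_interface, ip_address):
    """Return True if any address on the interface equals ``ip_address``."""
    return any(
        addr.get("local") == ip_address
        for addr in ip_interface.get("addr_info", [])
    )


interfaces = json.loads(SAMPLE_OUTPUT)
# Select the interface that holds the (hypothetical) VPN client address
isolated = [i["ifname"] for i in interfaces if interface_has_ip(i, "172.18.0.3")]
print(isolated)
```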

send_vpn_ip_to_server()#

Send VPN IP address to the server

Return type:: None

8.1.6. vantage6.node.docker.exceptions#

Below are some custom exception types that are raised when algorithms cannot be executed successfully.

exception AlgorithmContainerNotFound#: Algorithm container was lost. Potentially running it again would resolve the issue.

exception PermanentAlgorithmStartFail#: Algorithm failed to start and should not be attempted to be started again.

exception UnknownAlgorithmStartFail#: Algorithm failed to start due to an unknown reason. Potentially running it again would resolve the issue.
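
The descriptions above imply a simple rule for whether a failed start is worth repeating. The sketch below expresses that rule with local stand-in classes that only share their names with the vantage6 exceptions; the helper `should_retry` is hypothetical and not part of the module.

```python
class AlgorithmContainerNotFound(Exception):
    """Stand-in: the algorithm container was lost."""


class PermanentAlgorithmStartFail(Exception):
    """Stand-in: the algorithm failed to start and should not be retried."""


class UnknownAlgorithmStartFail(Exception):
    """Stand-in: the algorithm failed to start for an unknown reason."""


def should_retry(error):
    """Return True for failures where running the task again might help."""
    return isinstance(
        error, (AlgorithmContainerNotFound, UnknownAlgorithmStartFail)
    )
```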

8.1.7. vantage6.node.proxy_server#

This module contains a proxy server implementation that the node uses to communicate with the server. It contains general methods for any routes, and methods to handle tasks and results, including their encryption and decryption.

(!) Not to be confused with the squid proxy that allows algorithm containers to access other places in the network.

decrypt_result(run)#

Decrypt the result from a run dictionary

Parameters:: run (dict) – Run dict
Returns:: Run dict with the result decrypted
Return type:: dict

get_method(method)#

Obtain http method based on string identifier

Parameters:: method (str) – Http method requested
Returns:: HTTP method
Return type:: function

get_response_json_and_handle_exceptions(response)#

Obtain json content from request response

Parameters:: response (requests.Response) – Requests response object
Returns:: Dict containing the json body
Return type:: dict | None

make_proxied_request(endpoint)#

Helper to create proxied requests to the central server.

Parameters:: endpoint (str) – endpoint to be reached at the vantage6 server
Returns:: Response from the vantage6 server
Return type:: requests.Response

make_request(method, endpoint, json=None, params=None, headers=None)#

Make request to the central server

Parameters:

method (str) – HTTP method to be used
endpoint (str) – endpoint of the vantage6 server
json (dict, optional) – JSON body
params (dict, optional) – HTTP parameters
headers (dict, optional) – HTTP headers

Returns:

Response from the vantage6 server

Return type:

requests.Response

proxy(central_server_path)#

Generalized http proxy request

Parameters:: central_server_path (str) – The endpoint on the server to be reached
Returns:: Contains the server response
Return type:: requests.Response

proxy_result()#

Obtain and decrypt all results that belong to a certain task

Parameters:: id (int) – Task id from which the results need to be obtained
Returns:: Response from the vantage6 server
Return type:: requests.Response

proxy_results(id_)#

Obtain and decrypt the algorithm result from the vantage6 server to be used by an algorithm container.

Parameters:: id_ (int) – Id of the result to be obtained
Returns:: Response of the vantage6 server
Return type:: requests.Response

proxy_task()#

Proxy to create tasks at the vantage6 server

Returns:: Response from the vantage6 server
Return type:: requests.Response

8.1.8. vantage6.node.cli.node#

This contains the vnode-local commands. These commands are similar to the v6 node CLI commands, but they start up the node outside of a Docker container, and are mostly intended for development purposes.

Some commands, such as vnode-local start, are used within the Docker container when v6 node start is used.

vnode-local#

Command vnode-local.

vnode-local [OPTIONS] COMMAND [ARGS]...

files#

Print out the paths of important files.

If the specified configuration cannot be found, it exits. Otherwise it returns the absolute path to the output.

vnode-local files [OPTIONS]

Options

-n, --name <name>#: Configuration name

--system#: Use configuration from system folders (default)

--user#: Use configuration from user folders

list#

Lists all nodes in the default configuration directories.

vnode-local list [OPTIONS]

new#

Create a new configuration file.

Checks if the configuration already exists. If this is not the case, a questionnaire is invoked to create a new configuration file.

vnode-local new [OPTIONS]

Options

-n, --name <name>#: Configuration name

--system#: Use configuration from system folders (default)

--user#: Use configuration from user folders

start#

Start the node instance.

If no name or config is specified, the default.yaml configuration is used. If the configuration file does not exist, a questionnaire is invoked to create one.

vnode-local start [OPTIONS]

Options

-n, --name <name>#: Configuration name

-c, --config <config>#: Absolute path to configuration-file; overrides “name”

--system#: Use configuration from system folders (default)

--user#: Use configuration from user folders

--dockerized, -non-dockerized#: Whether to use DockerNodeContext or regular NodeContext (default)

version#

Returns current version of vantage6 services installed.

vnode-local version [OPTIONS]