When using Privacy-Preserving Matching (formerly known as Senate Matching), your organization will need to install an application called a Contributor Node in your environment. This article guides you through the installation and configuration process.

Prerequisites

Select a target infrastructure

Before you start setting up your Contributor Node, decide where you'd like to install it. Most importantly, decide whether you'd like an on-premises or cloud-hosted Linux server.

This is ultimately your decision; however, the following questions may help:

  • Do we currently have our own internet-connected server available to install software on?

  • Does that server support the necessary operating system, software and specs required to run Privacy-Preserving Matching?

  • Do we currently have a cloud provider available? What is the policy on what data is permitted to be sent to the cloud provider, and what conditions apply?

  • Will our IT policies and process make it easier to stand up Privacy-Preserving Matching in an on-premises or cloud-hosted environment?

It's important to note that, whether you run your node on-premises or in the cloud, Privacy-Preserving Matching has been designed so that PII never leaves your organization. Even in a cloud-hosted configuration, plain-text PII is never sent outside your IT environment. All PII is hashed before being sent to the Privacy-Preserving Matching network outside your organization (and is also encrypted in transit via SSL).

Prepare the server

First, ensure that your Linux server is ready and that it meets the necessary operating system, software and hardware specifications.
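When comparing a server against the requirements, it can help to collect its basic facts in one place. The following is a minimal sketch using standard Linux tools; the function name report_system is illustrative and not part of the Contributor Node package, and the values to compare against come from the system requirements, not from this snippet.

```shell
#!/bin/sh
# Illustrative helper (not part of the Contributor Node package):
# print basic facts about this server for comparison with the requirements.
report_system() {
  echo "Kernel:       $(uname -sr)"
  echo "Architecture: $(uname -m)"
  # /etc/os-release is present on most modern Linux distributions.
  if [ -r /etc/os-release ]; then
    echo "Distribution: $(. /etc/os-release && echo "$PRETTY_NAME")"
  fi
  # getconf reports the number of online CPUs on Linux.
  echo "CPUs:         $(getconf _NPROCESSORS_ONLN 2>/dev/null || echo unknown)"
  if [ -r /proc/meminfo ]; then
    echo "Memory:       $(awk '/^MemTotal:/ {printf "%d MiB", $2/1024}' /proc/meminfo)"
  fi
}

report_system
```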

The next step is to make sure Docker and Docker Compose are properly installed and configured (see below).

Note that even in Linux distributions where Docker is included as standard, Docker Compose usually requires an additional package installation.

Install Docker and Docker Compose

  • During this step, you'll install Docker in your environment, which will manage and run a set of containers for your Contributor Node.

  • Data Republic support will provide you with the following script, which will install the necessary version of Docker and other dependencies.

  • We recommend creating a Unix group called docker and adding the relevant users to it after you have finished installing Docker and Docker Compose. This avoids potential permission-related errors. You can find more details here.
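The docker group recommendation can be checked with a short script. This is a sketch, not part of the support-provided scripts: the helper name user_in_group is hypothetical, and the groupadd and usermod commands it prints are the standard Linux ones, which need root privileges to run.

```shell
#!/bin/sh
# Return 0 if user $1 is a member of group $2, non-zero otherwise.
user_in_group() {
  id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qx "$2"
}

me=$(id -un)
if user_in_group "$me" docker; then
  echo "$me is already in the docker group"
else
  echo "$me is not in the docker group. To add it, run:"
  echo "  sudo groupadd -f docker"
  echo "  sudo usermod -aG docker $me"
  echo "then log out and back in so the new group membership takes effect."
fi
```

The script only reports and prints the commands rather than running them, so it is safe to run as an unprivileged user.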

Instructions for Ubuntu

prepare-contributor-ubuntu.sh

#!/usr/bin/env bash 

# The following script installs docker and docker compose, and makes the necessary permission
# changes to execute within the docker compose binary folder

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
sudo apt-get update
apt-cache policy docker-ce
sudo apt-get install -y docker-ce=18.03.1~ce-0~ubuntu
sudo systemctl status docker

# Download and install docker-compose
sudo curl -L https://github.com/docker/compose/releases/download/1.24.1/docker-compose-$(uname -s)-$(uname -m) -o /usr/local/bin/docker-compose
sudo chmod 555 /usr/local/bin/docker-compose

Instructions for Red Hat

prepare-contributor-redhat.sh

# Installation For DR Contributor Nodes. 
# required Dependencies.

sudo yum -y update

# Docker Repo
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo

# Install and enable ce
sudo yum -y install docker-ce

sudo systemctl start docker

sudo systemctl enable docker

# Download and install docker-compose
sudo curl -L https://github.com/docker/compose/releases/download/1.24.1/docker-compose-$(uname -s)-$(uname -m) -o /usr/local/bin/docker-compose
sudo chmod 555 /usr/local/bin/docker-compose

Instructions for Amazon Linux 2

prepare-contributor-amazonlinux2.sh

# Installation For DR Contributor Nodes. 
# See also https://docs.aws.amazon.com/AmazonECS/latest/developerguide/docker-basics.html

sudo yum -y update

# Docker Repo
sudo amazon-linux-extras install docker

# Start docker and allow a non-root user. Change "ec2-user" below to the account that will run the Contributor Node.
sudo systemctl enable docker
sudo service docker start
sudo usermod -a -G docker ec2-user

# You may need to log out and log back in to get updated group permissions

# Download and install docker-compose
sudo curl -L https://github.com/docker/compose/releases/download/1.24.1/docker-compose-$(uname -s)-$(uname -m) -o /usr/local/bin/docker-compose
sudo chmod 555 /usr/local/bin/docker-compose

# Note that AL2 does not have /usr/local/bin in the secure path for sudo by default.
# If you install docker-compose in /usr/local/bin, add it to the "secure_path" defaults in /etc/sudoers
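After running the script for your distribution, you may want to confirm that both tools are on your PATH before continuing. The following is a minimal sketch; the helper name have_cmd is illustrative. The version strings it prints can be compared with the versions the scripts above install (for example docker-compose 1.24.1).

```shell
#!/bin/sh
# Return 0 if the named command is available on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

for tool in docker docker-compose; do
  if have_cmd "$tool"; then
    # Both tools accept --version and print a one-line version string.
    "$tool" --version
  else
    echo "$tool: not found on PATH" >&2
  fi
done
```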

Check Ports & Network Access

See System requirements for Privacy-Preserving Matching for the network ports and hostnames that your Contributor Node will need to initiate connections to.

See also Proxy Support (below) if you'd prefer your node to connect via an HTTPS proxy.
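A quick way to test outbound access is to try opening a TCP connection to each required host and port. The sketch below assumes bash and the coreutils timeout command are installed; the function name can_connect and the script name check-ports.sh are hypothetical. The hostname in the usage comment is taken from the example contributor.sh in the References section; use the hosts and ports listed in the system requirements for your region.

```shell
#!/bin/sh
# Return 0 if a TCP connection to host $1 on port $2 opens within 5 seconds.
# Relies on bash's /dev/tcp pseudo-device and the coreutils `timeout` command.
can_connect() {
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Check each host:port pair given on the command line, for example:
#   sh check-ports.sh vault.hitch.prod-au.datarepublic.io:8200
for target in "$@"; do
  host=${target%:*}
  port=${target##*:}
  if can_connect "$host" "$port"; then
    echo "OK          $target"
  else
    echo "UNREACHABLE $target"
  fi
done
```

This tests direct connections only. If your node will connect through a proxy, a direct check may fail even though the node itself would work.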

Configure Contributor Node

When you are ready to proceed, contact support@datarepublic.com and request a Contributor Node installation package.

Data Republic support will provide you with a pre-configured package, which has all necessary environment variables and docker configuration preset. Place all files from the package in the same directory.

The structure of the package is similar to below:

├── certs
│   ├── cert.pem
│   └── key.pem
├── contributor.sh
├── docker-compose.yaml
├── fluentd.conf
├── fluentd.crt
├── tls.crt
└── wait-for-it.sh

  • The Contributor Node installation package comes with sensible default values for all fields to minimize your configuration effort. However, there are two fields for which we recommend using your own settings. There are also a few other fields you can customize to suit your internal requirements and IT policies. You can find details below.

  • Most of the customization can be done by editing and saving contributor.sh (an example can be found in the References section below). This script defines all necessary environment variables and starts all Docker containers required by the Contributor Node. Make it executable if it is not already.

  • Detailed inline comments are provided as a guide. Please read these comments carefully and please refrain from modifying any lines after the comment "These are supplied by DR. Do not change."

Change your Contributor Node UI password/API key - Required

  • This is the password used to access the Contributor Node Web UI and API. It is defined by the variable HITCH_BASICAUTHPASSWORD in contributor.sh.

  • A randomly generated UUID is used by default. This is a required field, so it cannot be left empty. We highly recommend replacing the default with a strong password owned by your organization.

  • Your Contributor Node also supports Single Sign-on.
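If you do not have a password manager to generate the value, a random string can be produced on the server itself. This is one possible approach, sketched with standard tools; the function name gen_password is illustrative, and your organization's password policy takes precedence over the length chosen here.

```shell
#!/bin/sh
# Print a random hexadecimal string built from $1 random bytes (default 24),
# read from the kernel's random source. 24 bytes give 48 hex characters.
gen_password() {
  head -c "${1:-24}" /dev/urandom | od -An -tx1 | tr -d ' \n'
}

new_password=$(gen_password 24)
echo "Generated a password of ${#new_password} characters."
# Paste the value into contributor.sh, replacing the default:
#   export HITCH_BASICAUTHPASSWORD=<generated value>
```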

Use your own Nginx TLS server certificate (Recommended)

  • The two files in the certs folder, cert.pem and key.pem, are the TLS server certificate and its private key. They are used by Nginx, which serves the Contributor Node Web UI.

  • The default pair of certificate and key shipped with the package is ready for use.

  • We recommend replacing them with your own certificate and key pair. You can purchase a server certificate from a trusted certificate authority (CA), or you can create your own internal CA with OpenSSL and generate your own certificate.
    The server certificate and its private key should be saved in PEM format and named cert.pem and key.pem respectively.
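For a test environment, a self-signed pair in the expected PEM format can be generated with the openssl command-line tool. This sketch is a simplification of the internal-CA approach described above, not a replacement for it: browsers will warn about a self-signed certificate, and the hostname contributor.internal.example and the function name make_self_signed are placeholders.

```shell
#!/bin/sh
# Generate a self-signed certificate and private key in PEM format.
# $1 = output directory, $2 = hostname to place in the certificate subject.
make_self_signed() {
  mkdir -p "$1" &&
  openssl req -x509 -newkey rsa:2048 -nodes \
    -keyout "$1/key.pem" -out "$1/cert.pem" \
    -days 365 -subj "/CN=$2" 2>/dev/null
}

# Write to a scratch directory so the packaged certs are not overwritten.
scratch=$(mktemp -d)
if make_self_signed "$scratch" contributor.internal.example; then
  echo "Wrote $scratch/cert.pem and $scratch/key.pem"
else
  echo "openssl failed or is not installed" >&2
fi
```

Once you are satisfied with the generated files, copy them into the certs folder of the package under the names cert.pem and key.pem.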

Configure outbound HTTP proxy

  • Optionally, you can configure your Contributor Node to use an HTTPS Proxy for its outbound connections. This is done by setting the environment variable HTTPS_PROXY in the contributor.sh script (see below) to the proxy endpoint.

  • Occasionally, you may also need to configure the Contributor Node to use HTTP/1.1 instead of HTTP/2 if your proxy service does not support HTTP/2. The Contributor Node uses HTTP/2 by default when talking to the Matcher Node Network because Privacy-Preserving Matching uses gRPC, which depends on HTTP/2. To fall back to HTTP/1.1, set the variable HITCH_HTTP1COMPATIBILITYMODE to true.

Example

# Set proxy to use for outbound connections. Set the
# protocol to "https" and your internal hostname or IP address for your
# proxy. Ports can be set with ":portno" after the host.
# Using https:// scheme in the URL will imply that the client is expecting a client-hello certificate response: the proxy is supposed to do SSL proxying.
# If the proxy does not intend to do SSL Proxying, one can just drop the protocol or simply use http://.

export HTTPS_PROXY=https://proxy.internal.organization:4430

# If your proxy only supports HTTP 1 then you can force the Contributor Node
# to fall back to HTTP 1 instead of the default HTTP 2.

export HITCH_HTTP1COMPATIBILITYMODE=true

By default, there is no proxy between Contributor Node and the external part of the Privacy-Preserving Matching system.

Fallback to HTTP 1

The Contributor Node uses HTTP/2 by default. Where HTTP/2 is not available in your network (e.g. a legacy proxy service that supports HTTP/1.1 only), you can set HITCH_HTTP1COMPATIBILITYMODE to true in contributor.sh to force your node to fall back to HTTP/1.1.

Allow non-hashed records via API

  • Your Contributor Node has a REST API. By default, the API only accepts pre-hashed data. This is the recommended configuration.

  • However, if you want to upload plain-text data via the API, you must set HITCH_HASHEDRECORDSONLY to false.

  • Note that when uploading data with the Web UI, the data is always hashed in the browser before being sent to the API.
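In contributor.sh, the change described above amounts to a single line. The fragment below is a sketch of that edit; it assumes you only want plain-text uploads temporarily, for example while testing with synthetic data.

```shell
# In contributor.sh: allow plain-text (non-hashed) records via the API.
# The packaged default is "true", which is the recommended setting.
export HITCH_HASHEDRECORDSONLY=false
```

Set the value back to true once testing is complete so that the API only accepts pre-hashed data again.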

Enable SSO with SAML

  • From release 1.7.0 onwards, your Contributor Node can be configured to support single sign-on (SSO) using SAML 2.0. This feature is currently in beta.

  • You can find a step-by-step guide here.

  • By default, SSO is not enabled and you authenticate with a username and password. The default username is api and the password is defined by the variable HITCH_BASICAUTHPASSWORD in contributor.sh.

Start Contributor Node

Now you are ready to start your Contributor Node. To do so, run the following command from where the files are located:

Run Contributor Node

sh contributor.sh up -d

If the current user is not in the Linux group named docker, you may run into permission-related issues. Either add your user to that group and try again, or run the same command with sudo.

If this is the first time you run your Contributor Node, several Docker images will be downloaded, which may take a while.

When the command returns, all Docker containers required by the Contributor Node should be running.
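Because contributor.sh passes its arguments straight through to docker-compose (see the example in the References section), the same script can also be used to inspect and stop the node. The wrapper below is a convenience sketch; the function name cnode is hypothetical, and the subcommands shown in the comments are standard docker-compose ones.

```shell
#!/bin/sh
# Pass any docker-compose subcommand through contributor.sh.
# Must be called from the directory that holds the package files.
cnode() {
  if [ ! -f contributor.sh ]; then
    echo "contributor.sh not found in $(pwd)" >&2
    return 1
  fi
  sh contributor.sh "$@"
}

# Typical calls:
#   cnode up -d      # start the node in the background
#   cnode ps         # list the node's containers
#   cnode logs -f    # follow the container logs
#   cnode down       # stop and remove the containers
```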

Verify the installation

With the Contributor Node running, you can verify the installation.

Check all docker containers are running

All docker containers should now be in status 'running'. You can run the following command to verify that:

docker ps | grep contributor

You should see 5 running containers.
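The container check can also be scripted, for example for use in a monitoring job. The sketch below counts the lines of docker ps output that mention "contributor" and compares the result with the expected 5; the function name count_contributor is illustrative, and the expected count may differ if Data Republic support has changed your docker-compose.yaml.

```shell
#!/bin/sh
EXPECTED=5

# Read `docker ps` style output on stdin and print the number of lines
# that mention "contributor". Prints 0 when nothing matches.
count_contributor() {
  grep -c contributor || true
}

if command -v docker >/dev/null 2>&1; then
  running=$(docker ps 2>/dev/null | count_contributor)
  if [ "$running" -eq "$EXPECTED" ]; then
    echo "OK: $running contributor containers running"
  else
    echo "WARNING: expected $EXPECTED contributor containers, found $running"
  fi
else
  echo "docker not found on PATH" >&2
fi
```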

Check Web UI is running

  • You can then access your Contributor Node Web UI using https://<node's IP address or internal hostname>:<contributor ui port>/

  • The Contributor Node UI listens on port 9057 by default. If you have changed it to a custom port, use that instead.

  • If SSO is enabled → you will be redirected to the login page of your identity provider (IdP)

  • If SSO is not enabled → you will be authenticated with a username and password. The default username is api and the password is defined by the variable HITCH_BASICAUTHPASSWORD in contributor.sh.
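The Web UI can also be probed from the command line on the server itself. This sketch makes two assumptions that the article does not confirm: that the UI accepts HTTP basic authentication for the api user, and that certificate verification should be skipped (-k) because the packaged certificate may not be trusted by your system. The function name ui_status is hypothetical.

```shell
#!/bin/sh
# Print the HTTP status code returned by the Web UI, or 000 if no response.
# $1 = host, $2 = port, $3 = password for the default "api" user.
ui_status() {
  curl -k -s -o /dev/null --max-time 5 -w '%{http_code}' \
    -u "api:$3" "https://$1:$2/" || true
}

# Uses HITCH_UI_PORT if it is set in the environment, otherwise 9057.
code=$(ui_status localhost "${HITCH_UI_PORT:-9057}" "${HITCH_BASICAUTHPASSWORD:-}")
echo "Web UI responded with HTTP status: $code"
```

A status of 000 means nothing answered on that port, which usually points to a stopped container, a different port setting or a firewall rule.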

References

Example of contributor.sh

contributor.sh example

#!/usr/bin/env bash 
#
# This script exports the necessary environment variables for running a Senate
# Matching Contributor Node.
#
# Confidential information prepared for [TODO:COMPANY_NAME]. Internal use only.
#
# You may change any of these variables according to your IT policies
# You set this to your access password for this node. Initial value will be
# randomly generated by DR and should be changed.
#
export HITCH_BASICAUTHPASSWORD=XXX

# You may set this to your internal proxy service. Your contributor node will then use
# this to communicate with the rest of the Senate Matching network
#
export HTTPS_PROXY=
export HTTP_PROXY=

# DR recommends that you configure your node to only accept pre-hashed data.
# However during test it might be convenient to use plain-text synthetic data.
# Comment this line out if you want to be able to send non-hashed data via
# the API. Note: The browser UI interface always hashes data before calling
# APIs.
#
export HITCH_HASHEDRECORDSONLY=true

# Uncomment this line if your HTTP proxy does NOT support HTTP2, and you need
# to force your node to communicate over HTTP/1.1. HTTP2 is recommended if
# possible.
#
#export HITCH_HTTP1COMPATIBILITYMODE=true

# You may change these from the defaults, but generally won't need to when
# first installing.

# These variables specify the Docker image tag to be installed. DR will advise
# if they need updating and whether updates include important security fixes.
#
export HITCH_DOCKER_IMAGE_TAG=1.7.1
export HITCH_UI_DOCKER_IMAGE_TAG=1.7.1

# This is the name the Docker container will use to refer to itself. Can
# change to an internal hostname as required.
#
export HITCH_LOCALSERVERCOMMONNAME=hitch-contributor-XXX

# You set this for the node to configure its internal MySQL and Redis DB.
# Defaults are randomly generated by DR.
#
export HITCH_DB_PASSWORD=XXXX
export HITCH_DB_ROOT_PASSWORD=XXXX
export HITCH_REDISPASSWORD=XXXX

# HITCH_PORT is the port the API will listen on for HTTPS requests.
# HITCH_UI_PORT is the port the browser should connect to if you want to use
# the browser UI.
#
export HITCH_PORT=9054
export HITCH_UI_PORT=9055

# FluentD is a logging service. If enabled, some error and reporting
# logs are forwarded to DR for monitoring and support. This is optional
# and requires TCP port 24224 to be opened outbound to the host below.
#
export FLUENT_SHARED_KEY=XXXXXX

# These are supplied by DR. They are specific for your organisation.
# Do not change.

# KMS token and node ID
export HITCH_KMSTOKEN=XXXX
export HITCH_POLICYSERVICETOKEN=XXXX
export HITCH_LOCALNODEID=XXXX
export HITCH_DOCKER_REGISTRY=registry.fpims.datarepublic.com.au

# Senate Matching region configuration
export HITCH_KMSADDRESS=https://vault.hitch.prod-au.datarepublic.io:8200
export HITCH_LOGFORMAT=json
export HITCH_POLICYSERVICEADDRESS=https://consul.hitch.prod-au.datarepublic.io:4430
export JAEGER_AGENT_HOST_PORT=
export JAEGER_ENABLED=false
export FLUENT_PROXY_ADDRESS=logs-dr.ops-au.datarepublic.io

echo "XXXX" | docker login -u XXXX --password-stdin registry.fpims.datarepublic.com.au

exec docker-compose -f docker-compose.yaml -p contributor "$@"

Other files in the package

You would not normally need to change the other files unless advised by Data Republic support.

Here is a brief description of what they are used for:

  • docker-compose.yaml: defines all Docker containers required by the Contributor Node (e.g. API, UI, database).

  • fluentd.conf: the configuration file for the Fluentd agent.

  • fluentd.crt: a certificate used by the Fluentd agent to validate its server.

  • tls.crt: a certificate used by Contributor Node to validate Privacy-Preserving Matching system. Do not modify or delete.

  • wait-for-it.sh: a utility script used by some containers to make sure a dependency is ready.
