Overview

Here is a step-by-step guide for analysts to follow when moving through a Senate Matching Project.

Prerequisites

  1. You have installed your Contributor Node
  2. You have received your Senate login
  3. You have reset your password upon first login

There are 3 main work streams that occur in Senate Matching Projects:

  1. Tokenization: where PII is prepared and tokens are generated to replace PII (using the Contributor Node) 
  2. Data preparation: where data is uploaded and prepared for projects (using the Manage Data menu), and 
  3. Project creation (and management): where organizations can collaborate and manage their data sharing (using the Project menu)

Note:

  • Tokenization must be completed before Data Preparation.
  • Each organization can start Data Preparation independently of Project Creation.
  • Project Creation (and management) is facilitated by the organization that provides the Workspace.

Tokenization Work Stream:

In this work stream, analysts prepare their PII data internally.

0. Data Preparation (completed in each contributing organization's environment):

  • Refer to your Project Manager for more information on what data needs to be prepared. 
  • Make sure data is cleaned (e.g. remove duplicate rows, remove rows with null values, standardize field formatting) so that the match is as accurate as possible.
  • Data should be prepared according to what has been agreed between organizations. 
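
The cleaning steps above can be sketched in Python. This is a minimal illustration using only the standard library, not a prescribed Senate procedure: the column names, the sample rows, and the normalization rule (trim whitespace, lowercase) are assumptions, and whatever format the organizations have agreed on takes priority.

```python
import csv
import io


def clean_rows(rows, required_fields):
    """Return rows with nulls removed, formatting standardized, duplicates dropped."""
    seen = set()
    cleaned = []
    for row in rows:
        # Standardize formatting: trim whitespace and lowercase every value.
        # (Assumed rule; replace with the format agreed between organizations.)
        normalized = {key: (value or "").strip().lower() for key, value in row.items()}
        # Remove rows where any required field is empty.
        if any(not normalized.get(field) for field in required_fields):
            continue
        # Drop duplicates, comparing rows after normalization.
        fingerprint = tuple(sorted(normalized.items()))
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append(normalized)
    return cleaned


# Hypothetical sample data; the column names are illustrative only.
raw_csv = """email,first_name,last_name
Jane.Doe@Example.com ,Jane,Doe
jane.doe@example.com,jane,doe
,John,Smith
sam.lee@example.com,Sam,Lee
"""

rows = list(csv.DictReader(io.StringIO(raw_csv)))
cleaned = clean_rows(rows, required_fields=["email", "first_name", "last_name"])
for row in cleaned:
    print(row)
```

Normalizing before de-duplicating matters here: the first two sample rows differ only in case and trailing whitespace, so they are treated as one record.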

1. Load PII into the Contributor Node (CN): We recommend using the API for upload.

2. Download tokens from the CN: Click the button to download the tokens and Person IDs into your environment.

3. Prepare your tokens for Senate: Remember to remove personid from the table you upload to Senate. 

  • Note: If you have attribute data for your project, first append your tokens to your attribute table (using personid as the join key) so that deeper analysis is possible later. Remove personid from the joined table before uploading it.
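
Step 3 can be sketched as follows. This is an illustration under stated assumptions rather than the format the Contributor Node actually produces: the `token`, `plan`, and `tenure_months` column names and the sample values are hypothetical, and skipping attribute rows that have no token is an assumed policy.

```python
def build_upload_rows(token_rows, attribute_rows, key="personid"):
    """Join tokens onto attribute rows by key, then drop the key column."""
    tokens_by_person = {row[key]: row for row in token_rows}
    upload_rows = []
    for attributes in attribute_rows:
        token_row = tokens_by_person.get(attributes[key])
        if token_row is None:
            # Assumed policy: skip people for whom no token was downloaded.
            continue
        merged = {**attributes, **token_row}
        # personid is only the join key; it must not be uploaded to Senate.
        merged.pop(key)
        upload_rows.append(merged)
    return upload_rows


# Hypothetical data; real token files may use different column names.
token_rows = [
    {"personid": "p1", "token": "tok-aaa"},
    {"personid": "p2", "token": "tok-bbb"},
]
attribute_rows = [
    {"personid": "p1", "plan": "gold", "tenure_months": "18"},
    {"personid": "p2", "plan": "basic", "tenure_months": "3"},
    {"personid": "p3", "plan": "basic", "tenure_months": "7"},
]

upload_rows = build_upload_rows(token_rows, attribute_rows)
for row in upload_rows:
    print(row)
```

Before uploading, it is worth checking that no row in the result still carries a personid column.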

Data Preparation Work Stream:

In this work stream, each organization focuses on loading its data into Senate and packaging data for Projects.

0. Data Preparation (from the Tokenization work stream):

  • Refer to your Project Manager for more information on what data needs to be prepared. 
  • Make sure data is cleaned (e.g. remove duplicate rows, remove rows with null values, standardize field formatting) so that analysts can begin analyzing the data as soon as possible.
  • Data should be prepared according to what has been agreed between organizations. 

1. Upload Files: Drag and drop files smaller than 100MB, or use SFTP for larger files.
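
If you prepare many files, the choice of upload route can be checked ahead of time. The sketch below assumes that "100MB" means 100 * 1024 * 1024 bytes and that a file of exactly that size should go via SFTP. The cutoff Senate actually applies may differ, and the function name is hypothetical.

```python
import os

# Assumption: "100MB" is read as 100 * 1024 * 1024 bytes.
DRAG_AND_DROP_LIMIT_BYTES = 100 * 1024 * 1024


def choose_upload_method(size_bytes, limit_bytes=DRAG_AND_DROP_LIMIT_BYTES):
    """Return "drag-and-drop" for sizes under the limit, otherwise "sftp"."""
    return "drag-and-drop" if size_bytes < limit_bytes else "sftp"


def upload_method_for_file(path):
    """Look up the size of a file on disk and pick the upload route for it."""
    return choose_upload_method(os.path.getsize(path))


print(choose_upload_method(5 * 1024 * 1024))
print(choose_upload_method(250 * 1024 * 1024))
```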

2. Create Database: If you would like to query structured data (using View Builder) or allow approved analysts to query your data in a Redshift database, you will need to create a database.

3. Create Table and Load Data: Once you have a database ready, you can create tables for your data, using the databases to group similar tables together.

4. Create a View of your Data in View Builder (optional): Use this step when you would like to share only a subset of your data for specific projects.

5. Create a Package: Specify the tables and files that you would like to contribute to a Project. A package contains metadata only (i.e. file/table names).

Project Creation Work Stream:

This is where you collaborate with other organizations for a Project.

1. Add People: Add people who are needed for the Project (who have Senate user accounts).

2. Have Conversations: Use Conversations to communicate with everyone in the Project, or create subgroups (e.g. your organization and Data Republic) to keep all communications in one place.

3. Select a Legal Framework: Participating organizations can use Data Republic's Common Legal Framework for data exchange (if it has been executed by the parties involved), or create and select their own legal framework to use in Projects (if an organization has signed the Third Party Module).

4. Add Packages (from the Data Preparation work stream): If you will be contributing data to a Project, you will need to add Packages to the Project (to specify which files and/or tables you will be sharing).

5. Submit the Data License: A Data License should be drafted while the data is being prepared, as organizations will need to allocate enough time to discuss, negotiate, and agree on terms for the permitted use of data, and to approve the Data License.

6. Request a Workspace and Request a Match: Once the Data License is approved, a Workspace can be created, people can be added, and the match can be triggered.

Data Analysis and Project Completion

Once data analysis is underway in the Project, data can be explored and outputs can be created if the Data License allows for this. Upon project completion, the Workspace will be terminated.

1. Analyze Data and Request Outputs (if allowed by the Data License):

  • Output extract requests will be reviewed by the Senate Platform Administrator for approval and will be rejected if they do not align with the Data License terms.
  • If you need to re-identify tokens off-platform, please contact your Data Republic Project Lead, who will provide you with unmasked tokens.

2. Terminate Workspace: Once the Project is complete and permitted outputs have been extracted, the Workspace can be terminated and all of the data inside it deleted.

Related Articles:

Overview of Senate Matching Projects
