Once a Workspace virtual machine (VM) has been approved and is in a 'Running' state, you may request to load data to the Workspace. Anyone added to the project can make a data load request.
In this article you will learn how to:
- Load data packages to a workspace
- Load a token match table to a workspace (for Privacy Preserving Matching projects)
Additionally, there is an FAQ if you have further questions.
Before you begin, make sure that:
- you have created a project
- you have added people to your project
- a data package has been added to your project
- your license has been approved by all parties
- your workspace has been created and approved
1. Load data packages to your workspace
Note: All package load requests to a workspace are tracked and versioned for audit purposes.
To request a package load to a workspace:
- In the Workspaces tab, click on a workspace name to view workspace details.
- Click Load data packages to workspace
If this is your first data package load request and there is no data in the workspace, a list of approved data packages from your license is displayed:
- Select the checkbox for each package you want to load, and clear the checkbox for any package that has already been loaded. A previously loaded package that stays selected will be loaded again.
- Click add packages
- Review your request and click Submit.
If you have existing data in the workspace and need to load a new package:
1. Request a new load by clicking on Create Version. The load request is versioned for audit purposes.
2. Check the box next to the packages you want to load:
- You can re-load the same package or select a new package to load.
- Loading of new packages: If you load a new package, this will not remove any existing data in the workspace.
- Re-loading the same package(s): Any tables in the package replace existing tables of the same name in the workspace (the data in the table is replaced, not appended). Any files in the package are copied to a new folder in the local directory of the VM.
3. Click Update to confirm your selection
4. Review your request and click Submit
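The load behaviour described in step 2 can be sketched as a small model. This is only an illustration of the replace-versus-copy rules, not the platform's implementation: the `Package` and `Workspace` classes, the package names, and the folder naming scheme are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    name: str
    tables: dict = field(default_factory=dict)  # table name -> list of rows
    files: list = field(default_factory=list)   # file names


@dataclass
class Workspace:
    tables: dict = field(default_factory=dict)   # table name -> list of rows
    folders: dict = field(default_factory=dict)  # folder name -> list of file names
    load_count: int = 0

    def load(self, package: Package) -> None:
        """Apply one approved load request to this toy workspace."""
        self.load_count += 1
        # Tables: a table with the same name is replaced outright, never appended to.
        for table_name, rows in package.tables.items():
            self.tables[table_name] = list(rows)
        # Files: each load request copies the files into a new folder.
        if package.files:
            folder = f"{package.name}_load_{self.load_count}"  # hypothetical naming
            self.folders[folder] = list(package.files)


ws = Workspace()
ws.load(Package("sales", tables={"orders": [1, 2, 3]}, files=["readme.pdf"]))
ws.load(Package("sales", tables={"orders": [4, 5]}, files=["readme.pdf"]))  # re-load
ws.load(Package("crm", tables={"customers": ["a"]}))                        # new package

print(ws.tables["orders"])  # [4, 5]: replaced, not appended
print(sorted(ws.tables))    # ['customers', 'orders']: the new package removed nothing
print(sorted(ws.folders))   # ['sales_load_1', 'sales_load_2']: one folder per load
```

The re-load of the hypothetical `sales` package overwrites `orders` and leaves two file folders behind, while loading `crm` leaves the existing data untouched.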
Approval of data package load requests
- When Data Republic approves the request, its status changes from 'for review' to 'processing' and then to 'approved'. This may take a few minutes.
- The user who submitted the request will receive an email notification of the approved package, with a link to log into the Data Republic platform and view the Workspaces tab in their project.
2. Load a token match table to your workspace
***For Matching projects only***
Request to add a token match table so you can perform a join between tokenized data tables in your workspace.
Note: You can make this request before or after a package is loaded. The data package must already be configured for Matching.
To request a token match table for your workspace:
- Click Load tokens to workspace
- Select the data license that governs the token match request, and the field to match against; if you select All fields, your token match table will display additional information on which fields two tokens have matched against.
- Click Add tokens
- Click Submit.
Approval of token load requests
- When Data Republic approves a request, its status changes from 'for review' to 'processing' and then to 'approved'.
- The user who submitted the token match request will be notified when the token match table is available in the workspace.
For a token match table where all fields were selected for matching:
- The token match table can be viewed by selecting the Redshift database schema from SQL Workbench
- The token match table displays which fields two tokens have matched on
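To make the idea of joining tokenized tables through a token match table concrete, here is a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in for the workspace database, and every table name, column name, and value (`retailer`, `insurer`, `token_match`, `token_a`, `token_b`, `matched_field`) is an assumption made for illustration; the actual layout of a token match table may differ.

```python
import sqlite3

# In-memory database standing in for the workspace schema (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE retailer (token TEXT, spend REAL);
    CREATE TABLE insurer (token TEXT, segment TEXT);
    -- Hypothetical token match table: one row per pair of matched tokens.
    CREATE TABLE token_match (token_a TEXT, token_b TEXT, matched_field TEXT);

    INSERT INTO retailer VALUES ('r1', 120.0), ('r2', 45.5), ('r3', 10.0);
    INSERT INTO insurer VALUES ('i7', 'gold'), ('i8', 'silver');
    INSERT INTO token_match VALUES ('r1', 'i7', 'email'), ('r2', 'i8', 'phone');
""")

# Join the two tokenized tables through the token match table.
rows = conn.execute("""
    SELECT r.token, i.token, m.matched_field, r.spend, i.segment
    FROM token_match AS m
    JOIN retailer AS r ON r.token = m.token_a
    JOIN insurer AS i ON i.token = m.token_b
    ORDER BY r.token
""").fetchall()

for row in rows:
    print(row)
```

Only tokens that appear in the match table are joined, so the unmatched `r3` row is absent from the result. The `matched_field` column mirrors the extra information the article says is available when All fields is selected.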
What data can I load?
- Data packages: load any data package listed or referred to in the 'permitted use' section of your approved data license
- A token match table (for data matching projects): load a token match table if you are using the Matching service to execute a data match on the Data Republic platform.
Can I update a package before loading?
Yes, you can update a package and load it to a VM in the workspace; however, your permitted use in the data license must allow for package updates / new versions to be added.
- To add/remove files or tables: Create a new version of the package to modify file or table references. Add the new package version to your project, and request a load to the workspace. Note: any existing data in the workspace will remain.
- To update table data only: Load a new file to the table using Manage Data. In your project, request a package load to the workspace (a new package version does not need to be created). Note: If there are existing tables in your workspace by the same name, data will be replaced (not appended).
Can I load a new package to the workspace?
Yes; however, the permitted use in your data license must allow for such data to be loaded to the workspace. To load a new package to the workspace, it must be added to the project first. Once your package is listed in the packages tab of your project, follow section 1 above to request a package load to your workspace.
How is data loaded to multiple VMs in a workspace?
If multiple virtual machines (VMs) have been created in a workspace:
- A data package containing tables can be loaded to a single VM. All VMs within the workspace have access to the same Redshift database schema; therefore, the same package does not need to be loaded to other VMs (this avoids creating duplicate database schemas).
- A data package containing files (e.g. documents) can be loaded to all VMs if users across all VMs require access. A new folder in the local directory is created for each load request, and a copy of the file(s) is placed in the folder.
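The two rules above can be summarised as a small planning helper. This is a sketch for reasoning about which load requests to make, not a platform feature: the function `plan_package_loads`, its parameters, and the VM names are hypothetical.

```python
def plan_package_loads(vm_names, has_tables, has_files, vms_needing_files=None):
    """Return {vm_name: [content kinds to load]} for one package.

    Tables go to a single VM because all VMs share one database schema.
    Files go to every VM whose users need them (all VMs by default).
    """
    if not vm_names:
        raise ValueError("the workspace must contain at least one VM")

    plan = {vm: [] for vm in vm_names}

    if has_tables:
        # One load is enough; repeating it elsewhere would duplicate the schema.
        plan[vm_names[0]].append("tables")

    if has_files:
        targets = vm_names if vms_needing_files is None else vms_needing_files
        for vm in vm_names:
            if vm in targets:
                plan[vm].append("files")

    return plan


plan = plan_package_loads(["vm-a", "vm-b"], has_tables=True, has_files=True)
print(plan)  # {'vm-a': ['tables', 'files'], 'vm-b': ['files']}
```

Choosing the first VM for the table load is arbitrary; any single VM in the workspace would serve, since the schema is shared.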