This article shows you how to create and modify data packages for exchange.
Once created, data packages can be added to:
- a project, for approval and exchange; or
- a data listing, to enable discoverability and visibility of the metadata.
User access to a data package is always controlled and must be approved by a data custodian of the organization (i.e. the data license approver).
What is a data package?
A data package references all files associated with a data exchange, not just data tables and views. You can reference images and documents in packages, as well as SQL code and your end-user agreement.
Note: If you change any files or tables (for example by renaming, deleting or moving them to another directory) before adding or using a package in a project, you must update the file/table references in the package to avoid referencing errors later on. To change the metadata/references in a package, create a new version.
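To illustrate what a stale reference looks like, here is a minimal Python sketch. It is not part of Senate and does not use any Senate API: the function name `find_broken_references`, the reference list and the folder layout are all assumptions made for this example. It simply checks a list of relative file paths against a directory and reports the ones that no longer exist, which is the situation that arises when a file is renamed, moved or deleted after it was referenced.

```python
from pathlib import Path
import tempfile


def find_broken_references(references, root):
    """Return the referenced relative paths that do not exist under root."""
    root = Path(root)
    return [ref for ref in references if not (root / ref).exists()]


# Demonstration with a temporary directory standing in for real storage.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "docs").mkdir()
    (Path(tmp) / "docs" / "agreement.pdf").write_text("placeholder")

    # Hypothetical references: the second file was never created,
    # which mimics a file that has since been moved or deleted.
    package_references = ["docs/agreement.pdf", "data/sales.csv"]
    broken = find_broken_references(package_references, tmp)
    print(broken)
```

Any path reported here would be a reference to correct in a new version of the package.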
What if I'm not ready to create the package yet?
- You can create an empty package as a placeholder (containing no references to any tables/views/files). This enables you to create a draft license and continue your discussions about permitted use with the data custodian. Note: only packages in approved status can be added to a project.
- Prior to submitting a data license for approval, package owners must update the license and add a new version of the data package (containing correct references to tables/views/files).
In this article you will learn about
- Creating a package
- Adding files to a package
- Adding tables/views to a package
- Configuring a package for Senate Matching
- Modifying packages
- Data package FAQs
You will have
- Uploaded data onto the Senate platform;
- Created a database and tables; and/or
- Loaded data into the tables; and/or
- Created a data view
Creating a package
From the menu, select Manage Data.
On the Manage Data screen, click the Packages tab, then click Create new package.
On the Create new package pop-up:
- Give the package a name.
- Give it a description.
- Click Create new package.
The package is created in draft status and a message displays to indicate the new package has been created.
Click the package name to edit the contents.
Should I add a table or file?
This depends on whether you are creating the data package for a listing in the Catalog, how the data recipient will want to work with the data in a project, and what the data license allows:
- If direct download of the data is permitted after a license is approved in a project, add the data as a file.
- If the data license only permits access to the data via a Discovery Workspace for analysis, it may be better to add the data as a table in the package. In a workspace, the user will be able to connect to a Redshift database to query the data table using SQL Workbench, Python, R, Tableau, etc. If the user prefers to work with applications such as Excel, you can also include the data as a CSV file in the package. You can also include supporting documentation in other file formats, such as PDF or .doc.
- If you are creating a data package to add to a Data Listing, we recommend adding a table to the package. If you do not add a table to the package, the metadata (table schema) will not be visible in the data listing. Analysts browsing a listing will often view the metadata to determine whether the dataset may be useful in solving their data problems.
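The guidance above can be summarised as a small decision helper. This is only an illustrative Python sketch of the three rules in this section; the function `recommend_inclusions` and its parameter names are hypothetical and are not a Senate feature.

```python
def recommend_inclusions(direct_download_permitted, workspace_only, for_data_listing):
    """Suggest whether to add the data as a file, a table, or both."""
    recommendations = set()
    if direct_download_permitted:
        # Direct download after license approval: add the data as a file.
        recommendations.add("file")
    if workspace_only:
        # Workspace-only access: a table can be queried from the workspace.
        recommendations.add("table")
    if for_data_listing:
        # A table makes the schema (metadata) visible in the listing.
        recommendations.add("table")
    return sorted(recommendations)


# Example: a package that will be listed and also permits direct download.
print(recommend_inclusions(True, False, True))
```

A package can contain both a table and a file, so the helper returns every inclusion type that applies rather than a single choice.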
Adding tables or views
Click Tables & Views to add some tables to the package.
To add a table or view to the package:
- Click the arrow next to a database to expand it.
- Select tables and views as required. They are listed lower down the screen as you select them.
To add files to the package:
- Click the arrow next to a folder to expand it and select a file. Selected files are listed lower down the screen.
- Alternatively, select a folder to add all the files in that folder.
When you have added everything you need for the package, click Save Package.
Note: any tables/files added to the package are listed under 'Package inclusions'.
If you need to configure your package for Senate Matching, proceed to the next step below.
Alternatively, if you do not need to configure your package for matching and no further changes are needed, click Submit. From here, any changes will require a new version to be created.
The status of the data package will automatically change from 'draft' to 'for review' and then 'approved'.
The package can now be added to a project or data listing for exchange.
Note: a contributor term sheet must be completed for the dataset (database) before the package can be added to a project or listing.
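The status changes described above can be pictured as a simple one-way sequence. The Python sketch below is a conceptual model only, assuming the three statuses named in this article; the names `PackageStatus`, `advance` and `can_add_to_project` are invented for illustration and do not correspond to any Senate API.

```python
from enum import Enum


class PackageStatus(Enum):
    DRAFT = "draft"
    FOR_REVIEW = "for review"
    APPROVED = "approved"


# Forward transitions as described in the article: draft -> for review -> approved.
NEXT_STATUS = {
    PackageStatus.DRAFT: PackageStatus.FOR_REVIEW,
    PackageStatus.FOR_REVIEW: PackageStatus.APPROVED,
}


def advance(status):
    """Move a package to its next status, or fail if it is already approved."""
    if status not in NEXT_STATUS:
        raise ValueError(f"{status.value!r} has no further status")
    return NEXT_STATUS[status]


def can_add_to_project(status):
    """Only packages in approved status can be added to a project."""
    return status is PackageStatus.APPROVED


status = PackageStatus.DRAFT
status = advance(status)  # after clicking Submit: for review
status = advance(status)  # automatic change: approved
print(status.value, can_add_to_project(status))
```

Because there is no transition back to draft in this model, further changes are represented by a new version rather than by editing the approved one.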
Configuring a package for Senate Matching
On the packages screen, click Edit token link.
Refer to Configuring data packages for Senate Matching for next steps.
Note: Configuration changes can only be made if a package is in draft status. If your package is not in draft status, click the 'Edit as new version' button on the packages screen to see the 'Edit token link' button.
Click Submit when you are ready to approve this version of the package.
The status of the data package will automatically change from 'draft' to 'for review' and then 'approved' within a few minutes.
The approved package can now be added to a project or data listing for exchange.
Note: If you haven't already completed a contributor term sheet for the dataset (database), you must complete this first before adding the package to a project or listing.
Modify an approved package
To update data in a table only:
- Load a new data file to the table in the Databases tab on the Manage Data screen. The data in the table will be replaced rather than appended. If you need to work with the updated table in your workspace, re-load the package to the workspace. Note: you do not need to create a new version of the package to update data in a table within the package.
To modify file, table or view references (i.e. metadata):
- You must create a new version of the package. This also applies if the name or location of a file, table or view has changed; updating the references in a new version avoids referencing errors later.
To create a new version of a package (i.e. to modify file/table/view references):
- From the 'Manage Data' screen, click the 'Packages' tab and go into the package you need to modify.
- Note: the option to edit as a new version is only available if you navigate to the Manage Data > Packages tab; it is not available if you access the package screen via a project. Only the latest version can be edited; the option to create a new version is not available for older versions that have been superseded.
- On the packages screen, click 'Edit as new version' to create a new version to edit.
- Modify the package as required by selecting or deselecting files, tables or views, then click 'Save package'.
- When you are ready to lock down the changes, click 'Submit'. The data package status will automatically change to 'Approved' within a few minutes.
- Note: If the package does not automatically change to approved status, please contact Data Republic Support. Once the package is approved, you can add the package in a project or listing.
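The versioning rules in the steps above can be sketched as a small data model. This is a hedged illustration in Python, not how Senate stores packages: the classes `Package` and `PackageVersion` and the method `edit_as_new_version` are hypothetical names chosen to mirror the button label. The sketch shows two rules from this section: a new version starts in draft as a copy of the latest version's references, and superseded versions cannot be edited.

```python
from dataclasses import dataclass, field


@dataclass
class PackageVersion:
    number: int
    references: list
    status: str = "draft"


@dataclass
class Package:
    name: str
    versions: list = field(default_factory=list)

    def latest(self):
        return self.versions[-1]

    def edit_as_new_version(self, from_version):
        """Create a draft copy of the latest version; older versions are locked."""
        if from_version != self.latest().number:
            raise ValueError("only the latest version can be edited")
        new_version = PackageVersion(
            number=from_version + 1,
            references=list(self.latest().references),  # copy, do not share
        )
        self.versions.append(new_version)
        return new_version


# Hypothetical package with one approved version.
pkg = Package("sales", [PackageVersion(1, ["db.sales"], status="approved")])
v2 = pkg.edit_as_new_version(1)
v2.references.append("docs/readme.pdf")  # select an extra file in the draft
print(pkg.versions[0].references, v2.references, v2.status)
```

Copying the reference list keeps the approved version unchanged while the draft is being edited, which matches the idea that changes are locked down only on Submit.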
Data package FAQs
How are data packages used in Senate?
Data packages are created for data exchange on Senate.
- They can be added to projects so that organizations participating in an exchange can negotiate the permitted use of data for their project. Within a project, a data license must be requested and approved by participating organizations before access to data can be granted.
- Data packages can also be listed in a Catalog to enable discoverability by other organizations or authorized users within your own organization. Authorized users can request access to the data package via its listing and negotiate a data license for permitted use of the data package in their project.
How do users access data in a package?
Users can only access the data package if a license for permitted use of data has been approved in a project. Access to data can be provided in two ways:
- If a data license permits direct download of the contents of the package (e.g. a data file), Data Republic will provide the data licensee (data recipient) with a link to download the dataset.
- If a data license permits access to the data package via a Discovery Workspace only, then a copy of the file, table or view referenced in the package is created and loaded to a Discovery Workspace for analysis.
Within a Discovery Workspace:
- Users can connect to a Redshift database (using SQL Workbench, R, Python, Tableau, Power BI, etc.) to query data tables.
- Any files referenced in the package can be found in a local directory in the workspace.
Note: Any data output requested from a workspace must align with the permitted use and extraction rules agreed to in the data license.
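As a rough idea of what querying a package table from Python might look like, the sketch below previews a few rows through a standard DB-API 2.0 connection. This article does not give the workspace connection details or the driver to use, so those are deliberately left out: the example runs against an in-memory SQLite database as a stand-in, and the table name `sales` and the function `preview_table` are assumptions for illustration. In a workspace you would pass a connection opened with a Redshift-compatible driver instead.

```python
import re
import sqlite3


def preview_table(conn, table, limit=5):
    """Return up to `limit` rows from `table` using a DB-API 2.0 connection."""
    # Table names cannot be bound as query parameters, so validate them instead.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?", table):
        raise ValueError(f"unexpected table name: {table!r}")
    cursor = conn.cursor()
    cursor.execute(f"SELECT * FROM {table} LIMIT {int(limit)}")
    rows = cursor.fetchall()
    cursor.close()
    return rows


# Stand-in database; a real workspace connection would replace this.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10), ("south", 20), ("east", 30)],
)
rows = preview_table(conn, "sales", limit=2)
print(rows)
```

Previewing a limited number of rows is a cheap way to confirm that the table referenced in the package has loaded before running larger queries.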