This guide outlines the level of detail that a Data License is expected to contain.
A Data License governs the terms of your data exchange, so it is important for the success of your project that all relevant details are included.
Please note: the details in this article are neither definitive nor exhaustive, and extra details can be included if needed.
In drafting the terms for your Data License, consider the following:
- OBJECTIVE: Summarise what you would like to do and achieve
- DATA-INPUT: Describe what data you require from each Data Contributor
- DATA-USE ON PLATFORM: Describe analytics to be conducted in the workspace
- DATA-OUTPUT: Elaborate on the data extracts proposed for this project
- OUTPUT USE OFF PLATFORM: Describe any further analytics or processes to be applied to any data extracted outside of the Workspace
- SPECIAL CONDITIONS: Describe any additional data sharing terms for agreement
- REQUIREMENTS FOR SENATE MATCHING: Outline the details of your data match
OBJECTIVE
1. Summarise what you'd like to do in this project
This should give the Data Contributor(s) an understanding of what you are trying to accomplish using their data.
What is the goal of this project?
- data exploration to determine next steps
- understand the overlap of customers
- create new data outputs (e.g. an analytical model, new or derived customer attributes, etc)
Example: 'This is a data exploration project aimed towards understanding how the data exchange process works, and exploring how the Data Contributor’s data can be used internally for future projects. We plan on exploring the relationship between types of attributes with aggregate tables.'
DATA - INPUT
2. Describe the relevant data you require from each Data Contributor within this project, including any data from your own organisation or from open data sources. Include specific detail on the requirement, such as date range, geographic area or other, so the Data Contributor can prepare and filter their data appropriately for your project.
A few points to take into account for this question are:
- Describing the data that will be used for this project
You should include, if possible, a description of the attribute data for each contributing organization. This includes any required filtering, such as specific date ranges or a subset of geographies. If you are adding tokens for a Senate Matching Project, you can add that information as well.
Example: 'Data ranges from Feb 2019 - Feb 2020.
Data only includes customers in Australia.
The attributes that Org A is contributing are:
- customer_flag 1
- customer_flag 2
- customer area code
- customer gender
The attributes that Org B is contributing are:
- customer data
- transaction data
Both Org A and Org B will be supplying tokens for this project.'
- Specifying if the license allows for Open Source data (Yes/No)
This records agreement on where the data comes from, so that all parties involved understand what information will be combined with their data.
Example: 'Data from the Australian Bureau of Statistics (ABS) may be combined with data from contributors in this project.'
- Specifying whether or not updated versions of packages will be permitted under this license
This is important to note because updated package versions enable a Data Contributor to update the data in their package. This could include, for example, adding new data or a data dictionary for their data.
Example: 'New versions of data packages can be made available to allow for inclusion of scripts containing code for analytics, data dictionaries and new data as agreed by the data contributor.'
- Confirming if the data will require regular refreshes (Yes/No)
Specify the time period for the refreshes (e.g. monthly), or how many refreshes the analyst can request.
Example: 'Yes, data refreshes are required monthly from Jan 2020 to Dec 2020'
Example: 'Yes, up to 6 data refreshes are permitted from Jan 2020 to Jun 2020'
- Clarifying whether any organization can contribute any code/models to this project for use in the Workspace (Yes/No). If yes, then please specify which organizations.
Example: 'Organisation XYZ can contribute code/models for use in the Workspace'.
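As an illustration of the filtering described above, a Data Contributor might prepare their data along the following lines before contributing it. This is a minimal sketch only: the record fields, the function name filter_for_license and the sample values are hypothetical, and the date range and geography are taken from the example license wording rather than from any platform requirement.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CustomerRecord:
    # Hypothetical fields, loosely based on the example attributes above.
    customer_area_code: str
    customer_gender: str
    country: str
    record_date: date


def filter_for_license(records, start, end, country):
    """Keep only records inside the licensed date range and geography."""
    return [
        r for r in records
        if start <= r.record_date <= end and r.country == country
    ]


records = [
    CustomerRecord("2000", "F", "Australia", date(2019, 3, 15)),
    CustomerRecord("3000", "M", "Australia", date(2018, 12, 1)),    # before the range
    CustomerRecord("1010", "F", "New Zealand", date(2019, 6, 30)),  # outside the geography
    CustomerRecord("4000", "M", "Australia", date(2020, 2, 29)),
]

# 'Data ranges from Feb 2019 - Feb 2020' and 'customers in Australia'.
in_scope = filter_for_license(
    records, date(2019, 2, 1), date(2020, 2, 29), "Australia"
)
print(len(in_scope), "records are in scope for this license")
```

Stating the range and geography precisely in the Data License lets each contributor apply the same filter without further clarification.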
DATA - USE ON PLATFORM
3. Describe and explain the analytics you will be conducting within the Workspace.
Elaborate on what type of analysis will occur in the Workspace
- Creating a model?
- Looking at market share, or any other types of analysis?
- Creating customer profiles?
Detailed descriptions of End Product:
(Please be as detailed as possible, as this may drive the project value and cost of data)
- Aggregated insights?
- List of tokens (if matching)?
Neither organization should be able to re-identify the individuals on platform represented by tokens after matching in the Workspace.
Example: 'The following outputs will be permitted to be extracted from the Workspace:
- Aggregate tables based on XY insights.
- List of matched customers for use in marketing campaign
- Modelling scripts for internal use X.'
DATA - OUTPUT
4. Data Contributors need to understand what data extracts are proposed during your project. This will provide Data Republic with clear guidance on whether to approve or decline an extract request. Please outline:
a. Which extracts are anticipated at what volume and frequency;
- No extracts?
- 1 extract only?
- 1 per specified time period?
- Multiple extracts?
- Does the frequency differ for the type of extract required? (e.g. you require only 1 extract for scripts, but monthly aggregate data extracts)
Example: 'We anticipate 1 extract a month for the duration of this License.'
b. What is the purpose of each extract?
- Is this for a data product? E.g. a data product delivering economic insights
- Internal purposes? E.g. to show upper management the value of Senate
- Suppression wash? E.g. suppression wash to inform marketing campaign targets
Example: 'The aggregate table extracts will be used for a data product.
Extracted scripts will be used internally.'
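To make the suppression wash purpose above more concrete, the sketch below assumes that a suppression wash means removing already-known individuals (for example, existing customers) from a campaign target list. The token values and the function name suppression_wash are hypothetical, and this is not a description of how the platform performs the operation.

```python
def suppression_wash(campaign_tokens, suppress_tokens):
    """Return campaign tokens that do not appear in the suppression list.

    The order of the campaign list is preserved.
    """
    suppress = set(suppress_tokens)
    return [t for t in campaign_tokens if t not in suppress]


# Hypothetical token values for illustration only.
campaign = ["tok_a", "tok_b", "tok_c", "tok_d"]
existing_customers = ["tok_b", "tok_d", "tok_z"]

targets = suppression_wash(campaign, existing_customers)
print(targets)
```

If an extract is intended for this purpose, the Data License should say so explicitly, since the output is a list of tokens rather than an aggregate table.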
OUTPUT USE OFF PLATFORM
5. Describe any further analytics or processes to be applied to any data extracted from the workspace. If there is an output recipient external to your organization, explain how the extract will be incorporated into the final deliverable. If there is no external output recipient, please explain how the extract will be consumed internally.
How will the extract be consumed?
- Will this data product be provided to another organisation (other than the Data Licensee)?
- Is there a EULA signed by the End User? (This is required if you will be sub-licensing the data product created)
Example: 'The data will be used for reporting to Org X to aid them with Y'
SPECIAL CONDITIONS
6. Describe any additional terms for agreement:
Some questions to consider are:
- Which organizations will access the workspace (not including Data Republic)?
- Should there be no extraction of data from the Senate workspace (including screenshots)?
- Will you need to re-identify individuals off platform? (Data Republic will need to unmask any tokens provided in the extract so that you can match the original tokens back to your customer database)
- Should the workspace be terminated immediately after this license has expired?
Special Conditions for any outputs from the Workspace:
Some Data Contributors may have special rules for output extractions. Any outputs will be checked against these rules prior to approval.
Example: 'Data from no less than 5 customers (Data Licensee to define as…) must be aggregated in any one output cell'
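A minimum cell size rule like the one in the example can be checked before an extract is requested. The sketch below is one possible approach, not the platform's approval logic: it assumes a "customer" is a distinct customer identifier (the license leaves this for the Data Licensee to define), and the row layout and function name are hypothetical.

```python
from collections import defaultdict

MIN_CUSTOMERS_PER_CELL = 5  # threshold taken from the example condition


def aggregate_with_suppression(rows, min_count=MIN_CUSTOMERS_PER_CELL):
    """Count distinct customers per (area_code, gender) cell.

    Cells with fewer than min_count customers are dropped from the output.
    """
    customers_per_cell = defaultdict(set)
    for customer_id, area_code, gender in rows:
        customers_per_cell[(area_code, gender)].add(customer_id)
    return {
        cell: len(ids)
        for cell, ids in customers_per_cell.items()
        if len(ids) >= min_count
    }


# Six customers in one cell, three in another (hypothetical data).
rows = [(f"c{i}", "2000", "F") for i in range(6)]
rows += [(f"c{i}", "3000", "M") for i in range(10, 13)]

table = aggregate_with_suppression(rows)
print(table)
```

Here the cell with only three customers is left out of the output, so the table contains only cells that satisfy the condition.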
REQUIREMENTS FOR SENATE MATCHING
7. What terms should be inserted into the Senate Matching, “Special Conditions” section?
Describe the customer data to be tokenized for Senate Matching (list the attributes for matching to align between both organizations)
Example: 'Each Org in this project will be matching on:
- First Name
- Last Name'
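Matching on attributes such as first and last name only works if both organizations format those attributes the same way before tokenization. The sketch below shows one way the parties might agree on formatting. It is not Senate's tokenization, and the specific rules (trimming whitespace, ignoring case, stripping accents) and function names are assumptions that the organizations would need to agree on themselves.

```python
import unicodedata


def normalise_name(value: str) -> str:
    """Apply an agreed formatting: strip accents, collapse spaces, ignore case."""
    decomposed = unicodedata.normalize("NFKD", value)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.split()).casefold()


def match_key(first_name: str, last_name: str) -> tuple:
    """Build the pair of attributes that both organizations would match on."""
    return (normalise_name(first_name), normalise_name(last_name))


# The same person as recorded by two organizations (hypothetical values).
org_a = match_key("  José ", "O'Brien")
org_b = match_key("JOSE", "o'brien")
print(org_a == org_b)
```

Listing the agreed attributes and their formatting in the Special Conditions section helps avoid mismatches caused by formatting differences alone.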
Tips for drafting your Data License
- Fill in the Data License as best as you can with the information you have.
- Review the draft together with the organization(s) you are collaborating with. It may be helpful to arrange a conference call and screen share while you draft the Data License together. Send a copy for everyone to review after the call.
- Follow your internal process for review. Many organizations have an internal process to review the Data License prior to submission on the platform.
- Schedule a follow-up session if needed to agree on final details before submitting the Data License for approval. Note: The organization listed as the Data Licensee will need to click 'submit'.