A Workspace is a quarantined analytical environment, created within a governed project space after a data license has been approved by all project participants.
Depending on the level of governance selected for your project, data owners may require that sensitive data be accessed only via a secure Workspace and that all outputs be manually checked before they are approved for extraction. This ensures that any output created in a Workspace complies with the permitted use and extraction rules agreed to in the data license.
Before you request a Workspace, make sure that:
you have created a project
you have added people to your project
your data license has been approved by all parties
Unlike generic solutions, our solution is specifically optimized to support ML and AI R&D and evaluation.
Workspaces are Windows or Linux virtual machines (VMs) where approved datasets are loaded for analysis. Each governed project space may consist of one or more VMs.
The Data Republic platform offers a variety of workspace configurations that come preconfigured with common analytical tools so analysts can choose the configuration most suited to the analytics they are conducting.
What workspace configurations are available on the Data Republic platform?
The Data Republic platform offers both Windows and Linux virtual machines in a variety of compute and processing configurations, each with a different cost. To learn more about the cost of the different Workspaces, please contact your Customer Success Manager or email@example.com.
Each new Workspace is allocated by default with:
200GB of local storage space
550GB of Redshift database storage
5GB of Project Drive storage space
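To keep an eye on the local storage allocation, a check along the following lines could be run from Python inside a Workspace. This is only a minimal sketch: the path being inspected (`.`) and the 90% warning threshold are assumptions chosen for illustration, not platform settings, and the helper names are hypothetical.

```python
import shutil


def usage_fraction(used_bytes: int, total_bytes: int) -> float:
    """Return the share of storage in use, between 0.0 and 1.0."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes


def is_nearly_full(used_bytes: int, total_bytes: int, threshold: float = 0.9) -> bool:
    """True when usage is at or above the (assumed) warning threshold."""
    return usage_fraction(used_bytes, total_bytes) >= threshold


if __name__ == "__main__":
    # "." is an assumption; point this at the drive that holds your local data.
    usage = shutil.disk_usage(".")
    print(f"{usage.used / 1e9:.1f} GB used of {usage.total / 1e9:.1f} GB")
    if is_nearly_full(usage.used, usage.total):
        print("Local storage is nearly full")
```

`shutil.disk_usage` reports on the whole drive that contains the given path, so the total shown may differ from the 200GB allocation depending on how the Workspace disk is set up.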
What is installed in a Windows Workspace?
Each Workspace comes preconfigured with common analytical tools for analysis. Workspaces with Windows operating systems come with the software listed below installed by default:
Please note: Software licenses are required where applicable, e.g. for Excel and Tableau.
Python versions for Windows operating systems
Windows Workspaces run Anaconda3 2020.02 (Python 3.7) by default, which includes Jupyter Notebook and Qt Console. Anaconda in the standard Workspace currently comes with any packages ticked on this page:
To downgrade your environment, please contact firstname.lastname@example.org.
What is installed in a Linux Workspace?
Each Workspace comes preconfigured with common analytical tools for analysis. Workspaces with Linux operating systems come with the software listed below installed by default:
Please note: Software licenses are required where applicable, e.g. for PyCharm Professional Edition.
Python versions for Linux operating systems
Linux Workspaces run Python 3.6.6 by default, and the base software is installed into that Python 3.6.6 environment. Anaconda in the standard Linux Workspace currently comes with any packages ticked on this page: https://docs.anaconda.com/anaconda/packages/py3.6_linux-64/
To access Python 3.7, run 'source activate python37' from the Workspace terminal command line. In addition, ipykernel for Jupyter is installed so that notebooks can run in the Python 3.7 environment.
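After switching environments, it can be useful to confirm which interpreter a script or notebook is actually using. The sketch below is one way to do that with the standard library; the helper name `matches_version` is hypothetical and used only for illustration.

```python
import sys


def matches_version(version_info, major: int, minor: int) -> bool:
    """True when the interpreter's major.minor version equals the expected one."""
    return (version_info[0], version_info[1]) == (major, minor)


if __name__ == "__main__":
    # sys.version starts with the version number, e.g. "3.7.x ..."
    print("Running Python", sys.version.split()[0])
    if not matches_version(sys.version_info, 3, 7):
        print("Not Python 3.7: activate the python37 environment first")
```

Running the same check in a Jupyter cell shows which kernel the notebook is attached to.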
If you need software installed in the Python 3.7 environment, please contact email@example.com.
Workspace users can now bring their own Docker containers and run them in Linux CPU Workspaces. This lets users bring in custom code or models and manage their dependencies more easily.
Note: Currently available for Linux CPU type workspaces only
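As an illustration of what running a container might look like, the sketch below assembles a standard `docker run` command from Python. It assumes the `docker` command-line client is available in the Workspace terminal; the image name `my-model:latest`, the `/data` mount point, and the helper `docker_run_command` are hypothetical. How images are brought into a Workspace is not covered here.

```python
import shlex
from typing import List, Optional


def docker_run_command(
    image: str,
    data_dir: Optional[str] = None,
    args: Optional[List[str]] = None,
) -> List[str]:
    """Build a `docker run` command as an argument list."""
    cmd = ["docker", "run", "--rm"]  # --rm removes the container when it exits
    if data_dir is not None:
        # Mount the data directory read-only at /data (an assumed mount point).
        cmd += ["-v", f"{data_dir}:/data:ro"]
    cmd.append(image)
    cmd += list(args or [])
    return cmd


if __name__ == "__main__":
    cmd = docker_run_command("my-model:latest", data_dir="/home/analyst/input")
    # Print the command for review; pass it to subprocess.run(cmd, check=True) to execute.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Building the command as a list rather than a single string avoids shell-quoting problems when paths contain spaces.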
What Workspace customizations are available?
Analysts can install library dependencies without Data Republic's help, which gives them more control over their Workspace.
Please note: Packages that have external binary dependencies, or other components not directly installable from PyPI, PythonHosted, CRAN, or Anaconda, will require a support request to be installed.
To enable you to install your own Python and R libraries without compromising data protection, all installation requests go through our special proxy service. The proxy only allows "read only" requests to install standard packages from the specific repositories that Data Republic has whitelisted.
Users can install R packages from CRAN from within RStudio on Linux Workspaces.
Only the authorized Workspace users can pull the desired libraries from within that Workspace, and all access is logged and monitored.
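For Python libraries, a self-service install might look like the sketch below. It assumes `pip` is the installer in use and that the Workspace is already configured to route package requests through the proxy, so no proxy settings appear in the code; the helper names are hypothetical, and `requests` is only an example package.

```python
import subprocess
import sys
from typing import List


def pip_install_command(package: str, python: str = sys.executable) -> List[str]:
    """Build a pip install command for the given interpreter."""
    # Using "python -m pip" installs into the environment of that interpreter.
    return [python, "-m", "pip", "install", package]


def install(package: str) -> None:
    """Run the install; raises CalledProcessError if pip exits with an error."""
    subprocess.run(pip_install_command(package), check=True)


if __name__ == "__main__":
    # Shows the command without running it; call install("requests") to execute.
    print(" ".join(pip_install_command("requests")))
```

Invoking pip through `sys.executable` keeps the install tied to whichever environment is currently active, which matters when both the default and Python 3.7 environments are present.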
You can also request further customization of your existing Workspace. This article goes into more detail on which additional applications or installations can be accommodated.
How much does a Workspace cost?
Pricing for the Workspace is determined based on workspace size and associated costs. Additionally, you can stop a workspace when it is not in use to reduce costs. Your Workspace will be invoiced monthly. Please contact your Data Republic account manager for more information.
How do users access the Workspace?
Users can access the workspace in their web browser by logging into the Data Republic platform and navigating to the workspace tab in their project.
The request to add users to a Workspace is part of the Workspace setup form. Once a Workspace request has been approved by Data Republic, a new version of the request form must be submitted if additional users need access to the Workspace.
Can multiple users access the Workspace at the same time?
Multiple users can be given access to the same Workspace (just like multiple people can log into a single computer). However, only one user can connect to the Workspace at any time. If you require multiple users in your project to access approved datasets at the same time, you'll need to request that multiple Workspaces be created.
Can two or more Workspaces be set up for a project? How will this work?
Yes. When a project has two or more Workspaces, users working in different Workspaces can connect to the same database at the same time. Any changes made to a table will be visible to users in all Workspaces.
If there are files in an approved data package, all Workspaces will get a copy of these files stored locally; any locally stored files in a Workspace cannot be shared with other Workspaces. This means that any changes made to a file in one Workspace will not be reflected in other Workspaces.
Any output request is subject to Data Republic approval and must align with the permitted use agreed to in the data license. If approved, output can be extracted from any Workspace once users have completed their analysis.