Before Senate can connect to an organization's S3 bucket, the organization must configure the bucket policy to allow Data Republic to read the contents of the shared folder(s). The policy is configured outside of Senate, in your AWS Management Console.
Applying the policy to your S3 Bucket
The team responsible for your organization's AWS / S3 infrastructure must apply the policy below to your S3 bucket.
This enables Senate to crawl your data and build the data catalog. The catalog holds field/column-level information only; no data is moved.
If the S3 bucket already has a policy, the two must be merged, because a bucket can hold only one policy document. To merge them, add the statements from the policy below to the Statement list of the existing policy rather than replacing it.
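The merge step can be sketched in Python as follows. This is a minimal illustration, not a Data Republic tool: the function name `merge_bucket_policy` and its arguments are hypothetical, and it assumes both policies are already loaded as Python dictionaries (for example with `json.loads`).

```python
import copy


def merge_bucket_policy(existing_policy, new_statements):
    """Return a copy of existing_policy with new_statements added.

    Hypothetical helper for illustration only. The existing policy is
    never modified in place.
    """
    merged = copy.deepcopy(existing_policy)
    statements = merged.get("Statement", [])
    # AWS policy JSON allows "Statement" to be a single object
    # instead of a list, so normalize it to a list first.
    if isinstance(statements, dict):
        statements = [statements]
    for statement in new_statements:
        # Skip statements that are already present so the merge
        # can be repeated without creating duplicates.
        if statement not in statements:
            statements.append(copy.deepcopy(statement))
    merged["Statement"] = statements
    return merged
```

The merged result is still a single policy document, which can then be pasted into the bucket policy editor in place of the old one.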
Data Republic does not store any credentials.
In your policy, for each S3 bucket, please specify:
- the folder(s) at the root level of the S3 bucket that DR has access to
- the URL path to the specified folder(s)
- the data that DR has access to
Amazon guide on how to apply policy to S3 bucket:
Note: At all times, your organization has full control over Data Republic's access to the S3 bucket and can update the policy whenever needed.
Below is the policy that must be applied to any S3 bucket that is used as a local Data Source.
- DR_AWS_ACCOUNT_ID: The Data Republic AWS Account ID. Specifying it means that access to your S3 bucket can be securely locked down to the Senate Platform.
- READ_BUCKET_NAME: The name of the S3 bucket you’d like Senate to read from.
- OPTIONAL/PREFIX/: The path to a folder within the bucket. We recommend that you list all resources separately to restrict access specifically to the data you wish to share.
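To show how the three placeholders fit together, here is a minimal Python sketch that builds a read-only bucket policy. It is not the official Data Republic policy: the function name `build_read_policy`, the `Sid` values, and the choice of the `s3:ListBucket` and `s3:GetObject` actions are assumptions for illustration. Use the policy supplied by Data Republic for the exact actions Senate requires.

```python
def build_read_policy(dr_aws_account_id, read_bucket_name, prefixes):
    """Build an illustrative read-only S3 bucket policy as a dict.

    Hypothetical helper: the actions and Sid values are assumptions,
    not the official Data Republic policy.
    """
    principal = {"AWS": f"arn:aws:iam::{dr_aws_account_id}:root"}
    bucket_arn = f"arn:aws:s3:::{read_bucket_name}"
    # Normalize each prefix to the form "folder/subfolder/".
    prefixes = [p.strip("/") + "/" for p in prefixes]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is granted on the bucket itself, limited
                # to keys under the shared prefixes.
                "Sid": "AllowListingSharedPrefixes",
                "Effect": "Allow",
                "Principal": principal,
                "Action": "s3:ListBucket",
                "Resource": bucket_arn,
                "Condition": {
                    "StringLike": {"s3:prefix": [p + "*" for p in prefixes]}
                },
            },
            {
                # Each shared prefix is listed as a separate resource,
                # so only the data you wish to share is readable.
                "Sid": "AllowReadingSharedObjects",
                "Effect": "Allow",
                "Principal": principal,
                "Action": "s3:GetObject",
                "Resource": [f"{bucket_arn}/{p}*" for p in prefixes],
            },
        ],
    }
```

For example, `json.dumps(build_read_policy("123456789012", "example-bucket", ["shared/customers/"]), indent=2)` (after `import json`) produces JSON text for the bucket policy editor; the account ID and bucket name here are placeholder values.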