After you have uploaded your data, create a database. Databases contain one or more tables of data.
This article shows you how to prepare a database.
Why do I need to create a database?
Create a database so you can create a table and load your data to the table. Upon data license approval in a project, a copy of tables referenced in a data package can be provided in a Workspace for analysis by authorized users. Users in a workspace can connect to a redshift database to query tables and analyze data using tools such as SQL Workbench, python or R.
In this article, you will learn how to:
You will have:
- Uploaded your data onto the Data Republic platform
Create a database
On the Manage Data screen, click the Database tab.
Click Create Database.
On the Create Database screen:
- Give the database a name. The name must only have lowercase letters, digits and if you need a separator, use the underscore character.
- Optionally, add a comment so that others know the purpose of the database.
- Click Create Database.
A new database has been created.
Note: Your new database does not contain any data or tables. Tables will need to be created in the database before data can be loaded.
From the Database Screen, click Create table.
There are two parts to the Add Table screen. The top part is where you enter basic information about the table, while the bottom part is where you create your table columns.
On the Add Table screen:
- Give the table a name. The name must only have lowercase letters, digits and if you need a separator, use the underscore character.
- Type a description of the table.
- Scroll down to Add Columns.
In the Add Columns section:
- Type the name of the first column. Column names must only have lowercase letters, digits.
- Give it a description.
- Select a data type from the list.
Important things to note when creating columns:
- When specifying column names, do not use words reserved for databases, such as ‘date’.
- When adding column descriptions, do not use any special characters, including apostrophes or quotes.
- Make sure the column data type matches the data type in the file you intend to load, otherwise there will be errors when loading data to the table. If in doubt, and if applicable, select the data type ‘string’ to prevent errors when loading data to table.
Add more columns as required.
Click Create table when you have finished adding columns.
The new table is added to the database.
It doesn’t have any rows because we haven’t loaded any data yet.
At this stage, you could add more tables if you want to.
Tips for creating a table
- Use CSV rather than TXT files
- Use STRING rather than VARCHAR or CHAR as the data type.
- Check that columns are added in the correct order, as they appear in the CSV
Can I edit a table?
Once a table is created, the columns cannot be edited. If any changes need to be made to the column name or description, a new table needs to be created to load the data to, and the old table can be deleted if it is no longer required.
If an incorrect file has been loaded to the table, you can empty the table and proceed to load a new file to the table.
Emptying or deleting a table
To delete or empty a table:
Go to the Manage Data section, click Databases.
Click View or Edit Database.
Select ‘Delete table’ or ‘Empty table’ and follow the prompts.
Load data to the table
This screen shows the newly created table. It has no data in it as indicated by the zero row count.
From the Database Screen, click Load table.
On the Select Data to Load screen:
- Either select a single data file (such as CSV or TSV, either in plain text or gzipped); or select a folder. When uploading a folder, every file will be loaded into the table. This is the preferred method for loading a large number of files into a single table.
- Note: Compressed files should have the extension .csv.gz. Uncompressed files should have the extension .csv.
- Select how the data is separated (e.g. comma, tab, pipe).
- Select 'append'. Note: 'replace' is not currently an option.
- Important! If there are header rows (e.g. for field names), select how many rows comprise the header. These rows are skipped. Note: When loading a folder, all files should have zero header rows.
- When you have finished, click Done, Load Data.
A message pops up informing you that your job has been queued.
How long it takes for your data to be loaded will vary according to how much data is involved.
Click Go to Load Jobs.
PRO TIP: If you are loading multiple files to a table, you can select the folder during the load process and that will load all files within that folder to the table. It is important that none of the files contain header rows.
The job is listed with a message to indicate that it has been loaded successfully.
If there were errors, as shown in the second example, it is most likely due to a mismatch between the columns in the table and the data load. Click Show Errors to find the cause and then go back to create a new table. For additional tips, refer to tips for creating a table.
Note: Once a table has been created, it cannot be edited. If you experienced errors when loading data to the table, you will need to create a new table to fix any errors, and load the data again. Once complete, you can delete any tables with errors.