Data Source Directory Requirements
- You can create multiple folders in your S3 Bucket.
- Each directory (folder) may contain either subfolders or files, but not both: a single folder cannot hold folders and files side by side.
- All files within the same folder must have the same schema.
- All files within the same folder must be of the same format, either Parquet (preferred) or CSV*. You cannot have both formats in the same folder.
*Ensure that every CSV file, including token tables, has an identifier column of a non-string type. This can be as simple as a column of sequentially ascending numbers, e.g. 1, 2, 3.
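As a rough illustration of the CSV note above, the sketch below prepends a sequentially numbered column to CSV text using only Python's standard library. The function name `add_row_id` and the column name `row_id` are hypothetical and chosen for this example; the CSV is assumed to have a header row.

```python
import csv
import io


def add_row_id(csv_text: str, column_name: str = "row_id") -> str:
    """Return csv_text with a leading integer column numbered 1, 2, 3, ...

    Assumes the first row is a header. `column_name` is a hypothetical
    name; use whatever identifier column name fits your data.
    """
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")

    header = next(reader, None)
    if header is None:
        return ""  # empty input: nothing to number

    writer.writerow([column_name] + header)
    for i, row in enumerate(reader, start=1):
        writer.writerow([i] + row)
    return out.getvalue()


if __name__ == "__main__":
    print(add_row_id("token,value\nabc,x\ndef,y\n"))
```

Whether the column is then read as a numeric type depends on how the CSV is ingested, so treat this as a starting point rather than a guarantee.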
As shown in the diagram, a data source can have multiple folders (level 1). A level 1 folder can contain either folders or files (level 2), but not both. All files within a folder must share the same schema and the same format, either CSV or Parquet.
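The layout rules above can be checked before upload with a small script. The following is a minimal sketch, not an official validator: it takes a list of object keys (for example, as returned by an S3 listing), infers each file's format from its extension (an assumption), and reports folders that mix folders with files or CSV with Parquet. The function name `check_layout` is hypothetical, files at the bucket root are ignored, and the same-schema rule is not checked because that requires reading the files.

```python
import os
from collections import defaultdict

# Assumption: file format is inferred from the file extension.
ALLOWED_EXTENSIONS = {".parquet", ".csv"}


def check_layout(keys):
    """Return a list of human-readable rule violations for the given keys."""
    files = defaultdict(set)   # folder -> extensions of files directly inside it
    has_subfolder = set()      # folders that contain at least one subfolder

    for key in keys:
        parts = [p for p in key.split("/") if p]
        is_marker = key.endswith("/")  # zero-byte "folder" placeholder object
        folder_parts = parts if is_marker else parts[:-1]

        if not is_marker and folder_parts:
            ext = os.path.splitext(parts[-1])[1].lower()
            files["/".join(folder_parts)].add(ext)

        # Every proper ancestor of this folder contains a subfolder.
        for depth in range(1, len(folder_parts)):
            has_subfolder.add("/".join(folder_parts[:depth]))

    problems = []
    for folder in sorted(set(files) | has_subfolder):
        exts = files.get(folder, set())
        if exts and folder in has_subfolder:
            problems.append(f"{folder}: contains both folders and files")
        if exts - ALLOWED_EXTENSIONS:
            problems.append(f"{folder}: contains files that are not CSV or Parquet")
        if len(exts & ALLOWED_EXTENSIONS) > 1:
            problems.append(f"{folder}: mixes CSV and Parquet files")
    return problems


if __name__ == "__main__":
    example_keys = [
        "sales/2023/a.parquet",
        "sales/2024/b.parquet",
        "tokens/t1.csv",
        "tokens/t2.parquet",
    ]
    for problem in check_layout(example_keys):
        print(problem)
```

With the example keys, the `sales` folder passes (it holds only subfolders, each with Parquet files), while `tokens` is flagged for mixing formats.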