Data Source Directory Requirements

  • You can create multiple folders in your S3 bucket.
  • Each directory (folder) can contain either folders or files, but not both. The same rule applies to folders at the next level down.
  • All files within the same folder must have the same schema. 
  • All files within the same folder must be of the same format, either Parquet (preferred) or CSV*. You cannot have both formats in the same folder.    

*Ensure all CSV files, including token tables, have an identifier column of a non-string type. This can be as simple as a column of sequentially ascending numbers, e.g. 1, 2, 3.
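One way to add such a column is sketched below, using only the Python standard library. This is a minimal illustration, not a prescribed method: the function name add_row_id and the column name row_id are hypothetical, and the sketch assumes the first row of the CSV is a header.

```python
import csv
import io


def add_row_id(csv_text, column_name="row_id"):
    """Return csv_text with a leading column of ascending integers (1, 2, 3, ...).

    Assumption: the first row is a header row. "row_id" is a hypothetical
    column name chosen for illustration only.
    """
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for i, row in enumerate(reader):
        if i == 0:
            writer.writerow([column_name] + row)  # header row gets the column name
        else:
            writer.writerow([i] + row)  # data rows are numbered from 1
    return out.getvalue()


if __name__ == "__main__":
    print(add_row_id("token,value\nabc,x\ndef,y\n"))
```

For files on disk, the same loop can read from and write to file handles opened with newline="" instead of in-memory strings.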

As shown in the diagram, you can create multiple folders for a data source (see level 1). Within a level 1 folder, you can create either folders or files (see level 2). All files within a folder must share the same schema and the same format, either CSV or Parquet.
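The layout rules above can be checked before connecting a data source. The sketch below is a hedged illustration: it takes a plain list of object keys (for example, copied from a bucket listing) and reports folders that mix folders with files or mix file formats. The function name check_layout is hypothetical, the rule is assumed to apply to the bucket root as well, and schema consistency is not checked because that would require reading the files.

```python
from collections import defaultdict
from pathlib import PurePosixPath

ALLOWED_FORMATS = {".parquet", ".csv"}


def check_layout(keys):
    """Return a list of human-readable problems found in a list of object keys.

    Assumptions: keys use "/" as the separator, and keys ending in "/" are
    folder placeholders. Schemas are NOT compared here.
    """
    file_exts = defaultdict(set)   # folder -> extensions of files directly inside
    subfolders = defaultdict(set)  # folder -> names of immediate subfolders

    for key in keys:
        if key.endswith("/"):
            continue  # skip folder placeholder objects
        parts = PurePosixPath(key).parts
        folder = "/".join(parts[:-1])
        file_exts[folder].add(PurePosixPath(key).suffix.lower())
        # Record every ancestor -> child folder relationship along the path.
        for i in range(len(parts) - 1):
            subfolders["/".join(parts[:i])].add(parts[i])

    problems = []
    for folder in sorted(set(file_exts) | set(subfolders)):
        name = folder or "(bucket root)"
        exts = file_exts.get(folder, set())
        if exts and subfolders.get(folder):
            problems.append(f"{name}: contains both folders and files")
        if exts - ALLOWED_FORMATS:
            problems.append(f"{name}: contains files that are not Parquet or CSV")
        if len(exts) > 1:
            problems.append(f"{name}: mixes file formats ({', '.join(sorted(exts))})")
    return problems


if __name__ == "__main__":
    example_keys = [
        "sales/2023/part-0.parquet",
        "sales/2023/part-1.parquet",
        "tokens/tokens.csv",
    ]
    print(check_layout(example_keys))
```

An empty list means no violations of the folder and format rules were found in the supplied keys.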

Related articles

Setting up local data sources and deploying Workspaces with CCS
Configure your AWS S3 bucket policy
Connect to local data sources and crawl meta data in Senate
