To File - An Overview
The To File process in SYNTASA is used to export data from the SYNTASA environment into external file formats, such as CSV or plain text, and store it in connected cloud storage solutions like Amazon S3. This is particularly useful when you want to make the processed data available outside SYNTASA for reporting, data sharing, or downstream consumption.
The To File process reads from event stores produced by other processes, such as From DB or a code process (for example, Spark Processor or BQ Processor). Besides exporting data, it supports light transformations, such as renaming columns, before the output file is written.
Supported Output File Formats
The To File process supports two primary file formats:
- Standard Format (CSV File)
  - Generates a comma-separated values (CSV) file.
  - Suitable for tools and systems expecting tabular data.
- Key-Value Format (Text File)
  - Produces a text file where each row contains delimited key-value pairs.
  - Often used for flat file integrations or custom ingestion tools that expect such formats.
  - Each row may look like:
    city=Boston|population=675000|state=MA
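To make the difference between the two formats concrete, here is a minimal Python sketch that renders the same row both ways. It is an illustration only, not SYNTASA code: the function names `to_csv` and `to_key_value` are hypothetical, and the `|` and `=` separators are assumed from the sample row above.

```python
import csv
import io


def to_csv(rows, columns):
    """Render rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_key_value(rows, columns, pair_sep="|", kv_sep="="):
    """Render each row as delimited key=value pairs, one row per line.

    The separators are assumptions taken from the sample row in this page.
    """
    lines = (
        pair_sep.join(f"{col}{kv_sep}{row[col]}" for col in columns)
        for row in rows
    )
    return "\n".join(lines) + "\n"


rows = [{"city": "Boston", "population": 675000, "state": "MA"}]
columns = ["city", "population", "state"]

csv_text = to_csv(rows, columns)
kv_text = to_key_value(rows, columns)
```

In the CSV output the column names appear once in the header, while in the key-value output every row repeats them, which is why key-value files suit ingestion tools that parse each line independently.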
Common Use Cases
Here are some practical use cases where the To File process fits well:
- Exporting Transformed Data to Cloud Storage
  Suppose you have data in a database and want to save it as a CSV file after transformation. First pull the data into SYNTASA with the From DB process, then apply transformations with a code process (e.g., Spark Processor or BQ Processor), and finally send the transformed data as a file to cloud storage. Example pipeline:
  Postgres Database → FROM DB → Event Store (Output) → Spark Processor → Event Store (Output) → TO FILE → S3/GCS Connection
- Incremental Upload with Partitioned Data
  If the input data is partitioned (e.g., by date), you can configure the To File process to export only specific partitions. This keeps incremental exports efficient, because the entire dataset is not rewritten on each run.
- Integration with External Systems
  Many external systems require input data in specific formats. To File supports both CSV and key-value formats, enabling smooth integration into those workflows.
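The incremental export use case above can be pictured with a short Python sketch. It assumes date-keyed partitions held in a dictionary and a hypothetical helper `select_partitions`; SYNTASA's own partition selection is configured in the process settings and is not shown here.

```python
from datetime import date


def select_partitions(partitions, start, end):
    """Return the partition keys that fall within [start, end], oldest first."""
    return sorted(day for day in partitions if start <= day <= end)


# Hypothetical date-partitioned input: one list of rows per day.
partitions = {
    date(2024, 1, 1): [{"id": 1}],
    date(2024, 1, 2): [{"id": 2}, {"id": 3}],
    date(2024, 1, 3): [{"id": 4}],
}

# Export only the two most recent days instead of the whole dataset.
selected = select_partitions(partitions, date(2024, 1, 2), date(2024, 1, 3))
exported_rows = [row for day in selected for row in partitions[day]]
```

Only the rows in the selected partitions are written, so the data for January 1 is left untouched on this run.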
Key Features
- Supports CSV and Key-Value Outputs: Choose based on your integration needs.
- Flexible Column Mapping: Easily rename columns in the output.
- Cloud Storage Integration: Works with S3, GCS, Azure Storage, SFTP and other supported destinations.
- Partition-Based Processing: Write only the required partitions to reduce redundancy.
- File Name Customization with Parameters: Use dynamic variables in file names, such as @DATE to substitute the current date, @ROWS for the total number of rows in the output, and @COLUMN for the total number of columns in the output.
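As a rough illustration of how file name parameters could expand, the sketch below substitutes the three variables into a template. The helper `render_file_name` is hypothetical, and the `YYYY-MM-DD` date format is an assumption; the format SYNTASA actually uses for @DATE may differ.

```python
from datetime import date


def render_file_name(template, rows, columns, today=None):
    """Replace @DATE, @ROWS and @COLUMN in a file name template."""
    today = today or date.today()
    values = {
        "@DATE": today.strftime("%Y-%m-%d"),  # assumed date format
        "@ROWS": str(len(rows)),              # total rows in the output
        "@COLUMN": str(len(columns)),         # total columns in the output
    }
    for token, value in values.items():
        template = template.replace(token, value)
    return template


rows = [
    {"city": "Boston", "population": 675000, "state": "MA"},
    {"city": "Austin", "population": 960000, "state": "TX"},
]
columns = ["city", "population", "state"]

file_name = render_file_name(
    "export_@DATE_@ROWS_rows_@COLUMN_cols.csv",
    rows,
    columns,
    today=date(2024, 1, 15),
)
```

With two rows, three columns, and the fixed date passed in, the template expands to `export_2024-01-15_2_rows_3_cols.csv`.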