From File
📄️ Getting Started with From File
This guide serves as a table of contents for getting started with the From File process in Syntasa. Each section provides an in-depth look at the different aspects of using From File to ingest and process external data. Use the links below to navigate to the relevant topics for a detailed explanation and practical use cases.
📄️ From File - Overview
The From File process in Syntasa is designed to enable seamless ingestion of data from external files for analysis and processing. It allows users to import structured or unstructured data from various file formats, such as CSV, TSV, Apache logs, and more. The primary purpose of the From File process is to facilitate the extraction, transformation, and loading (ETL) of data into Syntasa for further processing in automated workflows. This process ensures that data from external files can be efficiently parsed, cleaned, and integrated into your data pipelines for actionable insights.
📄️ Input - 'From File' Process
The Input screen tells Syntasa where the source files are located (the connection and path) and supplies details about the files scheduled for ingestion.
📄️ Schema - 'From File' process
In the previous article, we learned how to configure the fields to locate the file for ingestion, set the data type (such as delimiter or Apache Logs), and utilize the incremental load feature to process files uploaded daily on an hourly basis. In this article, we will focus on configuring the schema.
📄️ Output - 'From File' Process
In the previous articles, we learned how to configure the Input and Schema screens. After configuring these screens, we can configure the Output screen to meet our requirements.
📄️ Understanding Date Manipulation in 'From File' Process
In some data processing workflows, especially in e-commerce applications, data for an entire day is generated and stored on the following day. As a result, while the data pertains to a specific date (yesterday), the file name contains today’s date. By default, when processing such files, the system assigns the current date to the output. However, there may be scenarios where you need to store the extracted data under its actual corresponding date rather than the date in the file name.
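The date shift described above can be sketched in plain Python. This is only an illustration of the idea, not Syntasa's implementation: the function name `data_date_from_filename`, the `YYYYMMDD` file-name convention, and the one-day offset are all assumptions made for the example.

```python
import re
from datetime import date, datetime, timedelta


def data_date_from_filename(filename, offset_days=-1):
    """Read a YYYYMMDD date from a file name and shift it by offset_days.

    Hypothetical helper: assumes the file name contains exactly one
    8-digit date and that the data belongs to the previous day.
    """
    match = re.search(r"\d{8}", filename)
    if match is None:
        raise ValueError(f"no YYYYMMDD date found in {filename!r}")
    file_date = datetime.strptime(match.group(0), "%Y%m%d").date()
    return file_date + timedelta(days=offset_days)


# The file is named for the day it was written; the data is from the day before.
print(data_date_from_filename("orders_20240302.csv"))
```

With an offset of -1, a file named for 2 March is stored under 1 March, which matches the "data for yesterday, file name for today" case.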
📄️ Ingesting Delimited files in 'From File' process
The From File process in Syntasa allows users to ingest structured data files, typically in delimited formats such as CSV or TSV. To successfully configure the ingestion, users need to define the delimiter, quote character, and escape character. This guide explains how to set up these parameters correctly for seamless file processing.
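To see how the delimiter, quote character, and escape character interact, here is a minimal sketch using Python's standard `csv` module. It is not how Syntasa parses files internally; the helper name `parse_delimited` and the pipe-delimited sample data are assumptions chosen for illustration.

```python
import csv
import io


def parse_delimited(text, delimiter, quotechar, escapechar):
    """Parse delimited text into a list of rows (hypothetical helper)."""
    reader = csv.reader(
        io.StringIO(text),
        delimiter=delimiter,
        quotechar=quotechar,
        escapechar=escapechar,
        doublequote=False,  # rely on the escape character, not doubled quotes
    )
    return list(reader)


# Sample data: the quote character protects a delimiter inside a field,
# and the escape character protects a quote inside a quoted field.
sample = "\n".join([
    'id|name|note',
    '1|"Smith| John"|plain',
    r'2|Lee|"said \"hi\""',
])

rows = parse_delimited(sample, delimiter="|", quotechar='"', escapechar="\\")
for row in rows:
    print(row)
```

The second row keeps `Smith| John` as a single field because it is quoted, and the third row keeps the inner quotes because each one is escaped.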
📄️ Ingesting Zip or Tar files in 'From File' process
Ingesting a Zip file in Syntasa follows a process similar to ingesting any other text file, with the additional step of specifying which files inside the Zip need to be picked up for processing. The system first detects the Zip file, then identifies the data files within it based on the file pattern provided.
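The two-step selection (find the archive, then pick members by pattern) can be illustrated with Python's standard `zipfile` and `fnmatch` modules. This is a sketch of the general idea only: the helper names, the glob-style pattern `events_*.csv`, and the in-memory sample archive are assumptions, and Syntasa's own pattern syntax may differ.

```python
import fnmatch
import io
import zipfile


def read_matching_members(archive_bytes, pattern):
    """Return {member_name: text} for Zip members whose name matches pattern."""
    matched = {}
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for name in archive.namelist():
            # Case-sensitive glob match against the member name.
            if fnmatch.fnmatchcase(name, pattern):
                matched[name] = archive.read(name).decode("utf-8")
    return matched


def build_sample_archive():
    """Create a small in-memory Zip so the example is self-contained."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("events_01.csv", "id,value\n1,a\n")
        archive.writestr("readme.txt", "not a data file\n")
    return buffer.getvalue()


files = read_matching_members(build_sample_archive(), "events_*.csv")
print(sorted(files))
```

Only the member matching the pattern is read; the other file in the archive is ignored.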