processor-batchify

Splits each CSV file with a header in /data/in/files or /data/in/tables into several smaller CSV files (batches), each with a header. Manifest files are ignored.

Usage

Sample configuration

{  
    "definition": {
        "component": "revolt-bi.processor-batchify"
    },
    "parameters": {
        "batch_size": 200
    }
}

Parameters

batch_size

The only parameter of this processor: the maximum number of lines (excluding the header line) in each output CSV file.
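The effect of batch_size on the number and size of the output files can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the processor's own code: the function name `batch_row_counts` is hypothetical, and it assumes that any leftover rows go into one final, smaller batch and that an input with no data rows produces no batches.

```python
def batch_row_counts(total_rows: int, batch_size: int) -> list[int]:
    """Return the number of data rows (header excluded) in each batch.

    Hypothetical helper: assumes leftover rows form one final, smaller batch.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    if total_rows < 0:
        raise ValueError("total_rows must not be negative")
    # divmod gives the number of full batches and the size of the remainder
    full_batches, remainder = divmod(total_rows, batch_size)
    counts = [batch_size] * full_batches
    if remainder:
        counts.append(remainder)
    return counts
```

For 2019 data rows and batch_size = 100, this gives twenty batches of 100 rows followed by one batch of 19 rows.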

Example

With batch_size = 100, the processor converts a table /data/in/files/example.csv of 2019 data rows plus a header row into 21 tables /data/in/files/example_1_1.csv ... /data/in/files/example_1_21.csv. Each of the first 20 tables contains 100 rows and a header; the last table contains the remaining 19 rows and a header.
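The splitting itself could look roughly like the following Python sketch, which uses only the standard library. It is not the processor's actual implementation. The function name `batchify`, the UTF-8 encoding, and the simplified `<name>_<n>.csv` output naming are all assumptions made for illustration; the processor's real naming scheme (`example_1_1.csv` and so on) is not reproduced here.

```python
import csv
from itertools import islice
from pathlib import Path


def batchify(src: Path, out_dir: Path, batch_size: int) -> list[Path]:
    """Split `src` into CSV files of at most `batch_size` data rows each.

    Every output file starts with a copy of the header row of `src`.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    out_dir.mkdir(parents=True, exist_ok=True)
    written: list[Path] = []
    with src.open(newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, None)
        if header is None:
            return written  # empty input: nothing to split
        index = 1
        while True:
            # Read the next batch lazily so the whole file never sits in memory
            rows = list(islice(reader, batch_size))
            if not rows:
                break
            # Hypothetical naming: <name>_<n>.csv, not the processor's scheme
            target = out_dir / f"{src.stem}_{index}{src.suffix}"
            with target.open("w", newline="", encoding="utf-8") as out:
                writer = csv.writer(out)
                writer.writerow(header)
                writer.writerows(rows)
            written.append(target)
            index += 1
    return written
```

Reading through `csv.reader` rather than counting raw lines keeps quoted fields that contain line breaks together in one row. Whether the processor itself counts physical lines or parsed rows is not stated in this document.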