File Readers

The common configuration for Connect File Pulse.

The connector can be configured with a specific FileInputReader. The FileInputReader is used by tasks to read scheduled source files.
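As a minimal sketch of how a reader might be selected, the Python snippet below builds the reader-related part of a connector configuration and serializes it to JSON, the shape expected by the Kafka Connect REST API. The property name `task.reader.class` is an assumption not stated on this page, as is the idea that reader options sit alongside it as ordinary connector properties. Check the configuration reference for your connector version before relying on either. The helper `reader_config` is hypothetical.

```python
import json

# Package that contains the readers documented on this page.
READER_PACKAGE = "io.streamthoughts.kafka.connect.filepulse.reader"


def reader_config(reader="RowFileInputReader", options=None):
    """Build the reader-related part of a connector configuration.

    ASSUMPTION: the reader is selected with 'task.reader.class' and its
    options are plain connector properties; verify for your version.
    """
    config = {"task.reader.class": f"{READER_PACKAGE}.{reader}"}
    # Kafka Connect configuration values are passed as strings.
    for key, value in (options or {}).items():
        config[key] = str(value)
    return config


if __name__ == "__main__":
    print(json.dumps(reader_config("RowFileInputReader", {"skip.headers": 1}), indent=2))
```

The returned dictionary would be merged with the rest of the connector configuration (connector class, topic, and so on), which is outside the scope of this sketch.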

RowFileInputReader (default)

The RowFileInputReader reads files from the local file system line by line. This reader creates one record per row. It should be used for reading delimited text files, application log files, etc.

The following provides usage information for io.streamthoughts.kafka.connect.filepulse.reader.RowFileInputReader (source code)

Configuration

| Configuration | Description | Type | Default | Importance |
|---|---|---|---|---|
| file.encoding | The text file encoding to use. | String | UTF_8 | High |
| buffer.initial.bytes.size | The initial buffer size used to read input files. | String | 4096 | Medium |
| min.read.records | The minimum number of records to read from the file before returning to the task. | Integer | 1 | Medium |
| skip.headers | The number of rows to skip at the beginning of the file. | Integer | 0 | Medium |
| skip.footers | The number of rows to skip at the end of the file. | Integer | 0 | Medium |
| read.max.wait.ms | The maximum time to wait, in milliseconds, for more bytes after hitting the end of the file. | Long | 0 | Medium |
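To illustrate what skip.headers and skip.footers mean for a row-oriented reader, here is a small Python sketch. It is not the connector's implementation. The function `read_rows` is hypothetical and only models the documented behavior: one record per row, with a fixed number of leading and trailing rows dropped.

```python
from collections import deque


def read_rows(lines, skip_headers=0, skip_footers=0):
    """Yield one record per row, dropping leading and trailing rows."""
    held_back = deque()  # rows that may still turn out to be footers
    for index, line in enumerate(lines):
        if index < skip_headers:
            continue  # header row: never emitted
        held_back.append(line.rstrip("\n"))
        # A row is safe to emit once more than skip_footers rows follow it.
        if len(held_back) > skip_footers:
            yield held_back.popleft()


if __name__ == "__main__":
    sample = ["id,name\n", "1,alice\n", "2,bob\n", "total=2\n"]
    for record in read_rows(sample, skip_headers=1, skip_footers=1):
        print(record)
```

Footer rows can only be recognized once the end of the file is known, which is why the sketch holds back the last skip_footers rows instead of emitting them immediately.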

BytesArrayInputReader

The BytesArrayInputReader creates a single byte array record from a source file.

The following provides usage information for io.streamthoughts.kafka.connect.filepulse.reader.BytesArrayInputReader (source code)

AvroFileInputReader

The AvroFileInputReader is used to read Avro files.

The following provides usage information for io.streamthoughts.kafka.connect.filepulse.reader.AvroFileInputReader (source code)

XMLFileInputReader

The XMLFileInputReader is used to read XML files.

The following provides usage information for io.streamthoughts.kafka.connect.filepulse.reader.XMLFileInputReader (source code)

Configuration

| Configuration | Description | Type | Default | Importance |
|---|---|---|---|---|
| xpath.expression | The XPath expression used to extract data from XML input files. | String | / | High |
| xpath.result.type | The expected result type for the XPath expression, one of [NODESET, STRING]. | String | NODESET | High |
| force.array.on.fields | The comma-separated list of fields for which an array type must be forced. | List | - | High |
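The options above can be illustrated with a short Python sketch. It is an approximation, not the connector's implementation. It uses the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath, so the expressions are written relative to the root element. The functions `evaluate` and `node_to_record`, and the way nodes are mapped to records, are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def node_to_record(node, force_array_fields=()):
    """Convert the child elements of a node into a flat record (assumed mapping)."""
    grouped = {}
    for child in node:
        grouped.setdefault(child.tag, []).append(child.text)
    # Repeated fields become arrays; forced fields are arrays even when single.
    return {
        tag: values if len(values) > 1 or tag in force_array_fields else values[0]
        for tag, values in grouped.items()
    }


def evaluate(xml_text, expression, result_type="NODESET", force_array_fields=()):
    """Evaluate an expression as a node set (one record per node) or a string."""
    root = ET.fromstring(xml_text)
    if result_type == "NODESET":
        return [node_to_record(n, force_array_fields) for n in root.findall(expression)]
    if result_type == "STRING":
        # Text of the first matching element, or "" when nothing matches.
        return root.findtext(expression, default="")
    raise ValueError(f"unsupported result type: {result_type}")


XML = """<library>
  <book><title>Dune</title><tag>scifi</tag><tag>classic</tag></book>
  <book><title>Emma</title><tag>classic</tag></book>
</library>"""

if __name__ == "__main__":
    print(evaluate(XML, "./book"))
    print(evaluate(XML, "./book", force_array_fields=("tag",)))
    print(evaluate(XML, "./book/title", result_type="STRING"))
```

In this sketch, the second book has a single `tag` element, so without forcing it maps to a plain string while the first book maps to an array. Listing `tag` as a forced field gives both records the same shape, which is presumably the motivation for force.array.on.fields.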

FileInputMetadataReader

The FileInputMetadataReader is used to send a single record per file containing the file's metadata (e.g., name, path, hash, lastModified, size).

The following provides usage information for io.streamthoughts.kafka.connect.filepulse.reader.FileInputMetadataReader (source code)
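As a rough picture of what a metadata-only record could look like, the Python sketch below collects the fields named above for a local file. The function `file_metadata` is hypothetical. The hash algorithm (SHA-256) and the millisecond timestamp are assumptions, since this page does not specify how the connector computes them.

```python
import hashlib
from pathlib import Path


def file_metadata(path):
    """Build a single metadata record for a file without emitting its content."""
    p = Path(path)
    stat = p.stat()
    return {
        "name": p.name,
        "path": str(p.resolve()),
        # ASSUMPTION: SHA-256 over the content; the connector may hash differently.
        "hash": hashlib.sha256(p.read_bytes()).hexdigest(),
        # ASSUMPTION: last modification time in epoch milliseconds.
        "lastModified": int(stat.st_mtime * 1000),
        "size": stat.st_size,
    }
```

A reader of this kind is useful when downstream consumers only need to know that a file arrived, not what it contains.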