HDFS2

Description: Camel HDFS support with Hadoop 2.x libraries
Scheme: hdfs2
Syntax: hdfs2:hostName:port/path
Maven: org.apache.camel/camel-hdfs2/2.16.1
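
To illustrate the syntax above, the following is a minimal sketch of how an endpoint URI string could be assembled from a host, port, path, and options. The class `Hdfs2UriSketch` and its `endpointUri` method are hypothetical helpers written for this example and are not part of Camel. The sketch assumes options are appended as `?key=value&key=value` query parameters and does no URL encoding of values.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper (not part of Camel) that assembles an hdfs2 endpoint URI string. */
final class Hdfs2UriSketch {

    /** Builds hdfs2:hostName:port/path, followed by ?key=value&... when options are given. */
    static String endpointUri(String hostName, int port, String path, Map<String, String> options) {
        // Strip a leading slash so the result has exactly one slash after the port.
        String cleanPath = path.startsWith("/") ? path.substring(1) : path;
        StringBuilder uri = new StringBuilder("hdfs2:")
                .append(hostName).append(':').append(port).append('/').append(cleanPath);
        if (!options.isEmpty()) {
            // Assumption: options are passed as plain query parameters, without URL encoding.
            StringJoiner query = new StringJoiner("&", "?", "");
            options.forEach((key, value) -> query.add(key + "=" + value));
            uri.append(query);
        }
        return uri.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the options in insertion order.
        Map<String, String> options = new LinkedHashMap<>();
        options.put("fileType", "SEQUENCE_FILE");
        options.put("replication", "3");
        System.out.println(endpointUri("localhost", 8020, "/tmp/camel", options));
    }
}
```

The resulting string would then be used wherever Camel expects an endpoint URI, for example in a route definition. The host `localhost` and the path `/tmp/camel` are placeholders.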
Name Kind Group Required Default Type Enum Description
hostName path common true java.lang.String HDFS host to use
port path common 8020 int HDFS port to use
path path common true java.lang.String The directory path to use
blockSize parameter common 67108864 long The size of the HDFS blocks
bufferSize parameter common 4096 int The buffer size used by HDFS
checkIdleInterval parameter common 500 int How often (in milliseconds) to run the idle checker background task. This option is only used if the split strategy is IDLE.
chunkSize parameter common 4096 int When reading a normal file, the file is split into chunks of this size, producing one message per chunk.
compressionCodec parameter common DEFAULT org.apache.camel.component.hdfs2.HdfsCompressionCodec DEFAULT, GZIP, BZIP2 The compression codec to use
compressionType parameter common NONE org.apache.hadoop.io.SequenceFile.CompressionType The compression type to use (by default, compression is not used)
connectOnStartup parameter common true boolean Whether to connect to the HDFS file system when the producer/consumer starts. If false, the connection is created on demand. Note that HDFS may take up to 15 minutes to establish a connection, because it has a hardcoded 45 x 20 sec redelivery. Setting this option to false allows your application to start up without blocking for up to 15 minutes.
fileSystemType parameter common HDFS org.apache.camel.component.hdfs2.HdfsFileSystemType LOCAL, HDFS Set to LOCAL to use the local java.io.File instead of HDFS.
fileType parameter common NORMAL_FILE org.apache.camel.component.hdfs2.HdfsFileType NORMAL_FILE, SEQUENCE_FILE, MAP_FILE, BLOOMMAP_FILE, ARRAY_FILE The file type to use. For more details, see the Hadoop HDFS documentation about the various file types.
keyType parameter common NULL org.apache.camel.component.hdfs2.HdfsWritableFactories.WritableType The type for the key in case of sequence or map files.
openedSuffix parameter common opened java.lang.String When a file is opened for reading/writing, it is renamed with this suffix so that it is not read during the writing phase.
owner parameter common java.lang.String The file owner must match this owner for the consumer to pick up the file. Otherwise the file is skipped.
readSuffix parameter common read java.lang.String Once the file has been read, it is renamed with this suffix so that it is not read again.
replication parameter common 3 short The HDFS replication factor
splitStrategy parameter common java.lang.String In the current version of Hadoop, opening a file in append mode is disabled because it is not very reliable, so for the moment it is only possible to create new files. The Camel HDFS endpoint works around this as follows:
  • If the split strategy option has been defined, the HDFS path is used as a directory and files are created using the configured UuidGenerator.
  • Every time a splitting condition is met, a new file is created.
The splitStrategy option is defined as a string with the following syntax:
splitStrategy=ST:value,ST:value,...
where ST can be:
  • BYTES: a new file is created, and the old one is closed, when the number of written bytes is more than value
  • MESSAGES: a new file is created, and the old one is closed, when the number of written messages is more than value
  • IDLE: a new file is created, and the old one is closed, when no writing has happened in the last value milliseconds
valueType parameter common BYTES org.apache.camel.component.hdfs2.HdfsWritableFactories.WritableType The type for the value in case of sequence or map files.
delay parameter consumer 1000 long The interval (milliseconds) between the directory scans.
initialDelay parameter consumer long For the consumer, how long to wait (in milliseconds) before starting to scan the directory.
pattern parameter consumer * java.lang.String The pattern used for scanning the directory
append parameter producer boolean Append to existing file. Notice that not all HDFS file systems support the append option.
overwrite parameter producer true boolean Whether to overwrite existing files with the same name
exchangePattern parameter advanced InOnly org.apache.camel.ExchangePattern InOnly, RobustInOnly, InOut, InOptionalOut, OutOnly, RobustOutOnly, OutIn, OutOptionalIn Sets the default exchange pattern when creating an exchange
synchronous parameter advanced false boolean Sets whether synchronous processing should be strictly used, or Camel is allowed to use asynchronous processing (if supported).
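
The splitStrategy syntax described in the table (`ST:value,ST:value,...`) can be illustrated with a small parser. This is a minimal sketch and not Camel's actual implementation: the class `SplitStrategySketch`, its `Rule` type, and the `parse` and `shouldSplit` methods are hypothetical names introduced for this example. It assumes that BYTES and MESSAGES trigger when the counter is strictly greater than the value, and that IDLE triggers once the idle time reaches the value.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch (not Camel's code) of parsing and evaluating a splitStrategy string. */
final class SplitStrategySketch {

    /** The three strategy names listed for the splitStrategy option. */
    enum Kind { BYTES, MESSAGES, IDLE }

    /** One ST:value pair from the option string. */
    static final class Rule {
        final Kind kind;
        final long value;

        Rule(Kind kind, long value) {
            this.kind = kind;
            this.value = value;
        }
    }

    /** Parses "ST:value,ST:value,..." into rules; throws IllegalArgumentException on bad input. */
    static List<Rule> parse(String option) {
        List<Rule> rules = new ArrayList<>();
        for (String token : option.split(",")) {
            String[] parts = token.trim().split(":");
            if (parts.length != 2) {
                throw new IllegalArgumentException("Expected ST:value but got: " + token);
            }
            // Kind.valueOf and Long.parseLong both reject unknown names and non-numeric values.
            rules.add(new Rule(Kind.valueOf(parts[0].trim()), Long.parseLong(parts[1].trim())));
        }
        return rules;
    }

    /** Returns true when any rule says the current file should be closed and a new one started. */
    static boolean shouldSplit(List<Rule> rules, long bytesWritten, long messagesWritten, long idleMillis) {
        for (Rule rule : rules) {
            switch (rule.kind) {
                case BYTES:
                    if (bytesWritten > rule.value) return true;
                    break;
                case MESSAGES:
                    if (messagesWritten > rule.value) return true;
                    break;
                case IDLE:
                    if (idleMillis >= rule.value) return true;
                    break;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Rule> rules = parse("BYTES:1024,IDLE:5000");
        // 2048 bytes written exceeds the BYTES threshold, so a new file would be started.
        System.out.println(shouldSplit(rules, 2048, 1, 0));
    }
}
```

Combining several rules in one string means a new file is started as soon as any one of the conditions is met, which is how the sketch reads the comma-separated form.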

hdfs2 consumer