Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of data. It has a simple architecture based on streaming data flows. It is robust and fault tolerant, with tunable reliability and recovery mechanisms. It also uses a simple, extensible data model that allows for online analytic applications.
In this module, you will learn how to create Flume agents and integrate Flume with a Kafka broker.
- What is Apache Flume
- Flume architecture and aggregation flow
- Understanding Flume components such as sources and sinks
- Flume channels to buffer events
- Aggregating streams using Fan-in
- Separating streams using Fan-out
- Internals of the agent architecture
- Production architecture of Flume
- Collecting data from different sources to Hadoop HDFS
- Flume & Kafka integration
- Flume & HDFS agent creation
- Flume & Kafka agent creation
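
To make the agent-creation topics above more concrete, here is a minimal sketch of what an agent definition can look like. A Flume agent is described in a Java-properties file that names its sources, channels, and sinks and wires them together. The small Python helper below (`render_agent` is a hypothetical name, not part of Flume) only renders such a file. The agent name `a1`, the component names, the host names, ports, HDFS path, and Kafka topic are all illustrative assumptions. The example shows fan-out: one source writes to two memory channels, one drained by an HDFS sink and the other by a Kafka sink.

```python
def render_agent(agent, source, channels, sinks):
    """Render a Flume agent definition as Java-properties text.

    source:   (name, {property: value})
    channels: {name: {property: value}}
    sinks:    {name: (channel_name, {property: value})}
    """
    src_name, src_props = source
    # Declare which components belong to this agent.
    lines = [
        f"{agent}.sources = {src_name}",
        f"{agent}.channels = {' '.join(channels)}",
        f"{agent}.sinks = {' '.join(sinks)}",
    ]
    for key, value in src_props.items():
        lines.append(f"{agent}.sources.{src_name}.{key} = {value}")
    # A source may feed several channels (fan-out): note the plural "channels".
    lines.append(f"{agent}.sources.{src_name}.channels = {' '.join(channels)}")
    for ch_name, ch_props in channels.items():
        for key, value in ch_props.items():
            lines.append(f"{agent}.channels.{ch_name}.{key} = {value}")
    for sink_name, (ch_name, sink_props) in sinks.items():
        for key, value in sink_props.items():
            lines.append(f"{agent}.sinks.{sink_name}.{key} = {value}")
        # A sink drains exactly one channel: note the singular "channel".
        lines.append(f"{agent}.sinks.{sink_name}.channel = {ch_name}")
    return "\n".join(lines) + "\n"


# All names, hosts, ports, paths, and topics below are assumptions for illustration.
config = render_agent(
    agent="a1",
    source=("r1", {"type": "netcat", "bind": "localhost", "port": 44444}),
    channels={
        "c1": {"type": "memory", "capacity": 1000},
        "c2": {"type": "memory", "capacity": 1000},
    },
    sinks={
        "k1": ("c1", {"type": "hdfs",
                      "hdfs.path": "hdfs://namenode:8020/flume/events"}),
        "k2": ("c2", {"type": "org.apache.flume.sink.kafka.KafkaSink",
                      "kafka.bootstrap.servers": "localhost:9092",
                      "kafka.topic": "flume-events"}),
    },
)
print(config)
```

If the printed text is saved to a file, for example `example.conf`, an agent could typically be started with `flume-ng agent --conf conf --conf-file example.conf --name a1`, provided the HDFS and Kafka endpoints assumed above actually exist in your environment.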