Catatan Seekor: Kafka

Apache Kafka is a distributed streaming platform designed for high-throughput, fault-tolerant handling of real-time data feeds.

Fundamental

Kafka Concepts

  • Producer: An application that sends data to Kafka

  • Consumer: An application that reads data from Kafka

  • Topic: A category or feed name under which data is stored

  • Partition: Each topic is divided into one or more partitions; each partition is an ordered, append-only log

  • Broker: A Kafka server that stores data

  • Cluster: A group of brokers working together
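To make the concepts above concrete, here is a minimal in-memory sketch in plain Python. It does not use any Kafka client; the class names `Partition`, `Topic`, and `Broker` and the topic name `orders` are hypothetical and exist only for illustration. It assumes the usual model in which a partition is an append-only log and a record's position in that log is its offset.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """An ordered, append-only log; a record's index is its offset."""
    records: list = field(default_factory=list)

    def append(self, value):
        self.records.append(value)
        return len(self.records) - 1  # offset of the record just written


@dataclass
class Topic:
    name: str
    partitions: list

    @classmethod
    def create(cls, name, num_partitions):
        return cls(name, [Partition() for _ in range(num_partitions)])


class Broker:
    """Hypothetical stand-in for a Kafka server that stores topic data."""

    def __init__(self):
        self.topics = {}

    def create_topic(self, name, num_partitions):
        self.topics[name] = Topic.create(name, num_partitions)

    def produce(self, topic, partition, value):
        # Producer side: append a record and get its offset back.
        return self.topics[topic].partitions[partition].append(value)

    def fetch(self, topic, partition, offset):
        # Consumer side: read from an offset onward; reading removes nothing.
        return self.topics[topic].partitions[partition].records[offset:]


broker = Broker()
broker.create_topic("orders", num_partitions=2)
first = broker.produce("orders", 0, "order-1")
second = broker.produce("orders", 0, "order-2")
fetched = broker.fetch("orders", 0, offset=1)
```

Because reads are driven by an offset and do not delete records, several consumers can read the same partition independently, each tracking its own position.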

Kafka Architecture

┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│  Producer   │───▶│    Topic    │───▶│  Consumer   │
└─────────────┘    └─────────────┘    └─────────────┘
                          │
                          ▼
                   ┌─────────────┐
                   │  Partition  │
                   └─────────────┘
                          │
                          ▼
                   ┌─────────────┐
                   │    Broker   │
                   └─────────────┘

Topic and Partition

Topic Configuration

Partition Strategy
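A common partition strategy is to hash the record key so that all records with the same key land in the same partition, which preserves per-key ordering. The sketch below illustrates that idea only. The class `KeyHashPartitioner` is hypothetical, and CRC32 is a stand-in hash chosen for this example; the Kafka Java client uses murmur2, so the partition numbers here will not match a real cluster. Round-robin for keyless records is also a simplification.

```python
import itertools
import zlib


class KeyHashPartitioner:
    """Illustrative key-based partitioner (not Kafka's implementation)."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._round_robin = itertools.cycle(range(num_partitions))

    def partition(self, key):
        if key is None:
            # Assumption for this sketch: spread keyless records evenly.
            return next(self._round_robin)
        # Same key -> same hash -> same partition.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions


partitioner = KeyHashPartitioner(num_partitions=3)
same_key = [partitioner.partition("user-42") for _ in range(3)]
no_key = [partitioner.partition(None) for _ in range(4)]
```

Note that changing the number of partitions changes the result of the modulo, so records for an existing key may start landing in a different partition.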

Producer

Basic Producer

Producer with Configuration

Consumer

Basic Consumer

Consumer Groups
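Within a consumer group, each partition is read by exactly one member, so the partitions have to be divided among the consumers. The sketch below shows a range-style division for a single topic, modeled on the idea behind Kafka's range assignor: sort the members, give each an equal contiguous block, and hand the remainder to the first members. The function name `range_assign` and the consumer IDs are hypothetical, and the real assignor runs inside the group protocol rather than as a standalone function.

```python
def range_assign(consumers, num_partitions):
    """Split partitions 0..num_partitions-1 into contiguous ranges."""
    members = sorted(consumers)
    per_consumer, extra = divmod(num_partitions, len(members))
    assignment = {}
    start = 0
    for index, member in enumerate(members):
        # The first `extra` members each take one additional partition.
        count = per_consumer + (1 if index < extra else 0)
        assignment[member] = list(range(start, start + count))
        start += count
    return assignment


five_partitions = range_assign(["consumer-b", "consumer-a"], 5)
too_many_consumers = range_assign(["c1", "c2", "c3"], 2)
```

The second call shows why the partition count caps parallelism: with more consumers than partitions, the extra consumer receives nothing and sits idle.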

Streams API

Word Count Example
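The word count example is typically built as a small pipeline: split each line into words, group by word, and keep a running count per word. The sketch below reproduces that logic in plain Python rather than with the Kafka Streams API, so the function name `word_count_updates` and the sample lines are assumptions made for illustration. It records every count change, loosely mirroring how a continuously updated count table emits a new value each time a word is seen.

```python
from collections import Counter


def word_count_updates(lines):
    """Return final counts plus the sequence of (word, count) updates."""
    counts = Counter()
    updates = []
    for line in lines:
        # Split step: one input line becomes many lowercase words.
        for word in line.lower().split():
            # Group-and-count step: bump the running total for this word.
            counts[word] += 1
            updates.append((word, counts[word]))
    return counts, updates


counts, updates = word_count_updates(["hello kafka", "Hello streams"])
```

A real stream processor may batch or cache intermediate updates, so downstream consumers might see fewer updates than this sketch records.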

Connect API

File Source Connector
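A file source connector is configured with a handful of key/value properties. The sketch below builds such a configuration as a Python dict and renders it in `.properties` form. The property keys follow the file source example shipped with Kafka; the connector name, file path, and topic are placeholder values, and `to_properties` is a hypothetical helper. The file connectors are meant for demos and may need to be added to the Connect plugin path in recent Kafka versions.

```python
def to_properties(config):
    """Render a dict as .properties lines, keeping insertion order."""
    return "\n".join(f"{key}={value}" for key, value in config.items())


file_source = {
    "name": "local-file-source",            # placeholder connector name
    "connector.class": "FileStreamSource",  # example connector from Kafka
    "tasks.max": 1,
    "file": "test.txt",                     # placeholder input file
    "topic": "connect-test",                # placeholder target topic
}

properties_text = to_properties(file_source)
```

Each line appended to the input file would be published as one record to the configured topic.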

JDBC Sink Connector

Configuration

Broker Configuration

Producer Configuration

Consumer Configuration

Monitoring

JMX Metrics

Kafka Manager

Best Practices

Performance Tuning

  • Partition Count: Choose a partition count that fits the target throughput and consumer parallelism

  • Replication Factor: Use 3 for production environments

  • Batch Size: Tune the producer batch size (batch.size)

  • Buffer Memory: Configure adequate producer buffer memory (buffer.memory)

  • Compression: Enable compression (compression.type) for better throughput
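To see why batch size matters for throughput, the sketch below groups records into batches capped at a byte limit, so fewer and larger requests are sent. It is a simplified model only: the function `batch_records` is hypothetical, it ignores per-record overhead and compression, and a real producer batches per partition and also sends a batch once `linger.ms` has elapsed.

```python
def batch_records(records, batch_size):
    """Group byte strings into batches of at most batch_size bytes."""
    batches = []
    current, current_bytes = [], 0
    for record in records:
        # Close the current batch if this record would overflow it.
        if current and current_bytes + len(record) > batch_size:
            batches.append(current)
            current, current_bytes = [], 0
        current.append(record)
        current_bytes += len(record)
    if current:
        batches.append(current)  # flush the final partial batch
    return batches


records = [b"aaaa", b"bbbb", b"cc", b"dddddd"]
small_batches = batch_records(records, batch_size=4)
large_batches = batch_records(records, batch_size=8)
```

With the larger limit the same four records go out in two batches instead of four, which is the trade the batch size setting controls: fewer requests in exchange for more memory and slightly more latency.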

Reliability

  • Acks: Use acks=all for critical data

  • Retries: Configure producer retries so transient failures do not drop records

  • Replication: Keep enough in-sync replicas to survive a broker failure

  • Monitoring: Monitor consumer lag and throughput

  • Backup: Back up critical topics regularly
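The reliability points above can be sketched in two parts: a set of producer and topic settings, and a lag calculation for monitoring. The config keys are standard Kafka settings, but the values are illustrative assumptions rather than recommendations from this page, and `consumer_lag` is a hypothetical helper. Lag is taken here as the log end offset minus the group's committed offset for each partition.

```python
# Producer-side settings (values are illustrative).
reliable_producer = {
    "acks": "all",                 # wait for all in-sync replicas
    "retries": 5,                  # retry transient send failures
    "enable.idempotence": "true",  # avoid duplicates caused by retries
}

# Topic-side setting, commonly paired with a replication factor of 3.
reliable_topic = {"min.insync.replicas": 2}


def consumer_lag(end_offsets, committed_offsets):
    """Per-partition lag: records written but not yet committed as read."""
    return {
        partition: end - committed_offsets.get(partition, 0)
        for partition, end in end_offsets.items()
    }


lag = consumer_lag(
    end_offsets={0: 120, 1: 80},
    committed_offsets={0: 100, 1: 80},
)
total_lag = sum(lag.values())
```

A lag that keeps growing over time suggests the consumers cannot keep up with the producers, which is usually the signal to alert on.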

Security

  • Authentication: Enable SASL authentication

  • Authorization: Configure ACLs for topics

  • Encryption: Enable SSL/TLS encryption

  • Audit: Enable audit logging

  • Network: Restrict network access
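On the client side, authentication and encryption are largely selected through the `security.protocol` setting, which takes one of four values. The sketch below maps each value to whether it provides TLS encryption and SASL authentication, then flags what a given client config is missing. The function `security_gaps` and the sample config are hypothetical; the check is deliberately coarse and does not cover ACLs, audit logging, or mutual TLS.

```python
# What each security.protocol value provides.
SECURITY_PROTOCOLS = {
    "PLAINTEXT": {"tls": False, "sasl": False},
    "SSL": {"tls": True, "sasl": False},
    "SASL_PLAINTEXT": {"tls": False, "sasl": True},
    "SASL_SSL": {"tls": True, "sasl": True},
}


def security_gaps(client_config):
    """List what the chosen protocol leaves unprotected."""
    # Kafka clients default to PLAINTEXT when the setting is absent.
    protocol = client_config.get("security.protocol", "PLAINTEXT")
    features = SECURITY_PROTOCOLS[protocol]
    gaps = []
    if not features["tls"]:
        gaps.append("traffic is not encrypted")
    if not features["sasl"]:
        gaps.append("no SASL authentication")
    return gaps


secured_client = {
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "SCRAM-SHA-512",  # illustrative mechanism choice
}

default_gaps = security_gaps({})
secured_gaps = security_gaps(secured_client)
```

A check like this could run in a deployment pipeline to catch clients that silently fall back to an unencrypted, unauthenticated connection.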

References

Kafka Resources

Additional Resources

  • Apache Kafka Documentation: https://kafka.apache.org/documentation/

  • Confluent Platform: https://docs.confluent.io/

  • Kafka Streams: https://kafka.apache.org/documentation/streams/

  • Kafka Connect: https://kafka.apache.org/documentation/#connect
