Kafka vs RabbitMQ

This is the third part of the series; the already published Parts I and II can be found below:

  • Part I: Kafka vs RabbitMQ: Are RabbitMQ and Kafka the same?
  • Part II: Kafka vs RabbitMQ: What is the difference between Kafka and RabbitMQ?

We have seen how RabbitMQ and Kafka are two different tools that are often confused with each other.

Now let us see the common use cases associated with them.

RabbitMQ

RabbitMQ can be used when web servers need to respond to requests quickly, since resource-intensive work can be deferred instead of being performed while the user waits for a result. RabbitMQ is also used to deliver a message to multiple recipients for consumption, or to share load between workers under high throughput (20K+ messages/second).

Scenarios that RabbitMQ can be used for:

  • Applications that need to support legacy protocols, such as STOMP, MQTT, and AMQP 0-9-1
  • Granular control over consistency and delivery guarantees on a per-message basis (a sketch follows this list)
  • Complex routing to consumers
  • Applications that need a mix of publish/subscribe, point-to-point, and request/reply messaging capabilities

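As a rough illustration of that per-message control, here is a minimal sketch assuming a local RabbitMQ broker and the Python `pika` client; the `tasks` queue name is made up for the example. Each message can individually be marked persistent or transient, and publisher confirms tell the publisher whether the broker accepted the message.

```python
import pika
from pika.exceptions import UnroutableError

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="tasks", durable=True)

# Publisher confirms: the broker acknowledges (or rejects) each publish.
channel.confirm_delivery()

try:
    channel.basic_publish(
        exchange="",
        routing_key="tasks",
        body=b"important job",
        mandatory=True,  # fail if the message cannot be routed to a queue
        properties=pika.BasicProperties(delivery_mode=2),  # this message is persistent
    )
except UnroutableError:
    # The broker could not route the message; handle or retry here.
    print("message was not routed")

# A less important message can be sent as transient (delivery_mode=1).
channel.basic_publish(
    exchange="",
    routing_key="tasks",
    body=b"best-effort job",
    properties=pika.BasicProperties(delivery_mode=1),
)
connection.close()
```
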
RabbitMQ is the obvious option if you want a simple, traditional pub/sub message broker. If my requirements were simple enough to be handled with communication across channels/queues, and retention and streaming were not a requirement, I would prefer RabbitMQ.

There are two main situations where we would choose RabbitMQ: for long-running tasks, when we need to run reliable background jobs; and for communication and integration between microservices/applications, where one part of a system simply needs to notify another part to start working on a task, like order handling in an e-commerce application (order placed, update order status, send order, payment, etc.).

LONG-RUNNING TASKS

Message queues enable asynchronous processing, meaning that they allow you to put a message in a queue without processing it immediately. RabbitMQ is ideal for long-running tasks.

An example is an application where users upload details and the application generates a PDF and sends an email. This can be resource-intensive and time-consuming, so we can process it asynchronously using RabbitMQ.

RabbitMQ queues often serve as event buses, allowing web servers to respond quickly to requests instead of being forced to perform computationally intensive tasks on the spot.
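
A minimal sketch of this pattern, assuming a local RabbitMQ broker and the Python `pika` client; the `pdf_jobs` queue and the job fields are invented for illustration. The web handler enqueues the job and returns immediately, while a worker (normally a separate process) consumes the queue and does the slow work.

```python
import json
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="pdf_jobs", durable=True)  # queue survives broker restarts

# --- web server side: enqueue the job and return to the user immediately ---
job = {"user_id": 42, "template": "invoice"}
channel.basic_publish(
    exchange="",
    routing_key="pdf_jobs",
    body=json.dumps(job).encode(),
    properties=pika.BasicProperties(delivery_mode=2),  # persistent message
)

# --- worker side (normally a separate process): do the slow work ---
def handle_job(ch, method, properties, body):
    job = json.loads(body)
    # ... generate the PDF and send the email here ...
    ch.basic_ack(delivery_tag=method.delivery_tag)  # ack only once the work is done

channel.basic_qos(prefetch_count=1)  # at most one unacknowledged job per worker
channel.basic_consume(queue="pdf_jobs", on_message_callback=handle_job)
channel.start_consuming()
```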

MIDDLEMAN IN A MICROSERVICE ARCHITECTURE

RabbitMQ is also used in microservice architectures, where it serves as a means of communication between applications and helps avoid bottlenecks when passing messages.

An example is an e-commerce app where different microservices are orchestrated using a message broker like RabbitMQ.
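
As a rough sketch of such orchestration, assuming the `pika` client and a hypothetical topic exchange named `orders`: the order service publishes events, and each downstream microservice binds its own queue to just the routing keys it cares about.

```python
import json
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()

# A topic exchange lets each microservice subscribe only to the
# order events it cares about (complex routing).
channel.exchange_declare(exchange="orders", exchange_type="topic", durable=True)

# Order service: publish an "order placed" event.
event = {"order_id": "A-1001", "total": 59.90}
channel.basic_publish(
    exchange="orders",
    routing_key="order.placed",
    body=json.dumps(event).encode(),
)

# Payment service: bind its own queue to the events it handles.
channel.queue_declare(queue="payment_service", durable=True)
channel.queue_bind(queue="payment_service", exchange="orders", routing_key="order.placed")

# Shipping service: only interested in paid orders.
channel.queue_declare(queue="shipping_service", durable=True)
channel.queue_bind(queue="shipping_service", exchange="orders", routing_key="order.paid")

connection.close()
```

A topic exchange is only one choice here; direct or fanout exchanges work the same way when the routing needs are simpler.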


Apache Kafka

Kafka supports use cases such as metrics, activity tracking, log aggregation, stream processing, commit logs, and event sourcing.

The following messaging scenarios are especially suited for Kafka:

  • Streams with complex routing and a throughput of 100K events/second or more, with at-least-once partitioned ordering
  • Applications that require a stream history, delivered with at-least-once partitioned ordering; clients can replay the event stream
  • Event sourcing, modeling changes to a system as a sequence of events
  • Stream processing of data in multi-stage pipelines, where the pipelines form graphs of real-time data flows

In general, if you want a framework for storing, reading (and re-reading), and analyzing streaming data, use Apache Kafka. It's ideal for systems that are audited or that need to retain messages for a long time. These use cases fall into two broad categories: data analysis (tracking, ingestion, logging, security, and so on) and real-time processing.

DATA ANALYSIS: TRACKING, INGESTION, LOGGING, SECURITY

In all these cases, large amounts of data need to be collected, stored, and handled. Companies that need to gain insights into their data, provide search features, or audit and analyze large volumes of data justify the use of Kafka.

According to the creators of Apache Kafka, the original use case for Kafka was to track website activity including page views, searches, uploads or other actions users may take. This kind of activity tracking often requires a very high volume of throughput, since messages are generated for each action and for each user. Many of these activities - in fact, all of the system activities - can be stored in Kafka and handled as needed.

Producers of data only need to send their data to a single place while a host of backend services can consume the data as required. Major analytics, search and storage systems have integrations with Kafka.
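
A minimal sketch of the producing side of such a pipeline, assuming a broker at localhost:9092 and the `kafka-python` client; the `page_views` topic and the event fields are made up for the example.

```python
import json
from kafka import KafkaProducer

# Producer for user-activity events (assumes a broker at localhost:9092).
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    key_serializer=lambda k: k.encode("utf-8"),
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# Keying by user_id keeps each user's events in one partition,
# so any downstream consumer sees them in order and can replay them later.
event = {"user_id": "u-123", "action": "page_view", "path": "/pricing"}
producer.send("page_views", key=event["user_id"], value=event)
producer.flush()
```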

REAL-TIME PROCESSING

Kafka acts as a high-throughput distributed system: source services push streams of data into Kafka, and target services pull them in real time.

Kafka can be used in systems with many real-time producers and a small number of consumers, e.g. financial IT systems monitoring stock data.
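
A rough sketch of the consuming side under similar assumptions (a broker at localhost:9092, the `kafka-python` client, and a hypothetical `stock_ticks` topic):

```python
import json
from kafka import KafkaConsumer

# One consumer group ("risk-monitor") can scale out to many instances;
# Kafka assigns each instance a subset of the topic's partitions.
consumer = KafkaConsumer(
    "stock_ticks",
    bootstrap_servers="localhost:9092",
    group_id="risk-monitor",
    auto_offset_reset="earliest",  # replay the stream history on first start
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

for message in consumer:
    tick = message.value
    # ... real-time processing, e.g. alert if the price moves too fast ...
    print(tick["symbol"], tick["price"])
```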

Streaming services like Spotify publish information in real time over Kafka. The ability to handle high throughput in real time makes these applications more powerful than ever before.


Hope you liked this series on RabbitMQ vs Kafka.

Tirthankar Kundu