RabbitMQ Summary_Still on

rabbitmq, cluster, messagequeue, queue · 25 January 2024

Goal

Learn how to install RabbitMQ as a cluster and explore its features.

About RabbitMQ

It’s an open source Message Queue system.

Features

A message in a queue is delivered to only one consumer: once it has been consumed and acknowledged, it is removed from the queue.

Terms

Produce : Sending a message.
Producer : Anyone who produces.
Consume : Receiving a message.
Consumer : Anyone who consumes.
Queue : Where messages are stored. It is bounded only by the host’s memory and disk limits. Many producers can publish to a single queue, and many consumers can consume from a single queue.
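
To see these terms in action, here is a minimal sketch using rabbitmqadmin, the CLI that ships with the management plugin. It assumes the management plugin is enabled and the classic (Python-based) rabbitmqadmin is on the PATH. The function name produce_and_consume, the RABBITMQADMIN override and the queue name are hypothetical, added only for illustration.

```shell
# Override with RABBITMQADMIN=echo to print the commands instead of running them.
RABBITMQADMIN="${RABBITMQADMIN:-rabbitmqadmin}"

produce_and_consume() {
  queue="$1"   # queue name, e.g. "demo" (hypothetical)
  # Declare a durable queue.
  $RABBITMQADMIN declare queue name="$queue" durable=true &&
  # Produce: publish one message through the default exchange.
  $RABBITMQADMIN publish exchange=amq.default routing_key="$queue" payload=hello &&
  # Consume: fetch the message and acknowledge it without requeueing.
  $RABBITMQADMIN get queue="$queue" ackmode=ack_requeue_false &&
  # A second fetch should find the queue empty, since the message was already consumed.
  $RABBITMQADMIN get queue="$queue" ackmode=ack_requeue_false
}
```

Calling produce_and_consume demo against a running broker should show the message on the first fetch only.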

Install

Where To Install

The installation method depends on the OS you are installing on.
I’m going to install on Rocky Linux 9.3, so I’ll follow the directions for RPM-based Linux.

Requirements

User Privilege : Installing the package requires sudo privileges. Otherwise, consider using the generic binary build.
Erlang : You can install a full Erlang distribution, or use the zero-dependency Erlang package built by RabbitMQ, a minimal package that provides only what is needed to run RabbitMQ.
1. CentOS-derivative repositories offer Erlang versions that are out of date. So use the zero-dependency Erlang from RabbitMQ, the up-to-date Erlang provided by Fedora, the version provided by openSUSE, or Erlang Solutions.
2. Zero-dependency Erlang can be downloaded from GitHub or installed from a Yum repository.

Dependencies

erlang : If you are going to follow the method described in “Use rpm”, you can skip this step.

# https://github.com/rabbitmq/erlang-rpm
sudo vi /etc/yum.repos.d/modern_erlang.repo # enter repo info specified at "Latest Erlang Version from a Cloudsmith Mirror"
sudo dnf update -y 
sudo dnf install -y erlang

socat
```
sudo dnf install -y socat
```
logrotate
```
sudo dnf install -y logrotate
```

Ways To Install

Use Cloudsmith Mirror Yum Repository

Install RabbitMQ and Cloudsmith Signing Keys
Add Yum Repositories for RabbitMQ and Modern Erlang
Install Packages with dnf (yum)
Install Packages with Zypper
(Optional but Recommended) Package Version Locking on RPM-based Distributions

Use rpm

# import the signing key for repositories
## primary RabbitMQ signing key
rpm --import 'https://github.com/rabbitmq/signing-keys/releases/download/3.0/rabbitmq-release-signing-key.asc'
## modern Erlang repository
rpm --import 'https://github.com/rabbitmq/signing-keys/releases/download/3.0/cloudsmith.rabbitmq-erlang.E495BB49CC4BBE5B.key'
## RabbitMQ server repository
rpm --import 'https://github.com/rabbitmq/signing-keys/releases/download/3.0/cloudsmith.rabbitmq-server.9F4587F226208342.key'

# Add Yum Repositories for RabbitMQ and Modern Erlang
## Follow "Red Hat 9, CentOS Stream 9, Rocky Linux 9, Alma Linux 9, Modern Fedora Releases" from https://www.rabbitmq.com/install-rpm.html
sudo vi /etc/yum.repos.d/rabbitmq.repo
sudo dnf update -y
sudo dnf install -y socat logrotate
sudo dnf install -y erlang rabbitmq-server

Direct Downloads

The packages can also be downloaded directly from GitHub. (2024.1.25) I’m going to skip this for now.

Run RabbitMQ

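To run the server as a daemon, enable the service so that it starts at boot, then start it:

sudo systemctl enable rabbitmq-server
sudo systemctl start rabbitmq-server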

Configuring RabbitMQ

You can check most of the configuration with `rabbitmq-diagnostics status`.

By default, no config file is created, so check the official configuration guide before adding one.
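
A minimal sketch of creating the config file, assuming the default location /etc/rabbitmq/rabbitmq.conf used by the RPM package. The staging path ./rabbitmq.conf and the CONF_FILE variable are hypothetical conveniences, and the single setting shown only restates the default AMQP port.

```shell
# Write the config to a staging file first; copy it into place once reviewed.
conf_file="${CONF_FILE:-./rabbitmq.conf}"
cat > "$conf_file" <<'EOF'
# Default AMQP listener port (5672 is also the built-in default)
listeners.tcp.default = 5672
EOF

# Then, on the server:
#   sudo cp "$conf_file" /etc/rabbitmq/rabbitmq.conf
#   sudo systemctl restart rabbitmq-server
#   sudo rabbitmq-diagnostics status   # the "Config files" section should list the file
```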

Managing the Service

Log Files and Management

Setting A Cluster Up

There are many ways to form a cluster, such as config file, DNS-based discovery, AWS and even a manual way using rabbitmqctl.
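
As an illustration of the manual way, here is a rough sketch built on rabbitmqctl. It assumes all nodes share the same Erlang cookie and can resolve each other's hostnames. The function name join_cluster, the RABBITMQCTL override and the seed node name are hypothetical.

```shell
# Override with RABBITMQCTL=echo to print the subcommands instead of running them.
RABBITMQCTL="${RABBITMQCTL:-sudo rabbitmqctl}"

# Join the local node to the cluster that seed_node belongs to.
join_cluster() {
  seed_node="$1"   # e.g. rabbit@node1 (hypothetical)
  # Stop the RabbitMQ application but keep the Erlang runtime running.
  $RABBITMQCTL stop_app &&
  # Wipe local state so the node joins as a blank node (this deletes its data).
  $RABBITMQCTL reset &&
  $RABBITMQCTL join_cluster "$seed_node" &&
  $RABBITMQCTL start_app &&
  # Confirm that the node now appears in the member list.
  $RABBITMQCTL cluster_status
}
```

Run join_cluster rabbit@node1 on every node except the seed node itself.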

How RabbitMQ Nodes Are Identified

Each node is identified by its node name, a combination of a prefix (usually rabbit) and a hostname joined with @, which must be unique within a cluster. Hostnames can be resolved using any of the standard OS-provided methods, such as DNS records or the local hosts file (/etc/hosts).
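
The naming rule can be sketched in a few lines of shell. This assumes the default behaviour of using short hostnames (the part before the first dot); the helper node_name_for is hypothetical and is not part of RabbitMQ.

```shell
# Build a node name from a prefix and a hostname, keeping only the short hostname.
node_name_for() {
  prefix="$1"
  host="$2"
  printf '%s@%s\n' "$prefix" "${host%%.*}"
}

# Name the local node would get with the default prefix.
node_name_for rabbit "$(uname -n)"
```

Every name produced this way has to resolve from every other node, for example through matching /etc/hosts entries.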

Requirements For Clustering

Ports Access

These ports can be changed through configuration.

4369: epmd, a helper discovery daemon used by RabbitMQ nodes and CLI tools
6000 through 6500: used by RabbitMQ Stream replication
25672: used for inter-node and CLI tools communication (Erlang distribution server port) and is allocated from a dynamic range (limited to a single port by default, computed as AMQP port + 20000). Unless external connections on these ports are really necessary (e.g. the cluster uses federation or CLI tools are used on machines outside the subnet), these ports should not be publicly exposed. See networking guide for details.
35672-35682: used by CLI tools (Erlang distribution client ports) for communication with nodes and is allocated from a dynamic range (computed as server distribution port + 10000 through server distribution port + 10010).
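
The port arithmetic above can be checked with a short script. It only prints firewalld commands rather than running them; using firewalld is an assumption (it is the default on Rocky Linux), and the stream replication range is left out. Since these ports should not be publicly exposed, restrict the rules to the cluster subnet where possible.

```shell
amqp_port=5672                        # default AMQP port
dist_port=$((amqp_port + 20000))      # inter-node and CLI tools distribution port
cli_port_min=$((dist_port + 10000))   # start of the CLI tools port range
cli_port_max=$((dist_port + 10010))   # end of the CLI tools port range

echo "distribution port: $dist_port"
echo "CLI tool ports: $cli_port_min-$cli_port_max"

# Print firewalld commands for the clustering ports; review them, then run as root.
for port in 4369 "$dist_port" "$cli_port_min-$cli_port_max"; do
  echo "firewall-cmd --permanent --add-port=${port}/tcp"
done
echo "firewall-cmd --reload"
```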

Data Replicated Between Cluster Nodes

What Cluster Means For Clients

Peer Discovery

To form a cluster, each node needs to be able to discover the others. There are two options for this: listing every node ahead of time (using the config file) or dynamic discovery (nodes can come and go).

Use cluster_formation.peer_discovery_backend in the config file to choose the peer discovery mechanism. You can also specify other clustering settings there, such as discovery service hostnames, credentials, and so on.
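
As a concrete example of this setting, the sketch below stages a config snippet for the classic config file backend. The node names rabbit@node1 and rabbit@node2, the staging path and the CONF_FILE variable are hypothetical.

```shell
# Stage the peer discovery settings in a local file.
conf_file="${CONF_FILE:-./rabbitmq-cluster.conf}"
cat > "$conf_file" <<'EOF'
# List every cluster member ahead of time
cluster_formation.peer_discovery_backend = classic_config
cluster_formation.classic_config.nodes.1 = rabbit@node1
cluster_formation.classic_config.nodes.2 = rabbit@node2
EOF

# After review, append the contents to /etc/rabbitmq/rabbitmq.conf on every node.
```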

How It Works

Node starts
If peer discovery is configured, detect whether there is a previously initialized database.
Perform the discovery (this may involve contacting external services, such as a Consul node or an AWS API endpoint)
Attempt to contact others in order
Attempt to join the cluster of the first reachable peer

You should initialize the cluster from only one node; otherwise you will end up with multiple different clusters.
- If all nodes start in parallel, they race to form the cluster. To prevent this, peer discovery tries to acquire a lock when either forming a cluster or joining one. The Consul backend uses a lock in Consul and the etcd backend uses a lock in etcd. The classic config file, Kubernetes and AWS backends use a built-in locking library provided by the runtime.
If the node is registered with the backend, it unregisters upon stop.
If the node was already a member of a cluster, it retries contacting that cluster for a period of time, and no peer discovery is performed. By default, it retries 10 times, with 30 seconds per attempt.
If the node fails to connect to its previous cluster, it behaves like a blank node, but the cluster members still think the node is in the cluster, so joining eventually fails. You need to remove such nodes with rabbitmqctl forget_cluster_node on the existing members.
Once a node is explicitly removed from the cluster and reset, it will join the cluster as if it were a new member.
If you change the node name or hostname and the data directory path changes as a result, the node is also treated as a new member.
“Backend” here means the peer discovery mechanism.
Nodes accept client connections before all cluster members have joined, which means clients consider the cluster fully available. In this state, certain features, for instance quorum queues and features behind feature flags, may not be usable until the cluster reaches the required number of nodes.
When handling the cluster, keep the following scenario in mind, in which you should consider resetting A or B.
1. A cluster of 3 nodes, A, B and C is formed.
2. A is shut down.
3. B is reset.
4. A is started.
5. A tries to rejoin B but B’s cluster identity has changed.
6. B doesn’t recognize A.
7. A is rejected with the following message.
  
  Node 'rabbit@nodeA.local' thinks it's clustered with node 'rabbit@nodeB.local', but 'rabbit@nodeB.local' disagrees
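
The two recovery actions mentioned above can be sketched as follows. The function names forget_node and reset_node and the RABBITMQCTL override are hypothetical; the rabbitmqctl subcommands are the ones named in the notes.

```shell
# Override with RABBITMQCTL=echo to print the subcommands instead of running them.
RABBITMQCTL="${RABBITMQCTL:-sudo rabbitmqctl}"

# Run on an existing member to drop a node the cluster still lists as a member.
forget_node() {
  $RABBITMQCTL forget_cluster_node "$1"
}

# Run on the rejected node to wipe its cluster state so it can rejoin as a new member.
# This deletes the node's data.
reset_node() {
  $RABBITMQCTL stop_app &&
  $RABBITMQCTL reset &&
  $RABBITMQCTL start_app
}
```

In the scenario above, running reset_node on A lets it join B as a fresh node. When a blank node is still listed by the cluster, run forget_node with that node's name on an existing member.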

Config File

(2024.1.25) I’ll skip this for now.

Pre-configured DNS A/AAAA records

This mechanism uses a pre-configured hostname (a.k.a. the seed hostname) with DNS A (or AAAA) records. You can specify it in the config file, like cluster_formation.dns.hostname = discovery.eng.example.local
1. Query the DNS A records of the seed hostname : let’s say it’s discovery.eng.example.local and the query returns 192.168.100.1 and 192.168.100.2
2. For each returned IP address, perform a reverse DNS lookup : this returns the corresponding hostnames.
3. Prepend the current node’s prefix to each hostname and return the result : if the node’s name is not set, the default name is rabbit@(hostname), so this will discover two nodes, rabbit@(node1’s hostname) and rabbit@(node2’s hostname)
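
The steps above can be sketched in shell. The first part stages the two config settings for the DNS backend; the second roughly mimics the lookup with dig (from the bind-utils package) and is not RabbitMQ's actual implementation. The staging path, the CONF_FILE variable and the function discover_peers are hypothetical.

```shell
# Stage the DNS peer discovery settings in a local file.
conf_file="${CONF_FILE:-./rabbitmq-dns-discovery.conf}"
cat > "$conf_file" <<'EOF'
cluster_formation.peer_discovery_backend = dns
cluster_formation.dns.hostname = discovery.eng.example.local
EOF

# Roughly mimic the three discovery steps for a seed hostname.
discover_peers() {
  seed="$1"
  prefix="${2:-rabbit}"
  # Step 1: query the A records of the seed hostname.
  for ip in $(dig +short "$seed" A); do
    # Step 2: reverse lookup; keep the first name returned.
    host="$(dig +short -x "$ip" | head -n 1)"
    # Step 3: prepend the node name prefix, stripping the trailing dot of the DNS name.
    echo "${prefix}@${host%.}"
  done
}
```

Running discover_peers discovery.eng.example.local on a host that can resolve the seed hostname prints one candidate node name per A record.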

Via Plugins

AWS
Kubernetes
Consul
etcd

These plugins ship with RabbitMQ, so you don’t need to install them; you only need to enable them with the following command.

rabbitmq-plugins --offline enable <plugin name>

rabbitmq-plugins --offline enable rabbitmq_peer_discovery_k8s

(2024.1.25) But I’ll skip this for now.

Network Partition

I need to check this information. It seems quite complicated.