How to Install Apache Kafka on Debian 9

In this guide, we will show you how to install Apache Kafka on a Debian 9 VPS.

Apache Kafka is a free and open-source distributed streaming platform that lets you publish and subscribe to streams of records and store them in a fault-tolerant and durable manner. It is written in Scala and Java. Used by thousands of companies across the world, Apache Kafka lets you build streaming and stream-processing applications that read and store data in real time. Its use cases range from logging and messaging to processing almost any sort of data stream you could imagine. Let’s get started with the installation.

In order to run Apache Kafka on your VPS, the following requirements have to be met:

  • Java 8 or higher needs to be installed
  • ZooKeeper installed and running on the server
  • A VPS with at least 4GB of RAM

If you don’t have Java or ZooKeeper, don’t worry, we’ll be installing them in this tutorial as well.

Step 1 – Update OS Packages

Before we can start with the Apache Kafka installation, we have to make sure that all Debian OS packages that are installed on the server are up to date. We can do this by executing the following commands:

sudo apt-get update
sudo apt-get upgrade

Step 2 – Install Java

In order to run Apache Kafka on our server, we’ll need to have Java installed. We can check if Java is already installed using this command:

which java

If there is no output, that means that Java is not installed on the server yet. We can install it using the following command:

sudo apt-get install default-jdk

In order to check the Java version, run the following command on your server:

java -version

We should receive output similar to the following (the exact version numbers may differ):

openjdk version "1.8.0_181"
OpenJDK Runtime Environment (build 1.8.0_181-8u181-b13-2~deb9u1-b13)
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)
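
If you would rather check the Java requirement from a script than read the output by eye, the following is a minimal sketch. The function names java_major and check_java are our own (hypothetical), and the parsing assumes the version appears in double quotes on the first line of the output, as in the sample above:

```shell
# Extract the major Java version from a version string such as
# "1.8.0_181" (Java 8 and older) or "11.0.2" (Java 9 and newer).
java_major() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;                # legacy scheme: 1.8.0_181 -> 8
        *)   echo "$1" | cut -d. -f1 | cut -d- -f1 ;;  # new scheme: 11.0.2 -> 11, 9-ea -> 9
    esac
}

# Read the quoted version from the first line of `java -version`
# (which prints to stderr) and compare it against Kafka's minimum of 8.
check_java() {
    version=$(java -version 2>&1 | awk -F'"' 'NR==1 {print $2}')
    [ -n "$version" ] && [ "$(java_major "$version")" -ge 8 ]
}
```

After defining both functions, running check_java && echo "Java is recent enough" prints the message only when Java 8 or higher is found.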

Step 3 – Install ZooKeeper

Kafka uses ZooKeeper to store persistent cluster metadata, so we need to install ZooKeeper. The ZooKeeper service is responsible for configuration management, leader election, and synchronization. ZooKeeper is available in the official Debian package repository, so we can install it using the following command:

sudo apt-get install zookeeperd

ZooKeeper listens on port 2181 by default and doesn’t require much maintenance.

Step 4 – Install Apache Kafka

Create a new user dedicated to the Kafka service using the following command (we’re using kafka as the username, but you can use any name you like):

sudo useradd kafka -m

Set a password for the newly created user:

sudo passwd kafka

Use a strong password and enter it twice. Next, add the user to the sudo group with:

sudo adduser kafka sudo

Stop the ZooKeeper service:

sudo systemctl stop zookeeper.service

Log in as the newly created admin user with:

su kafka

Download Apache Kafka from https://kafka.apache.org/downloads and extract it into the kafka user's home directory. This guide uses version 2.1.0 built for Scala 2.12; adjust the URL and file names below if you pick a different release:

cd ~
wget -O kafka.tgz http://apache.osuosl.org/kafka/2.1.0/kafka_2.12-2.1.0.tgz
tar -xvzf kafka.tgz
mv kafka_2.12-2.1.0/* .
rmdir /home/kafka/kafka_2.12-2.1.0
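
Before extracting the archive, it is a good idea to compare its checksum with the one published on the downloads page. The following is a minimal sketch: verify_sha512 is a hypothetical helper name, and it assumes you paste in only the hexadecimal digest (spaces, line breaks, and uppercase letters are tolerated):

```shell
# Compare a file's SHA-512 digest with an expected value.
# $1: path to the downloaded file
# $2: expected digest (hex digits; whitespace and uppercase are normalized)
verify_sha512() {
    file=$1
    expected=$(printf '%s' "$2" | tr -d ' \n\t' | tr 'A-F' 'a-f')
    actual=$(sha512sum "$file" | cut -d' ' -f1)
    [ -n "$expected" ] && [ "$actual" = "$expected" ]
}
```

For example, verify_sha512 kafka.tgz "DIGEST" && echo "checksum OK", where DIGEST stands for the value copied from the downloads page.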

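Edit the ZooKeeper systemd unit file:

sudo vi /lib/systemd/system/zookeeper.service

Replace its contents with the following lines, so that the service runs the ZooKeeper bundled with Kafka as the kafka user:

[Unit]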
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
User=kafka
ExecStart=/home/kafka/bin/zookeeper-server-start.sh /home/kafka/config/zookeeper.properties
ExecStop=/home/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target

Create a systemd unit file for Apache Kafka, so that you can run Kafka as a service on your server:

sudo vi /etc/systemd/system/kafka.service

Add the following lines:

[Unit]
Requires=network.target remote-fs.target zookeeper.service
After=network.target remote-fs.target zookeeper.service

[Service]
Type=simple
User=kafka
ExecStart=/home/kafka/bin/kafka-server-start.sh /home/kafka/config/server.properties
ExecStop=/home/kafka/bin/kafka-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target

Open the server.properties file:

vi /home/kafka/config/server.properties

Then add or modify the following properties:

listeners=PLAINTEXT://:9092
log.dirs=/var/log/kafka
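
If you prefer to make these changes without opening an editor, the following is a minimal sketch. The set_prop helper is our own (hypothetical) and assumes simple key=value lines; it removes any existing or commented-out definition of the key and appends the new value at the end of the file:

```shell
# Set KEY=VALUE in a Java-style properties file.
# $1: file, $2: key, $3: value
set_prop() {
    file=$1 key=$2 value=$3
    # Escape dots so that "log.dirs" only matches a literal "log.dirs".
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    tmp=$(mktemp)
    # Keep every line except active or commented-out definitions of the key.
    grep -v -E "^#?${pattern}=" "$file" > "$tmp" || true
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Write back through cat so the original file keeps its owner and mode.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

With the function defined, the two settings above can be applied with set_prop /home/kafka/config/server.properties listeners PLAINTEXT://:9092 and set_prop /home/kafka/config/server.properties log.dirs /var/log/kafka.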

After we make changes to a unit file, we should always run the systemctl daemon-reload command:

sudo systemctl daemon-reload

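Create the /var/log/kafka directory that log.dirs points to, and give the kafka user ownership of it:

sudo mkdir -p /var/log/kafka
sudo chown kafka:kafka -R /var/log/kafka

Despite its name, log.dirs is where Kafka stores topic data (its commit log segments), not application log files; the application logs that are useful for troubleshooting are written to the logs directory inside the Kafka installation. Then, start the ZooKeeper and Apache Kafka services:

sudo systemctl start zookeeper.service
sudo systemctl start kafka.service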

Enable the Apache Kafka service to automatically start on server boot:

sudo systemctl enable kafka.service

In order to check if ZooKeeper and Kafka services are up and running, run the following command on your VPS:

systemctl status zookeeper.service

We should then receive an output similar to this:

zookeeper.service
Loaded: loaded (/lib/systemd/system/zookeeper.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2018-12-19 06:23:33 EST; 25min ago
Main PID: 20157 (java)
Tasks: 21 (limit: 4915)
CGroup: /system.slice/zookeeper.service
└─20157 java -Xmx512M -Xms512M -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent -Djava.awt.headless=true -Xloggc:/home/kafka/bin/../l

Run this command next:

systemctl status kafka.service

The output of this command should be similar to this one:

kafka.service
Loaded: loaded (/etc/systemd/system/kafka.service; disabled; vendor preset: enabled)
Active: active (running) since Wed 2018-12-19 06:46:49 EST; 27s ago
Process: 22520 ExecStop=/home/kafka/bin/kafka-server-stop.sh (code=exited, status=0/SUCCESS)
Main PID: 22540 (java)
Tasks: 62 (limit: 4915)
CGroup: /system.slice/kafka.service
└─22540 java -Xmx1G -Xms1G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent -Djava.awt.headless=true -Xloggc:/home/kafka/bin/../logs/

We can also use the netstat command (part of the net-tools package) to check if the Kafka and ZooKeeper services are listening on ports 9092 and 2181, respectively:

netstat -tunlp | grep -e \:9092 -e \:2181

The output should look similar to this:

tcp6       0      0 :::9092                 :::*                    LISTEN      22540/java
tcp6       0      0 :::2181                 :::*                    LISTEN      20157/java
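
The same port check can be scripted. The sketch below is one possible approach: is_listening is a hypothetical helper that reads a socket listing on standard input and relies on the local address being the fourth column, which holds for both netstat -tln and ss -tln output:

```shell
# Succeeds if the socket listing on stdin contains a LISTEN entry whose
# local address (4th column) ends in :PORT.
# $1: port number
is_listening() {
    awk -v port="$1" '
        /LISTEN/ && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}
```

For example, ss -tln | is_listening 9092 && echo "Kafka is listening" works even on servers where netstat is not installed, since ss ships with the iproute2 package.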

If both services are running and both ports are open and listening, the installation is complete. We have successfully installed Apache Kafka.
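
As an optional end-to-end check, you can send a message through the broker using the command-line tools bundled with Kafka. This is only a sketch under a few assumptions: Kafka lives in /home/kafka as in this guide, the services use the default ports, and the function name kafka_smoke_test, the topic name smoke-test, and the message text are placeholders of our own choosing:

```shell
# Round-trip one message through a local broker using Kafka's bundled CLI tools.
# $1: Kafka install directory (assumed default: /home/kafka)
# $2: topic name (assumed default: smoke-test)
kafka_smoke_test() {
    kafka_home=${1:-/home/kafka}
    topic=${2:-smoke-test}
    message="hello-kafka"

    # Create a single-partition topic; in Kafka 2.1 this goes through ZooKeeper.
    "$kafka_home/bin/kafka-topics.sh" --create --if-not-exists \
        --zookeeper localhost:2181 \
        --replication-factor 1 --partitions 1 --topic "$topic" || return 1

    # Publish one message read from stdin.
    echo "$message" | "$kafka_home/bin/kafka-console-producer.sh" \
        --broker-list localhost:9092 --topic "$topic" > /dev/null || return 1

    # Read the topic from the beginning and stop after the first message.
    received=$("$kafka_home/bin/kafka-console-consumer.sh" \
        --bootstrap-server localhost:9092 --topic "$topic" \
        --from-beginning --max-messages 1 2>/dev/null)

    [ "$received" = "$message" ]
}
```

Running kafka_smoke_test && echo "Kafka works" as the kafka user should print the message if the broker accepts and returns data. Newer Kafka releases changed some of these options, so check each tool's --help output if you installed a different version.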


Of course, you don’t have to install and configure Apache Kafka on Debian 9 yourself if you use one of our Managed Debian Support solutions, in which case you can simply ask our expert Linux admins to set up and configure Apache Kafka on Debian 9 for you. They are available 24×7 and will take care of your request immediately.