In this guide, we will show you how to install Apache Kafka on a Debian 9 VPS.
Apache Kafka is a free and open-source distributed streaming platform that lets you publish and subscribe to streams of records and store them in a fault-tolerant, durable manner. It is written in Scala and Java. Used by thousands of companies across the world, Apache Kafka lets you build streaming and stream-processing applications that read and store data in real time. It has a variety of use cases – anything from logging, to messaging, to processing almost any sort of data stream you could imagine. Let’s get started with the installation.
In order to run Apache Kafka on your VPS, the following requirements have to be met:
- Java 8 or higher needs to be installed
- ZooKeeper installed and running on the server
- A VPS with at least 4GB of RAM
If you don’t have Java or ZooKeeper, don’t worry, we’ll be installing them in this tutorial as well.
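To check the memory requirement from the command line, a small helper along these lines could be used. This is only a sketch: the function name mem_total_mib is our own, and it assumes a Linux system that exposes /proc/meminfo.

```shell
# Print the total memory in whole MiB, read from a meminfo-format file.
# The file argument is optional and defaults to /proc/meminfo.
# (mem_total_mib is a hypothetical helper name, not a system command.)
mem_total_mib() {
    awk '/^MemTotal:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# Example usage on the VPS:
#   mem_total_mib
```

Keep in mind that a VPS sold with 4GB of RAM may report slightly less than 4096 MiB, since part of the memory can be reserved before the operating system sees it.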
Step 1 – Update OS Packages
Before we can start with the Apache Kafka installation, we have to make sure that all Debian OS packages that are installed on the server are up to date. We can do this by executing the following commands:
sudo apt-get update
sudo apt-get upgrade
Step 2 – Install Java
In order to run Apache Kafka on our server, we’ll need to have Java installed. We can check if Java is already installed using this command:
which java
If there is no output, that means that Java is not installed on the server yet. We can install it using the following command:
sudo apt-get install default-jdk
In order to check the Java version, run the following command on your server:
java -version
We should receive the following output:
openjdk version "1.8.0_181"
OpenJDK Runtime Environment (build 1.8.0_181-8u181-b13-2~deb9u1-b13)
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)
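Since Kafka needs Java 8 or higher, it can help to turn the version string into a single number that a script can compare. The helper below is a minimal sketch; java_major is a name we made up, and it assumes the version string looks like either "1.8.0_181" (the older scheme, where the major version is the second field) or "11.0.2".

```shell
# Print the major Java version for a given version string.
# (java_major is a hypothetical helper, not part of the JDK.)
java_major() {
    case $1 in
        1.*) echo "$1" | cut -d . -f 2 ;;                  # "1.8.0_181" -> 8
        *)   echo "$1" | cut -d . -f 1 | cut -d - -f 1 ;;  # "11.0.2"    -> 11
    esac
}

# Example usage (java -version writes to stderr, hence the 2>&1):
#   ver=$(java -version 2>&1 | awk -F '"' '/version/ { print $2; exit }')
#   [ "$(java_major "$ver")" -ge 8 ] && echo "Java is recent enough"
```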
Step 3 – Install ZooKeeper
Kafka uses ZooKeeper to store persistent cluster metadata, so we need to install ZooKeeper. The ZooKeeper service is responsible for configuration management, leader election, synchronization, etc. ZooKeeper is available in the official Debian package repository, so we can install it using the following command:
sudo apt-get install zookeeperd
ZooKeeper listens on port 2181 by default and doesn’t require much maintenance.
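To confirm that ZooKeeper is actually answering on that port, we can send it the "ruok" four-letter command, to which a healthy server replies "imok". The sketch below assumes that the nc (netcat) utility is installed and that four-letter commands are enabled on the server, which may not be the case on newer ZooKeeper releases. The function name zk_ruok is our own.

```shell
# Succeed if the ZooKeeper server at host:port answers "imok" to "ruok".
# Host and port are optional and default to localhost and 2181.
# (zk_ruok is a hypothetical helper; it relies on nc being installed.)
zk_ruok() {
    reply=$(printf 'ruok' | nc -w 2 "${1:-localhost}" "${2:-2181}")
    [ "$reply" = "imok" ]
}

# Example usage:
#   zk_ruok && echo "ZooKeeper is healthy"
```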
Step 4 – Install Apache Kafka
Create a new system user dedicated to the Kafka service using the following command (we’re using kafka as the username, but you can use any name you like):
useradd kafka -m
Set a password for the newly created user:
passwd kafka
Use a strong password and enter it twice. Next, add the user to the sudo group with:
adduser kafka sudo
Stop the ZooKeeper service:
systemctl stop zookeeper.service
Log in as the newly created admin user with:
su kafka
Download Apache Kafka from https://kafka.apache.org/downloads (this guide uses version 2.1.0) and extract it into the kafka user’s home directory:
cd ~
wget -O kafka.tgz http://apache.osuosl.org/kafka/2.1.0/kafka_2.12-2.1.0.tgz
tar -xvzf kafka.tgz
mv kafka_2.12-2.1.0/* .
rmdir /home/kafka/kafka_2.12-2.1.0
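Before extracting an archive fetched from a mirror, it is a good idea to compare its checksum with the SHA-512 value published for the release on the Apache Kafka downloads page. The helper below is a sketch of how that comparison could look; verify_sha512 is a name we chose, and the expected digest is something you have to copy from the official page yourself.

```shell
# Compare a file's SHA-512 digest with an expected hex string.
# Spaces, newlines and letter case in the expected value are ignored,
# so a digest pasted in grouped, upper-case form still works.
# (verify_sha512 is a hypothetical helper built on sha512sum.)
verify_sha512() {
    file=$1
    expected=$(printf '%s' "$2" | tr -d ' \n\t' | tr 'A-F' 'a-f')
    actual=$(sha512sum "$file" | cut -d ' ' -f 1)
    [ "$actual" = "$expected" ]
}

# Example usage (replace the placeholder with the published digest):
#   verify_sha512 kafka.tgz "<digest from the downloads page>" && echo "checksum OK"
```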
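Edit the ZooKeeper systemd unit file so that it contains the following lines (writing to this location requires root privileges, so we use sudo):
sudo vi /lib/systemd/system/zookeeper.service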
[Unit]
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
User=kafka
ExecStart=/home/kafka/bin/zookeeper-server-start.sh /home/kafka/config/zookeeper.properties
ExecStop=/home/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target
Create a systemd unit file for Apache Kafka, so that you can run Kafka as a service on your server:
sudo vi /etc/systemd/system/kafka.service
Add the following lines:
[Unit]
Requires=network.target remote-fs.target zookeeper.service
After=network.target remote-fs.target zookeeper.service

[Service]
Type=simple
User=kafka
ExecStart=/home/kafka/bin/kafka-server-start.sh /home/kafka/config/server.properties
ExecStop=/home/kafka/bin/kafka-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target
Edit the server.properties file and add/modify the following properties:
vi /home/kafka/config/server.properties
listeners=PLAINTEXT://:9092
log.dirs=/var/log/kafka
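If you would rather apply these two properties from a script than in an editor, something like the following could work. It is a rough sketch: set_prop is a helper name we invented, it relies on GNU sed (the default on Debian), and it assumes that values do not contain the | or & characters.

```shell
# Set key=value in a Java-style properties file.
# If an uncommented line for the key exists, it is replaced in place;
# otherwise the new line is appended to the end of the file.
# (set_prop is a hypothetical helper, not a Kafka tool.)
set_prop() {
    file=$1
    key=$2
    value=$3
    if grep -q "^${key}=" "$file"; then
        sed -i "s|^${key}=.*|${key}=${value}|" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Example usage:
#   set_prop /home/kafka/config/server.properties listeners PLAINTEXT://:9092
#   set_prop /home/kafka/config/server.properties log.dirs /var/log/kafka
```

Commented-out entries (lines starting with #) are left untouched, so a commented default stays in the file as documentation while the active value is added at the end.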
After we make changes to a unit file, we should always run the systemctl daemon-reload command:
sudo systemctl daemon-reload
Create a new directory called kafka in the /var/log/ directory on your server:
sudo mkdir -p /var/log/kafka
sudo chown -R kafka:kafka /var/log/kafka
Despite its name, the log.dirs directory is where Kafka stores its topic data (the commit log), not its application log messages. Then, start the ZooKeeper and Apache Kafka services:
sudo systemctl start zookeeper.service
sudo systemctl start kafka.service
Enable the Apache Kafka service to automatically start on server boot:
sudo systemctl enable kafka.service
In order to check if ZooKeeper and Kafka services are up and running, run the following command on your VPS:
systemctl status zookeeper.service
We should then receive an output similar to this:
zookeeper.service
   Loaded: loaded (/lib/systemd/system/zookeeper.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2018-12-19 06:23:33 EST; 25min ago
 Main PID: 20157 (java)
    Tasks: 21 (limit: 4915)
   CGroup: /system.slice/zookeeper.service
           └─20157 java -Xmx512M -Xms512M -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent -Djava.awt.headless=true -Xloggc:/home/kafka/bin/../l
Run this command next:
systemctl status kafka.service
The output of this command should be similar to this one:
kafka.service
   Loaded: loaded (/etc/systemd/system/kafka.service; disabled; vendor preset: enabled)
   Active: active (running) since Wed 2018-12-19 06:46:49 EST; 27s ago
  Process: 22520 ExecStop=/home/kafka/bin/kafka-server-stop.sh (code=exited, status=0/SUCCESS)
 Main PID: 22540 (java)
    Tasks: 62 (limit: 4915)
   CGroup: /system.slice/kafka.service
           └─22540 java -Xmx1G -Xms1G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent -Djava.awt.headless=true -Xloggc:/home/kafka/bin/../logs/
We can also use the netstat command to check if the Kafka and ZooKeeper services are listening on ports 9092 and 2181, respectively:
netstat -tunlp | grep -e \:9092 -e \:2181
tcp6       0      0 :::9092       :::*       LISTEN      22540/java
tcp6       0      0 :::2181       :::*       LISTEN      20157/java
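The same check can be turned into something a script can act on. The sketch below reads netstat-style output on standard input and succeeds only if a socket in the LISTEN state is bound to the given port. The function name is_listening is ours, and it assumes the column layout shown in the output above.

```shell
# Read `netstat -tunlp`-style lines on stdin and succeed if any of them
# is in the LISTEN state with a local address ending in ":<port>".
# (is_listening is a hypothetical helper, not a system command.)
is_listening() {
    awk -v port="$1" '
        $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# Example usage:
#   netstat -tunlp 2>/dev/null | is_listening 9092 && echo "Kafka is listening"
#   netstat -tunlp 2>/dev/null | is_listening 2181 && echo "ZooKeeper is listening"
```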
If both services are running and both ports are open and listening, the installation is complete. We have successfully installed Apache Kafka.
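As an optional final check, we can send one message through the broker and read it back using the command-line tools that ship with Kafka. The function below is a sketch under a few assumptions: Kafka lives in /home/kafka as in this guide, the flags are the ones used by the Kafka 2.1 tools (newer releases have changed some of them), and the names kafka_smoke_test and smoke-test are our own.

```shell
# Round-trip a single message through a local Kafka broker.
# KAFKA_HOME defaults to /home/kafka, where this guide unpacked Kafka.
# The optional first argument is the topic name (hypothetical default).
kafka_smoke_test() {
    home=${KAFKA_HOME:-/home/kafka}
    topic=${1:-smoke-test}

    # With the Kafka 2.1 tools, topics are created through ZooKeeper.
    "$home/bin/kafka-topics.sh" --create --zookeeper localhost:2181 \
        --replication-factor 1 --partitions 1 --topic "$topic" || return 1

    # Publish one message to the topic.
    echo "hello kafka" | "$home/bin/kafka-console-producer.sh" \
        --broker-list localhost:9092 --topic "$topic" || return 1

    # Read it back, stopping after one record or ten seconds.
    "$home/bin/kafka-console-consumer.sh" --bootstrap-server localhost:9092 \
        --topic "$topic" --from-beginning --max-messages 1 --timeout-ms 10000
}

# Example usage (run as the kafka user):
#   kafka_smoke_test
```

If everything is working, the consumer should print the message that the producer sent.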