Step by Step: Installing Kafka on macOS Sierra
In this walkthrough we are going to use the Homebrew package manager throughout the installation process. Homebrew makes it easy to install the latest version of a piece of software, manage it, and keep it up to date.
Step 1 : Install Homebrew (as an administrator)
$ /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
The above command will install the following:
- Homebrew
- Command Line Tools for Xcode 8.2
- other supporting libraries
The following is sample output you might see during the Homebrew installation:
Sudhirs-MacBook-Pro:~ sudhir$ /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
==> This script will install:
/usr/local/bin/brew
/usr/local/share/doc/homebrew
/usr/local/share/man/man1/brew.1
/usr/local/share/zsh/site-functions/_brew
/usr/local/etc/bash_completion.d/brew
/usr/local/Homebrew
==> The following new directories will be created:
/usr/local/Cellar
/usr/local/Homebrew
/usr/local/Frameworks
/usr/local/bin
/usr/local/etc
/usr/local/include
/usr/local/lib
/usr/local/opt
/usr/local/sbin
/usr/local/share
/usr/local/share/zsh
/usr/local/share/zsh/site-functions
/usr/local/var
==> Cleaning up /Library/Caches/Homebrew...
==> Migrating /Library/Caches/Homebrew to /Users/sudhir/Library/Caches/Homebrew...
==> Deleting /Library/Caches/Homebrew...
Already up-to-date.
==> Installation successful!
==> Homebrew has enabled anonymous aggregate user behaviour analytics.
Read the analytics documentation (and how to opt-out) here:
https://git.io/brew-analytics
==> Next steps:
- Run `brew help` to get started
- Further documentation:
https://git.io/brew-docs
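If you are not sure whether Homebrew is already present on the machine, a quick check before (or after) running the installer can save some time. The snippet below is only a minimal sketch: the helper names have_cmd and check_brew are made up for illustration and are not part of Homebrew.

```shell
# Succeed if the named command can be found on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Report whether Homebrew is installed, without installing anything.
check_brew() {
    if have_cmd brew; then
        echo "Homebrew found: $(brew --version | head -n 1)"
    else
        echo "Homebrew not found; run the installer from Step 1"
    fi
}
```

Paste both functions into a terminal and run check_brew to see which branch applies.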
Step 2 : Homebrew is now ready to use. Install the wget tool
$ brew install wget
The above command will install the wget package along with the OpenSSL library.
Sample log for the above command:
Sudhirs-MacBook-Pro:~ sudhir$ brew install wget
==> Installing dependencies for wget: openssl
==> Installing wget dependency: openssl
==> Downloading https://homebrew.bintray.com/bottles/openssl-1.0.2j.sierra.bottle.tar.gz
######################################################################## 100.0%
==> Pouring openssl-1.0.2j.sierra.bottle.tar.gz
==> Using the sandbox
==> Caveats
A CA file has been bootstrapped using certificates from the SystemRoots
keychain. To add additional certificates (e.g. the certificates added in
the System keychain), place .pem files in
/usr/local/etc/openssl/certs
and run
/usr/local/opt/openssl/bin/c_rehash
This formula is keg-only, which means it was not symlinked into /usr/local.
Apple has deprecated use of OpenSSL in favor of its own TLS and crypto libraries
Generally there are no consequences of this for you. If you build your
own software and it requires this formula, you will need to add to your
build variables:
LDFLAGS: -L/usr/local/opt/openssl/lib
CPPFLAGS: -I/usr/local/opt/openssl/include
==> Summary
🍺 /usr/local/Cellar/openssl/1.0.2j: 1,695 files, 12M
==> Installing wget
==> Downloading https://homebrew.bintray.com/bottles/wget-1.18.sierra.bottle.tar.gz
######################################################################## 100.0%
==> Pouring wget-1.18.sierra.bottle.tar.gz
🍺 /usr/local/Cellar/wget/1.18: 9 files, 1.6M
Step 3 : Now it's time to install Kafka
- You can search available Kafka builds by executing the following command
$ brew search kafka
The above command will give you output something like this:
Sudhirs-MacBook-Pro:~ sudhir$ brew search kafka
kafka kafkacat librdkafka
homebrew/php/php53-kafka homebrew/php/php54-rdkafka homebrew/php/php56-kafka homebrew/php/php70-rdkafka
homebrew/php/php53-rdkafka homebrew/php/php55-kafka homebrew/php/php56-rdkafka homebrew/php/php71-rdkafka
homebrew/php/php54-kafka homebrew/php/php55-rdkafka homebrew/php/php70-kafka Caskroom/cask/kafka-tool
- Now execute the following command to install Kafka
$ brew install kafka
The above command will give you output something like this:
Sudhirs-MacBook-Pro:kafka-0.8.2 sudhir$ brew install kafka
==> Installing dependencies for kafka: zookeeper
==> Installing kafka dependency: zookeeper
==> Downloading https://homebrew.bintray.com/bottles/zookeeper-3.4.9.sierra.bottle.tar.gz
######################################################################## 100.0%
==> Pouring zookeeper-3.4.9.sierra.bottle.tar.gz
==> Caveats
To have launchd start zookeeper now and restart at login:
brew services start zookeeper
Or, if you do not want/need a background service you can just run:
zkServer start
==> Summary
🍺 /usr/local/Cellar/zookeeper/3.4.9: 238 files, 18.2M
==> Installing kafka
==> Downloading https://homebrew.bintray.com/bottles/kafka.sierra.bottle.tar.gz
######################################################################## 100.0%
==> Pouring kafka.sierra.bottle.tar.gz
==> Caveats
To have launchd start kafka now and restart at login:
brew services start kafka
Or, if you do not want/need a background service you can just run:
zookeeper-server-start /usr/local/etc/kafka/zookeeper.properties & kafka-server-start /usr/local/etc/kafka/server.properties
==> Summary
🍺 /usr/local/Cellar/kafka/ : 128 files, 35.2M
The Kafka installation above also installs all dependencies, such as ZooKeeper, which is required to run the Kafka server.
You can see that the configuration files for Kafka and ZooKeeper have been installed as follows:
Sudhirs-MacBook-Pro:~ sudhir$ ls -ltr /usr/local/etc
total 16
drwxr-xr-x 3 sudhir admin 102 Jan 23 21:42 bash_completion.d
drwxr-xr-x 7 sudhir admin 238 Jan 23 21:42 openssl
-rw-r--r-- 1 sudhir admin 4945 Jan 23 21:42 wgetrc
drwxr-xr-x 6 sudhir admin 204 Jan 23 21:47 zookeeper
drwxr-xr-x 15 sudhir admin 510 Jan 23 21:47 kafka
Sudhirs-MacBook-Pro:~ sudhir$ ls -ltr /usr/local/etc/zookeeper/
total 32
-rw-r--r-- 1 sudhir admin 941 Jan 23 21:47 zoo_sample.cfg
-rw-r--r-- 1 sudhir admin 941 Jan 23 21:47 zoo.cfg
-rw-r--r-- 1 sudhir admin 339 Jan 23 21:47 log4j.properties
-rw-r--r-- 1 sudhir admin 44 Jan 23 21:47 defaults
Sudhirs-MacBook-Pro:~ sudhir$ view /usr/local/etc/zookeeper/zoo.cfg
Sudhirs-MacBook-Pro:~ sudhir$ ls -ltr /usr/local/etc/kafka/
total 120
-rw-r--r-- 1 sudhir admin 1037 Jan 23 21:47 zookeeper.properties
-rw-r--r-- 1 sudhir admin 1032 Jan 23 21:47 tools-log4j.properties
-rw-r--r-- 1 sudhir admin 5350 Jan 23 21:47 server.properties
-rw-r--r-- 1 sudhir admin 1900 Jan 23 21:47 producer.properties
-rw-r--r-- 1 sudhir admin 4369 Jan 23 21:47 log4j.properties
-rw-r--r-- 1 sudhir admin 1199 Jan 23 21:47 consumer.properties
-rw-r--r-- 1 sudhir admin 2061 Jan 23 21:47 connect-standalone.properties
-rw-r--r-- 1 sudhir admin 1074 Jan 23 21:47 connect-log4j.properties
-rw-r--r-- 1 sudhir admin 881 Jan 23 21:47 connect-file-source.properties
-rw-r--r-- 1 sudhir admin 883 Jan 23 21:47 connect-file-sink.properties
-rw-r--r-- 1 sudhir admin 2760 Jan 23 21:47 connect-distributed.properties
-rw-r--r-- 1 sudhir admin 909 Jan 23 21:47 connect-console-source.properties
-rw-r--r-- 1 sudhir admin 906 Jan 23 21:47 connect-console-sink.properties
Step 3 : Before starting the Kafka server, first start ZooKeeper, which Kafka relies on for coordination and leader election.
- Check that ZooKeeper is really installed
Sudhirs-MacBook-Pro:~ sudhir$ which zkserver
/usr/local/bin/zkserver
Sudhirs-MacBook-Pro:~ sudhir$ zkserver
ZooKeeper JMX enabled by default
Using config: /usr/local/etc/zookeeper/zoo.cfg
Usage: ./zkServer.sh {start|start-foreground|stop|restart|status|upgrade|print-cmd}
- Now start the zookeeper server
Sudhirs-MacBook-Pro:~ sudhir$ zkserver start
ZooKeeper JMX enabled by default
Using config: /usr/local/etc/zookeeper/zoo.cfg
Starting zookeeper ... STARTED
- Test that the ZooKeeper server has really started
Sudhirs-MacBook-Pro:~ sudhir$ telnet localhost 2181
Trying ::1...
Connected to localhost.
Escape character is '^]'.
^CConnection closed by foreign host.
The above log shows that we can telnet to the ZooKeeper port, so the ZooKeeper server is up and running.
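telnet does the job, but it needs a manual Ctrl-C and is awkward to script. As an alternative, the sketch below probes the port with nc (netcat) and then sends ZooKeeper's four-letter command ruok, to which a healthy server replies imok. The function names port_open and zk_ok are illustrative, the host and port are assumed to be the defaults shown above, and newer ZooKeeper releases may require ruok to be whitelisted in the configuration before it is answered.

```shell
# Succeed if something accepts TCP connections on HOST PORT.
# -z probes without sending data; -w 2 gives up after 2 seconds.
port_open() {
    nc -z -w 2 "$1" "$2" >/dev/null 2>&1
}

# Succeed if ZooKeeper on HOST PORT answers "imok" to the "ruok" command.
zk_ok() {
    [ "$(echo ruok | nc -w 2 "$1" "$2" 2>/dev/null)" = "imok" ]
}

# Example (assumes ZooKeeper is running on its default port):
#   port_open localhost 2181 && zk_ok localhost 2181 && echo "ZooKeeper is up"
```

Both functions only return an exit status, so they can be chained with && in a start-up script.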
Step 4 : Let's create a symlink to the configuration directory created by the Kafka installation. (This is not mandatory, but it makes the files easier to reach without touching the original directory.)
Sudhirs-MacBook-Pro:~ sudhir$ ln -nsf /usr/local/etc/ ./symln.etc/
Sudhirs-MacBook-Pro:~ sudhir$ ls -ltr ~/symln.etc/
total 24
drwxr-xr-x 3 sudhir admin 102 Jan 23 21:42 bash_completion.d
drwxr-xr-x 7 sudhir admin 238 Jan 23 21:42 openssl
-rw-r--r-- 1 sudhir admin 4945 Jan 23 21:42 wgetrc
drwxr-xr-x 6 sudhir admin 204 Jan 23 21:47 zookeeper
drwxr-xr-x 15 sudhir admin 510 Jan 23 21:47 kafka
lrwxr-xr-x 1 sudhir admin 15 Jan 23 21:59 etc -> /usr/local/etc/
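The listing above contains a stray etc -> /usr/local/etc/ entry. That is typically what ln leaves behind when the command is re-run against a link name that already resolves to a directory: the new link is created inside the target instead of replacing the old link. A minimal, re-runnable alternative is sketched below. make_link is a hypothetical helper, and pointing ~/symln.kafka at the output of brew --prefix kafka is an assumption, since this article does not show how that link was created.

```shell
# Create or replace the symlink LINK so that it points at TARGET.
# -s symbolic, -f replace an existing link, -n treat an existing
# link to a directory as a plain file instead of descending into it.
make_link() {
    ln -sfn "$1" "$2"
}

# Example (no trailing slash on the link name; the Kafka target is an assumption):
#   make_link /usr/local/etc ~/symln.etc
#   make_link "$(brew --prefix kafka)" ~/symln.kafka
```

Running make_link twice with the same arguments leaves a single link and nothing extra inside the target directory.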
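Step 5 : Now let's start the Kafka server. The commands below are run from ~/symln.kafka, a symlink to the Kafka installation directory (the one that contains bin/).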
Sudhirs-MacBook-Pro:~ sudhir$ cd ~/symln.kafka/
Sudhirs-MacBook-Pro:~ sudhir$ pwd
/Users/sudhir/symln.kafka/
Sudhirs-MacBook-Pro:~ sudhir$ cd ~/symln.kafka/
Sudhirs-MacBook-Pro: sudhir$ ./bin/kafka-server-start ~/symln.etc/kafka/server.properties
After successful execution of the above command, you might see the following result.
Sudhirs-MacBook-Pro: sudhir$ ./bin/kafka-server-start ~/symln.etc/kafka/server.properties
[2017-01-24 22:50:08,731] INFO KafkaConfig values:
advertised.host.name = null
advertised.listeners = null
advertised.port = null
.........
.............
[2017-01-24 22:50:34,751] INFO Completed load of log myTopic-0 with 1 log segments and log end offset 0 in 4 ms (kafka.log.Log)
[2017-01-24 22:50:34,753] INFO Created log for partition [myTopic,0] in /usr/local/var/lib/kafka-logs with properties {compression.type -> producer, message.format.version -> 0.10.1-IV2, file.delete.delay.ms -> 60000, max.message.bytes -> 1000012, min.compaction.lag.ms -> 0, message.timestamp.type -> CreateTime, min.insync.replicas -> 1, segment.jitter.ms -> 0, preallocate -> false, min.cleanable.dirty.ratio -> 0.5, index.interval.bytes -> 4096, unclean.leader.election.enable -> true, retention.bytes -> -1, delete.retention.ms -> 86400000, cleanup.policy -> [delete], flush.ms -> 9223372036854775807, segment.ms -> 604800000, segment.bytes -> 1073741824, retention.ms -> 604800000, message.timestamp.difference.max.ms -> 9223372036854775807, segment.index.bytes -> 10485760, flush.messages -> 9223372036854775807}. (kafka.log.LogManager)
[2017-01-24 22:50:34,754] INFO Partition [myTopic,0] on broker 0: No checkpointed highwatermark is found for partition [myTopic,0] (kafka.cluster.Partition)
[
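The log above already mentions myTopic because the topic existed on this machine. On a fresh install, the console producer in the next step will normally cause the topic to be created automatically, assuming the broker's auto.create.topics.enable setting is left at its default of true. If you prefer to create the topic explicitly, something along these lines should work with the kafka-topics script that Homebrew installs. create_topic, ZK_ADDR and the DRY_RUN switch are illustrative names, and one partition with replication factor 1 is simply a choice that suits a single local broker.

```shell
# Create a single-partition, unreplicated topic via ZooKeeper.
# Set DRY_RUN=1 to print the command instead of running it.
create_topic() {
    ${DRY_RUN:+echo} kafka-topics --create \
        --zookeeper "${ZK_ADDR:-localhost:2181}" \
        --replication-factor 1 --partitions 1 \
        --topic "$1"
}

# Example:
#   create_topic myTopic
#   kafka-topics --list --zookeeper localhost:2181
```

The --list call in the example is a quick way to confirm that the topic now exists.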
Step 6 : Now we are ready to start our consumer and producer, which pull data from and push data to a Kafka topic respectively. (Here we will use the command-line consumer and producer for testing purposes.)
1. Start the consumer
Sudhirs-MacBook-Pro: sudhir$ pwd
/Users/sudhir/symln.kafka/
Sudhirs-MacBook-Pro:~ sudhir$ cd ~/symln.kafka/
Sudhirs-MacBook-Pro: sudhir$ ls -ltr bin/
total 216
-r-xr-xr-x 1 sudhir admin 141 Jan 23 21:47 zookeeper-shell
-r-xr-xr-x 1 sudhir admin 147 Jan 23 21:47 zookeeper-server-stop
-r-xr-xr-x 1 sudhir admin 148 Jan 23 21:47 zookeeper-server-start
-r-xr-xr-x 1 sudhir admin 154 Jan 23 21:47 zookeeper-security-migration
-r-xr-xr-x 1 sudhir admin 151 Jan 23 21:47 kafka-verifiable-producer
-r-xr-xr-x 1 sudhir admin 151 Jan 23 21:47 kafka-verifiable-consumer
-r-xr-xr-x 1 sudhir admin 138 Jan 23 21:47 kafka-topics
-r-xr-xr-x 1 sudhir admin 157 Jan 23 21:47 kafka-streams-application-reset
-r-xr-xr-x 1 sudhir admin 153 Jan 23 21:47 kafka-simple-consumer-shell
-r-xr-xr-x 1 sudhir admin 143 Jan 23 21:47 kafka-server-stop
-r-xr-xr-x 1 sudhir admin 144 Jan 23 21:47 kafka-server-start
-r-xr-xr-x 1 sudhir admin 141 Jan 23 21:47 kafka-run-class
-r-xr-xr-x 1 sudhir admin 152 Jan 23 21:47 kafka-replica-verification
-r-xr-xr-x 1 sudhir admin 151 Jan 23 21:47 kafka-replay-log-producer
-r-xr-xr-x 1 sudhir admin 151 Jan 23 21:47 kafka-reassign-partitions
-r-xr-xr-x 1 sudhir admin 150 Jan 23 21:47 kafka-producer-perf-test
-r-xr-xr-x 1 sudhir admin 158 Jan 23 21:47 kafka-preferred-replica-election
-r-xr-xr-x 1 sudhir admin 144 Jan 23 21:47 kafka-mirror-maker
-r-xr-xr-x 1 sudhir admin 150 Jan 23 21:47 kafka-consumer-perf-test
-r-xr-xr-x 1 sudhir admin 155 Jan 23 21:47 kafka-consumer-offset-checker
-r-xr-xr-x 1 sudhir admin 147 Jan 23 21:47 kafka-consumer-groups
-r-xr-xr-x 1 sudhir admin 148 Jan 23 21:47 kafka-console-producer
-r-xr-xr-x 1 sudhir admin 148 Jan 23 21:47 kafka-console-consumer
-r-xr-xr-x 1 sudhir admin 139 Jan 23 21:47 kafka-configs
-r-xr-xr-x 1 sudhir admin 136 Jan 23 21:47 kafka-acls
-r-xr-xr-x 1 sudhir admin 144 Jan 23 21:47 connect-standalone
-r-xr-xr-x 1 sudhir admin 145 Jan 23 21:47 connect-distributed
Run the Kafka consumer in a new terminal. (It will print each message to the same console as soon as a producer publishes data to the Kafka topic named “myTopic”.)
Sudhirs-MacBook-Pro: sudhir$ ./bin/kafka-console-consumer --zookeeper localhost:2181 --topic myTopic --from-beginning
Using the ConsoleConsumer with old consumer is deprecated and will be removed in a future major release. Consider using the new consumer by passing [bootstrap-server] instead of [zookeeper].
The consumer has now started, but we don’t see any messages being consumed because we haven’t started the producer or published any data yet.
2. Now start the producer in a new terminal. (It will publish data to the Kafka topic “myTopic”, to which the consumer is already subscribed.)
Sudhirs-MacBook-Pro:0.10.1.0 sudhir$ ./bin/kafka-console-producer --broker-list localhost:9092 --topic myTopic
Now type something like “Hello Kafka” and press Enter.
Sudhirs-MacBook-Pro:0.10.1.0 sudhir$ ./bin/kafka-console-producer --broker-list localhost:9092 --topic myTopic
Hello Kafka
Now look at the consumer terminal, where you can see that the consumer has received the data published by the producer.
Sudhirs-MacBook-Pro: sudhir$ ./bin/kafka-console-consumer --zookeeper localhost:2181 --topic myTopic --from-beginning
Using the ConsoleConsumer with old consumer is deprecated and will be removed in a future major release. Consider using the new consumer by passing [bootstrap-server] instead of [zookeeper].
Hello Kafka
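The deprecation warning in the log above suggests passing --bootstrap-server instead of --zookeeper. A hedged sketch of the equivalent invocation follows. It assumes the broker listens on localhost:9092 (the same address the producer used above), and consume_topic, BROKER_ADDR and DRY_RUN are illustrative names rather than anything Kafka provides.

```shell
# Read a topic from the beginning using the new consumer, which talks
# to the broker directly instead of going through ZooKeeper.
# Set DRY_RUN=1 to print the command instead of running it.
consume_topic() {
    ${DRY_RUN:+echo} kafka-console-consumer \
        --bootstrap-server "${BROKER_ADDR:-localhost:9092}" \
        --topic "$1" --from-beginning
}

# Example:
#   consume_topic myTopic
```

With this form the warning should no longer appear, because the consumer no longer connects to ZooKeeper itself.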
Conclusion :
This is just the beginning with Kafka. There is much more to get into.
Kafka provides the following:
- Persistent messaging with constant-time, O(1), performance even with terabytes of stored messages.
- High throughput.
- Partitioned messaging service in a distributed environment.
- Support for parallel data streaming & loading into Hadoop.
Note : Performance, latency & throughput may differ depending on some of the crucial configuration parameters chosen at Kafka setup.
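As a starting point for exploring those parameters, it helps to see what the broker is currently configured with. The sketch below prints selected keys from a .properties file, ignoring commented-out lines. show_props is a hypothetical helper, and the three keys in the example are just common entries from the stock server.properties, not a tuning recommendation.

```shell
# Print KEY=VALUE for each requested key found in a .properties file.
# Commented-out lines do not match because their key starts with "#".
show_props() {
    file=$1
    shift
    for key in "$@"; do
        awk -F= -v k="$key" '$1 == k { print; found = 1 } END { exit !found }' "$file" ||
            echo "$key (not set, broker default applies)"
    done
}

# Example:
#   show_props /usr/local/etc/kafka/server.properties \
#       num.partitions log.retention.hours log.segment.bytes
```

Keys that are missing from the file are reported explicitly, which makes it easier to tell an override from a default.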
I will try to put some use cases regarding those parameters in a different post.
Thanks for visiting the article & hope it helps.