Running Apache Kafka for Continuous Data Streaming on Oracle Cloud Infrastructure

Goal

The goal of this blog is to set up a Docker-based Apache Kafka instance on Oracle Cloud Infrastructure, create topics, and post messages to those topics. Kafka is an important architectural component in cloud-based solutions where a company needs to monitor continuous data streams coming from various sources and take action based on them. Because Kafka is designed to handle high volumes of data, it can replicate topic data across multiple brokers in a cluster.

Start from Docker Hub: Choosing the Right Docker Image

The most commonly recommended image is landoop/fast-data-dev, but you can choose another image depending on your requirements.

On Oracle Cloud Infrastructure, Open the Following Ports

The service ports we need are 2181, 3030, the range 8081-8083, the range 9581-9585, and 9092, all of type TCP with the source CIDR set to 0.0.0.0/0 (public internet).

  • 3030 : Web Server
  • 9092 : Kafka Broker
  • 9581 : JMX
  • 8081 : Schema Registry
  • 9582 : JMX
  • 8082 : Kafka REST Proxy
  • 9583 : JMX
  • 8083 : Kafka Connect Distributed
  • 9584 : JMX
  • 2181 : ZooKeeper
  • 9585 : JMX
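
Each port in the list above corresponds to one Docker -p publish flag. As a rough sketch, the flags can be generated from the port list so that the ingress rules and the docker run command stay in sync. The helper name port_flags is made up for this illustration, and publicip is a placeholder for your instance's public IP address.

```shell
#!/usr/bin/env bash
# port_flags is a hypothetical helper (not part of Docker or the image):
# it prints one "-p PORT:PORT" flag for every port passed as an argument.
port_flags() {
  local flags=""
  local port
  for port in "$@"; do
    # Append a space before every flag except the first one.
    flags="${flags:+$flags }-p ${port}:${port}"
  done
  printf '%s\n' "$flags"
}

# The same ports that are opened in the ingress rules.
PORTS="2181 3030 8081 8082 8083 9092 9581 9582 9583 9584 9585"

# Print (not run) the resulting command; word splitting of $PORTS is intended.
echo "docker run -d $(port_flags $PORTS) -e ADV_HOST=publicip landoop/fast-data-dev"
```

The script only prints the command, so it is safe to run anywhere before copying the output to the VM.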

 

Downloading the Docker Image

ubuntu@dockervm:~$ docker run --rm -it -p 2181:2181 -p 3030:3030 -p 8081:8081 -p 8082:8082 -p 8083:8083 -p 9092:9092 -e ADV_HOST=publicip landoop/fast-data-dev
Unable to find image 'landoop/fast-data-dev:latest' locally
latest: Pulling from landoop/fast-data-dev
4fe2ade4980c: Pull complete
dde3e13a0db6: Pull complete
dd15dddd8645: Pull complete
256d5aeb3e41: Pull complete
37b56afc3b63: Pull complete
27190792d7ca: Pull complete
ea5a492b068c: Pull complete
9dfe942e8ef5: Pull complete
2ac9c1033b19: Pull complete
b7c155be622e: Pull complete
d127608652bd: Pull complete
f5ec747a68a7: Pull complete
05a2813fc7e6: Pull complete
Digest: sha256:3ffe2f11a0cf4f2cf380668fc26747fa00c73f5b054d57a6649438ea38c17da0
Status: Downloaded newer image for landoop/fast-data-dev:latest
Setting advertised host to 132.145.169.51.
Starting services.
This is Landoop’s fast-data-dev. Kafka 1.1.1-L0 (Landoop's Kafka Distribution).
You may visit http://132.145.169.51:3030 in about a minute.
2018-11-12 13:41:50,344 INFO Included extra file "/etc/supervisord.d/01-zookeeper.conf" during parsing
2018-11-12 13:41:50,344 INFO Included extra file "/etc/supervisord.d/02-broker.conf" during parsing
2018-11-12 13:41:50,344 INFO Included extra file "/etc/supervisord.d/03-schema-registry.conf" during parsing
2018-11-12 13:41:50,344 INFO Included extra file "/etc/supervisord.d/04-rest-proxy.conf" during parsing
2018-11-12 13:41:50,345 INFO Included extra file "/etc/supervisord.d/05-connect-distributed.conf" during parsing
2018-11-12 13:41:50,345 INFO Included extra file "/etc/supervisord.d/06-caddy.conf" during parsing
2018-11-12 13:41:50,345 INFO Included extra file "/etc/supervisord.d/07-smoke-tests.conf" during parsing
2018-11-12 13:41:50,345 INFO Included extra file "/etc/supervisord.d/08-logs-to-kafka.conf" during parsing
2018-11-12 13:41:50,345 INFO Included extra file "/etc/supervisord.d/99-supervisord-sample-data.conf" during parsing
2018-11-12 13:41:50,345 INFO Set uid to user 0 succeeded
2018-11-12 13:41:50,363 INFO RPC interface 'supervisor' initialized
2018-11-12 13:41:50,363 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2018-11-12 13:41:50,363 INFO supervisord started with pid 9
2018-11-12 13:41:51,367 INFO spawned: 'sample-data' with pid 166
2018-11-12 13:41:51,369 INFO spawned: 'zookeeper' with pid 167
2018-11-12 13:41:51,371 INFO spawned: 'caddy' with pid 168
2018-11-12 13:41:51,375 INFO spawned: 'broker' with pid 169
2018-11-12 13:41:51,380 INFO spawned: 'smoke-tests' with pid 170
2018-11-12 13:41:51,392 INFO spawned: 'connect-distributed' with pid 172
2018-11-12 13:41:51,407 INFO spawned: 'logs-to-kafka' with pid 174
2018-11-12 13:41:51,422 INFO spawned: 'schema-registry' with pid 175
2018-11-12 13:41:51,425 INFO spawned: 'rest-proxy' with pid 178
2018-11-12 13:41:52,470 INFO success: sample-data entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-11-12 13:41:52,470 INFO success: zookeeper entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-11-12 13:41:52,470 INFO success: caddy entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-11-12 13:41:52,470 INFO success: broker entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-11-12 13:41:52,470 INFO success: smoke-tests entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-11-12 13:41:52,470 INFO success: connect-distributed entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-11-12 13:41:52,470 INFO success: logs-to-kafka entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-11-12 13:41:52,470 INFO success: schema-registry entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-11-12 13:41:52,470 INFO success: rest-proxy entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)

Create a Writable Docker Container Using the docker run Command

You can exit the shell with Ctrl+C and then run the command below to start the Docker container in detached mode.

ubuntu@dockervm:~$ docker run -d -p 2181:2181 -p 3030:3030 -p 8081-8083:8081-8083 -p 9581-9585:9581-9585 -p 9092:9092 -e ADV_HOST=publicipaddress -e RUNNING_SAMPLEDATA=1 landoop/fast-data-dev
097202792f00d51902a04d29c105399bfaeabaf79e5ffa24cd612c60140aeb22

Run the Docker Container in Bash Mode

ubuntu@dockervm:~$ docker run --rm -it --net=host landoop/fast-data-dev bash
// Create Topic
root@fast-data-dev / $ kafka-topics --zookeeper public_ipaddress:2181 --create --topic cricket_topic --partitions 3 --replication-factor 1
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
Created topic "cricket_topic".
// To avoid the warning, use a topic name without periods or underscores
root@fast-data-dev / $ kafka-topics --zookeeper public_ipaddress:2181 --create --topic soccertopic --partitions 3 --replication-factor 1
Created topic "soccertopic".
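
The warning above is printed whenever a topic name contains a period or an underscore. A small sketch of a pre-check that tells you in advance whether a name will trigger it might look like this; the function name has_collision_chars is chosen for this example and is not a Kafka command.

```shell
#!/usr/bin/env bash
# has_collision_chars is a hypothetical helper for this post.
# It returns 0 (true) when the name contains '.' or '_', the two
# characters named in the kafka-topics warning.
has_collision_chars() {
  case "$1" in
    *[._]*) return 0 ;;
    *)      return 1 ;;
  esac
}

# Check the two topic names used in this post.
for topic in cricket_topic soccertopic; do
  if has_collision_chars "$topic"; then
    echo "$topic: expect the metric-name warning"
  else
    echo "$topic: no warning expected"
  fi
done
```

The topic is still created in both cases; the check only helps you pick one naming convention and stick to it.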
// Post Messages to the Given Topic
ubuntu@dockervm:~$ docker run --rm -it --net=host landoop/fast-data-dev bash
root@fast-data-dev / $ kafka-console-producer --broker-list public_ipaddress:9092 --topic cricket_topic
>this is message 1
>message 2
>message 3
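
The console producer reads one message per line from standard input, so messages can also be piped in instead of typed at the prompt. The sketch below assumes it is run inside the fast-data-dev shell; gen_messages and the BROKER variable are names invented for this example, and public_ipaddress is the same placeholder used above.

```shell
#!/usr/bin/env bash
# gen_messages is a hypothetical helper: it prints N numbered test
# messages, one per line.
gen_messages() {
  local n="$1"
  local i=1
  while [ "$i" -le "$n" ]; do
    echo "message $i"
    i=$((i + 1))
  done
}

# BROKER is a placeholder; replace it with your instance's public IP and port.
BROKER="${BROKER:-public_ipaddress:9092}"

if command -v kafka-console-producer >/dev/null 2>&1; then
  # Inside the fast-data-dev shell: send three messages, then exit at end of input.
  gen_messages 3 | kafka-console-producer --broker-list "$BROKER" --topic cricket_topic
else
  # Outside the container: just show what would be sent.
  gen_messages 3
fi
```

To read the messages back from the same shell, kafka-console-consumer --bootstrap-server public_ipaddress:9092 --topic cricket_topic --from-beginning should print them; press Ctrl+C to stop it.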
// Describe the Topic
root@fast-data-dev / $ kafka-topics --zookeeper public_ipaddress:2181 --describe --topic cricket_topic
Topic:cricket_topic PartitionCount:3 ReplicationFactor:1 Configs:
 Topic: cricket_topic Partition: 0 Leader: 0 Replicas: 0 Isr: 0
 Topic: cricket_topic Partition: 1 Leader: 0 Replicas: 0 Isr: 0
 Topic: cricket_topic Partition: 2 Leader: 0 Replicas: 0 Isr: 0

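
The describe output is easy to post-process when you only care about which broker leads each partition. Here is a minimal sketch using awk; partition_leaders is a helper name made up for this post, and the sample input is the output shown above. It assumes the output keeps this "Partition: N Leader: N" layout.

```shell
#!/usr/bin/env bash
# partition_leaders is a hypothetical helper: it reads kafka-topics
# --describe output on stdin and prints one line per partition.
partition_leaders() {
  awk '{
    p = ""; l = ""
    # Look for the "Partition:" and "Leader:" labels and take the next field.
    for (i = 1; i < NF; i++) {
      if ($i == "Partition:") p = $(i + 1)
      if ($i == "Leader:")    l = $(i + 1)
    }
    # The summary line has neither label, so it is skipped.
    if (p != "" && l != "") print "partition " p " leader " l
  }'
}

# Sample input copied from the describe output in this post.
partition_leaders <<'EOF'
Topic:cricket_topic PartitionCount:3 ReplicationFactor:1 Configs:
 Topic: cricket_topic Partition: 0 Leader: 0 Replicas: 0 Isr: 0
 Topic: cricket_topic Partition: 1 Leader: 0 Replicas: 0 Isr: 0
 Topic: cricket_topic Partition: 2 Leader: 0 Replicas: 0 Isr: 0
EOF
```

Inside the container, the same function can be fed directly: kafka-topics --zookeeper public_ipaddress:2181 --describe --topic cricket_topic | partition_leaders. With a single broker every partition reports leader 0, as in the output above.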
Reality Check: Kafka Web UI

Access the web UI at http://public_ipaddress:3030, replacing public_ipaddress with the public IP address of your instance.

Wait 2 to 3 minutes for the health checks to complete.
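
Instead of watching the clock, you can poll the web UI until it answers. The following is only a sketch: retry is a helper written for this post, KAFKA_UI_URL is a made-up environment variable, and the check assumes that an HTTP success response on port 3030 means the UI is reachable (the health checks themselves may still be running).

```shell
#!/usr/bin/env bash
# retry MAX_TRIES DELAY_SECONDS COMMAND...
# Hypothetical helper: runs COMMAND until it succeeds or MAX_TRIES
# attempts have been made (the command always runs at least once).
retry() {
  local tries="$1"
  local delay="$2"
  local attempt=1
  shift 2
  while :; do
    "$@" && return 0
    [ "$attempt" -ge "$tries" ] && return 1
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# The poll runs only when KAFKA_UI_URL is set, for example:
#   KAFKA_UI_URL=http://public_ipaddress:3030 ./wait-for-ui.sh
if [ -n "${KAFKA_UI_URL:-}" ]; then
  # 18 attempts, 10 seconds apart: roughly 3 minutes in total.
  if retry 18 10 curl --silent --fail --output /dev/null "$KAFKA_UI_URL"; then
    echo "Web UI is up"
  else
    echo "Web UI did not respond in time"
  fi
fi
```

The script name wait-for-ui.sh in the comment is just an example; save the block under any name you like.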

View Test Results

View Topics and Corresponding Messages

JSON Format