A new “Kafka” novel : the OpenShift & Kubernetes deployment

This blog post isn’t meant to be an exhaustive tutorial on how to deploy Apache Kafka in an OpenShift or Kubernetes cluster; it’s just the story of my journey towards a “working” deployment, to be used as a starting point and improved over time as a daily work in progress. This journey started with Apache Kafka 0.8.0, went through 0.9.0 and finally reached the current 0.10.1.0 version.

From “stateless” to “stateful”

One of the main reasons to use a platform like OpenShift/Kubernetes (let me use OS/K8S from now on) is the scalability we get for our deployed applications. With “stateless” applications there are few problems in using such a platform for a Cloud deployment; every time an application instance crashes or needs to be restarted (and/or relocated to a different node), just spin up a new instance without any relationship with the previous one and your deployment will continue to work properly as before. There is no need for the new instance to have information or state related to the previous one.

It’s also true that, out there, we have a lot of different applications which need to persist state information if something goes wrong in the Cloud and they need to be restarted. Such applications are “stateful” by nature and their “story” is important so that just spinning up a new instance isn’t enough.

The main challenges we have with the OS/K8S platform are :

  • pods are scaled out and scaled in through Replica Sets (or using a Deployment object)
  • pods will be assigned an arbitrary name at runtime
  • pods may be restarted and relocated (on a different node) at any point in time
  • pods may never be referenced directly by name or IP address
  • a service selects a set of pods that match specific criteria and exposes them through a well-defined endpoint

All the above considerations aren’t a problem for “stateless” applications but they are for “stateful” ones.

The difference between them is also known as the “Pets vs Cattle” meme, where “stateless” applications are just a herd of cattle and when one of them dies, you can just replace it with a new one having the same characteristics but not exactly the same (of course !); the “stateful” applications are like pets, you have to take care of them and you can’t just replace a pet if it dies 😦

Just as a reference, you can read about the history of “Pets vs Cattle” in this article.

Apache Kafka is one of these types of applications … it’s a pet … which needs to be handled with care. Today, we know that OS/K8S offers Stateful Sets (previously known as Pet Sets … for clear reasons!) that can be used in this scenario, but I started this journey when they didn’t exist (or weren’t released yet), so I’d like to share with you my story, the main problems I encountered and how I solved them (you’ll see that I have “emulated” something that Stateful Sets offer today out of the box).

Let’s start with a simple architecture

Let’s start in a very simple way, using a Replica Set (only one replica) for the Zookeeper server and the related service, and a Replica Set (with three replicas) for the Kafka servers and the related service.

[Figure: reference architecture, first version]

The Kafka Replica Set has three replicas for “quorum” and leader election (and for topic replication as well). The Kafka service is needed to expose the Kafka servers to clients. Each Kafka server needs :

  • unique ID (for Zookeeper)
  • advertised host/port (for clients)
  • logs directory (for storing topic partitions)
  • Zookeeper info (for connection)

The first approach is to use dynamic broker id generation, so that when a Kafka server starts and connects to Zookeeper, a new broker id is generated and assigned to it. The advertised host/port are just the container IP and the fixed 9092 port, while the logs directory is predefined (in the configuration file). Finally, the Zookeeper connection info is provided by the related Zookeeper service through the environment variables that OS/K8S creates for us (ZOOKEEPER_SERVICE_HOST and ZOOKEEPER_SERVICE_PORT).
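To make this first approach more concrete, here is a minimal Python sketch of how a container startup script could derive the broker settings from the environment. It is not the actual script from my deployment: the function names, the default Zookeeper port and the /kafka-logs path are assumptions made for illustration, while the property names (zookeeper.connect, advertised.host.name, advertised.port, log.dirs) are the ones used by Kafka 0.10.

```python
import os


def zookeeper_connect(env=os.environ, default_port="2181"):
    """Build the value for Kafka's zookeeper.connect property from the
    variables that OS/K8S injects for the Zookeeper service."""
    host = env["ZOOKEEPER_SERVICE_HOST"]
    port = env.get("ZOOKEEPER_SERVICE_PORT", default_port)
    return f"{host}:{port}"


def broker_overrides(env=os.environ, container_ip="127.0.0.1"):
    """Return the broker settings as a dict of property name -> value."""
    return {
        "zookeeper.connect": zookeeper_connect(env),
        "advertised.host.name": container_ip,  # the container IP
        "advertised.port": "9092",             # the fixed port
        "log.dirs": "/kafka-logs",             # hypothetical predefined path
    }
```

The resulting dictionary could then be rendered into the server properties file before launching the Kafka server.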

Let’s consider the following use case with a topic (1 partition and 3 replicas). The initial situation is having Kafka servers with broker id 1001, 1002, 1003 and the topic with current state :

  • leader : 1001
  • replicas : 1001, 1002, 1003
  • ISR : 1001, 1002, 1003

It means that clients need to connect to 1001 for sending/receiving messages for the topic, and that 1002 and 1003 are followers which keep the topic replicated in order to handle failures.

Now, imagine that the Kafka server 1003 crashes and a new instance is just started. The topic description becomes :

  • leader : 1001
  • replicas : 1001, 1002, 1003 <– it’s still here !
  • ISR  : 1001, 1002 <– that’s right, 1003 is not “in-sync”

Zookeeper still sees broker 1003 as a host for one of the topic replicas, but not “in-sync” with the others. Meanwhile, the newly started Kafka server has a new auto-generated id, 1004. Manual script execution (kafka-reassign-partitions.sh for changing the replica list, then kafka-preferred-replica-election.sh for the leader election) is needed in order to :

  • add 1004 to the replicas
  • remove 1003 from the replicas
  • trigger a new leader election for the replicas

[Figure: use case with auto-generated broker id]
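Just to give an idea of what this manual step involves, the replica change can be described with the JSON plan that Kafka’s kafka-reassign-partitions.sh tool consumes. The following Python sketch builds such a plan for the use case above; the topic name “my-topic” and the helper function names are hypothetical.

```python
import json


def replace_broker(replicas, old_id, new_id):
    """Return the replica list with old_id swapped for new_id."""
    return [new_id if broker == old_id else broker for broker in replicas]


def reassignment_plan(topic, partition, replicas):
    """Build a plan in the JSON format read by kafka-reassign-partitions.sh."""
    return {
        "version": 1,
        "partitions": [
            {"topic": topic, "partition": partition, "replicas": list(replicas)}
        ],
    }


# Broker 1003 is gone and the new instance got the id 1004.
plan = reassignment_plan(
    "my-topic", 0, replace_broker([1001, 1002, 1003], 1003, 1004)
)
print(json.dumps(plan))
```

The printed JSON would be saved to a file and passed to the tool through its --reassignment-json-file option together with --execute. It works, but it’s exactly the kind of manual operation we want to avoid on a platform that restarts pods on its own.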

So what does it mean ?

First of all, the new Kafka server instance needs to have the same id as the previous one and, of course, the same data, that is the partition replica of the topic. For this purpose, a persistent volume can be the solution: it is used, through a claim, by the Replica Set for storing the logs directories of all the Kafka servers (i.e. /kafka-logs-<broker-id>). It’s important to know that, by Kafka design, a logs directory contains a “lock” file which is locked by the server that owns the directory.

To find the “next” broker id to use, avoiding auto-generation and getting the same data (logs directory) as the previous instance, a script (in my case a Python one) can be run on container startup before launching the related Kafka server.

In particular, the script :

  • searches for a free “lock” file in order to reuse the broker id for the new Kafka server instance …
  • … otherwise a new broker id is used and a new logs directory is created
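The two steps above can be sketched as follows. This is not my original script but a minimal reconstruction under some assumptions: the logs directories are named kafka-logs-<broker-id> under a common base directory, the lock file is named .lock, and a running broker holds a POSIX advisory lock on it (so the probe only works on Unix-like systems). The function names and the injectable probe parameter are hypothetical.

```python
import fcntl
import os
import re

LOGS_DIR = re.compile(r"^kafka-logs-(\d+)$")


def is_free(lock_path):
    """Probe a lock file: True if no other process currently holds it."""
    with open(lock_path, "a") as lock_file:  # creates the file if missing
        try:
            fcntl.lockf(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return False  # another process (a running broker) owns it
        fcntl.lockf(lock_file, fcntl.LOCK_UN)
        return True


def pick_broker_id(base_dir, first_id=1001, probe=is_free):
    """Reuse the lowest broker id whose logs directory is free,
    otherwise create a new logs directory with the next id."""
    known_ids = sorted(
        int(m.group(1)) for m in map(LOGS_DIR.match, os.listdir(base_dir)) if m
    )
    for broker_id in known_ids:
        logs_dir = os.path.join(base_dir, f"kafka-logs-{broker_id}")
        if probe(os.path.join(logs_dir, ".lock")):
            return broker_id, logs_dir
    new_id = known_ids[-1] + 1 if known_ids else first_id
    new_dir = os.path.join(base_dir, f"kafka-logs-{new_id}")
    os.makedirs(new_dir)
    return new_id, new_dir
```

Note that the probe releases the lock before returning, so two containers starting at exactly the same time could still pick the same id; a real script would have to deal with that race.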

Using this approach, we obtain the following result for the previous use case :

  • the newly started Kafka server instance acquires the broker id 1003 (the same as the previous one)
  • it’s just automatically part of the replicas and ISR

[Figure: use case with broker id reused through the “lock” file]

But … what about the Zookeeper side ?

In this deployment, the Zookeeper Replica Set has only one replica and the service is needed to allow connections from the Kafka servers. What happens if Zookeeper crashes (application or node failure) ? The OS/K8S platform just starts a new instance (not necessarily on the same node), but what I see is that the currently running Kafka servers can’t connect to the new Zookeeper instance even though it is reachable at the same IP address (thanks to the service). The Zookeeper server closes the connections after an initial handshake, probably because of some information about the Kafka servers that Zookeeper stores locally. When a new instance is started, this information is lost !

Even in this case, using a persistent volume for the Zookeeper Replica Set is a solution. It’s used for storing the data directory that will be the same for each instance restarted; the new instance just finds Kafka servers information in the volume and grants connections to them.

[Figure: reference architecture, first version, with a persistent volume for Zookeeper]

When the Stateful Sets were born !

At some point the OS/K8S platform started to offer Pet Sets (introduced in the Kubernetes 1.3 release and renamed Stateful Sets in the 1.5 release), a sort of Replica Sets but for “stateful” applications. But … what do they offer ?

First of all, each “pet” has a stable hostname that is always resolved by DNS. Each “pet” is assigned a name with an ordinal index number (e.g. kafka-0, kafka-1, …) and, finally, stable storage is linked to that hostname/ordinal index number.

It means that every time a “pet” crashes and it’s restarted, the new one will be the same : same hostname, same name with ordinal index number and same attached storage. The previous running situation is fully recovered and the new instance is exactly the same as the previous one. You could see them as something that I tried to emulate with my scripts on container startup.
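For example, with a stable name like kafka-0 or kafka-1, the broker id no longer needs to be searched for: it can be derived from the ordinal index. The following is a minimal Python sketch of this idea; the function names and the optional offset are assumptions, not something a Stateful Set provides by itself.

```python
import re
import socket


def ordinal_from_hostname(hostname):
    """Extract the ordinal index from a name like 'kafka-2'.
    A fully qualified name is accepted: only the first label is used."""
    pod_name = hostname.split(".")[0]
    match = re.search(r"-(\d+)$", pod_name)
    if match is None:
        raise ValueError(f"no ordinal index in hostname {hostname!r}")
    return int(match.group(1))


def broker_id(hostname=None, offset=0):
    """Map the pod ordinal to a broker id; the same pod always gets the same id."""
    return offset + ordinal_from_hostname(hostname or socket.gethostname())
```

With this mapping, a restarted kafka-2 comes back with the same broker id and finds its own storage attached, which is what my “lock” file search was trying to achieve.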

So today, my current Kafka servers deployment has :

  • a Stateful Set with three replicas for the Kafka servers
  • a “headless” service (so without an assigned cluster IP), which is needed for having the Stateful Set working (so for DNS hostname resolution)
  • a “regular” service for providing access to the Kafka servers from clients
  • one persistent volume for each Kafka server, with a claim template defined in the Stateful Set declaration
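Thanks to the “headless” service, each pod of the Stateful Set gets a predictable DNS name of the form <pod-name>.<service-name>.<namespace>.svc.<cluster-domain>. As a small illustration, this Python sketch generates those names for the three Kafka servers; the service name “kafka-headless”, the namespace “myproject” and the function names are made up for the example.

```python
def pod_dns_names(statefulset, headless_service, namespace, replicas,
                  cluster_domain="cluster.local"):
    """Return the stable DNS name of each pod in a Stateful Set."""
    return [
        f"{statefulset}-{index}.{headless_service}.{namespace}.svc.{cluster_domain}"
        for index in range(replicas)
    ]


def bootstrap_servers(names, port=9092):
    """Join the pod names into a host:port list usable by Kafka clients
    running inside the cluster."""
    return ",".join(f"{name}:{port}" for name in names)


names = pod_dns_names("kafka", "kafka-headless", "myproject", 3)
print(bootstrap_servers(names))
```

These names stay the same across restarts, so they are also good candidates for the advertised host of each Kafka server.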

[Figure: reference architecture with Stateful Sets]

Other than being a better implementation 🙂 … the current solution no longer uses a single persistent volume for all the Kafka servers (with a logs directory for each of them); instead, each “pet” has its own dedicated persistent storage.

It’s great to read about it but … I want to try … I want to play !

You’re right, I told you about my journey, which isn’t finished yet, but you would like to try … to play with some stuff for having Apache Kafka deployed on OS/K8S.

I called this project Barnabas, like one of the main characters in Franz Kafka’s novel “The Castle”, who was a … messenger :-). It’s part of the bigger EnMasse project, which provides a scalable messaging as a service (MaaS) infrastructure running on OS/K8S.

The repo provides different deployment types : from the “handmade” solution (based on bash and Python scripts) to the current Stateful Sets solution that I’ll improve in the coming weeks.

The great thing about that (in the context of the overall EnMasse project) is that today I’m able to use standard protocols like AMQP and MQTT to communicate with an Apache Kafka cluster (using an AMQP bridge and an MQTT gateway) for all the use cases where using Kafka makes more sense than traditional messaging brokers … which, for their part, have a lot of stories and different scenarios to tell 😉

Do you want to know more about that ? The Red Hat Summit 2017 (Boston, May 2-4) could be a good place, where Christian Posta (Principal Architect, Red Hat) and I will present the session “Red Hat JBoss A-MQ and Apache Kafka : which to use ?” … so what are you waiting for ? See you there !

Vert.x and IoT in Rome : what a meetup !

Yesterday I had a great day in Rome for a meetup hosted by Meet{cast} (powered by dotnetpodcast community) and Codemotion, speaking about Vert.x and how we can use it for developing “end to end” Internet of Things solutions.

I started with a high level introduction to Vert.x (how it works, its internals and its main usage), then I moved on to dig into some specific components useful for developing IoT applications, like the MQTT server, AMQP Proton and the Kafka client.

It was interesting to know that even in Italy a lot of developers and companies are moving to use Vert.x for developing microservices based solutions. A lot of interesting questions came out … people seem to like it !

Finally, in order to show Vert.x usage in enterprise applications, I presented two real use cases that work today thanks to the above components : Eclipse Hono and EnMasse. I had little time to explain in detail how EnMasse works, the Qpid Dispatch Router component in particular, and for this reason I hope to have a future meetup on that; the AMQP router concept is still quite new ! In any case, knowing that such a scalable platform is based (also) on Vert.x was great news for the attendees.

If you are interested in knowing more about that, you can take a look at the slides and the demo. In the coming days, the video of the meetup will be available online but it will be in Italian (my apologies to my English-only friends :-)). Hope you’ll enjoy the content !

Of course, I had some networking with attendees after the meetup and … with some beer 🙂

“Reactive Internet of Things : the Vert.x way” … meetup in Rome !

On March 16th I’ll be a guest of the Meet{cast} and Codemotion communities for a meetup in Rome, speaking about … “Reactive Internet of Things : the Vert.x way”.

It’s a pleasure for me to show how the Vert.x toolkit can be used for developing Internet of Things solutions, leveraging the pillars of the reactive manifesto (responsive, elastic, resilient and asynchronous).

Starting from an introduction to what Vert.x is, what it provides and its main features, I’ll move on to the messaging and IoT focused components that the toolkit offers. So we’ll see the new MQTT server and Kafka client (officially in the latest 3.4.0 Beta 1 release) and the well known AMQP Proton and Bridge components. Of course …. demos around them !

Finally, I’ll show how these components are already used today for enterprise IoT solutions, introducing the Eclipse Hono project, for handling IoT connectivity, and EnMasse, which provides a Message as a Service platform. The great thing is that we’ll have the chance to see the code because … they are open source of course !

So … what are you waiting for … register for the meetup here ! See you in Rome 😉

Eclipse Hono : “Connect. Command. Control” … even on OpenShift !

The Eclipse Foundation is the main open source community which has a great focus on the Internet of Things world and the related Eclipse IoT ecosystem involves a lot of different projects and partners like Red Hat, Bosch, Eurotech, IBM and many more. Recently, publishing this white paper, they showed a full stack with all the available tools for building a complete IoT solution from devices to the Cloud through the gateways.

On the Cloud side, one of the main problems in the IoT world is the ability to handle millions of connections from devices (and gateways) in the field, their registration for managing authentication and authorisation and, last but not least, the ingestion of the data received from them, like telemetry or events. Finally, the last point is about controlling these devices by sending them commands in order to execute actions in the environment around them or to upgrade their software and configuration.

The Eclipse Hono™ project is the answer to these problems !

The APIs

From the official web site, we can read :

Eclipse Hono™ provides remote service interfaces for connecting large numbers of IoT devices to a back end and interacting with them in a uniform way regardless of the device communication protocol.

The mantra from the landing page of the web site project is “Connect. Command. Control” which is made a reality through a well defined API for :

  • Registration : for handling the requests related to the registration (so creation) of a new device so that it can be authorised to connect. It’s also possible to retrieve information about registered devices or delete them;
  • Telemetry : for the ingestion of a large volume of data from devices (e.g. sensors), made available for analysis to the backend applications;
  • Event : for receiving specific events (e.g. alarms, notifications, …) from devices in order to make decisions on the Cloud side. This API is quite similar to the telemetry path but it uses a “different” channel in order to avoid such events going through the overwhelmed telemetry one;
  • Command & Control : for sending commands to the devices (from a backend application) for executing operations and actions in the field (receiving the related response) and/or upgrading local software and configuration;

All the above APIs are accessible through the AMQP 1.0 protocol (the only standardised AMQP version!), which means that they are defined in terms of the addresses to which devices need to connect for interacting with the system and of the properties/content of the messages; of course, this is true not only for devices but also for business and backend applications, which can receive data from devices or send them commands. In any case, it doesn’t mean that devices which aren’t able to speak this protocol can’t connect : they can leverage the protocol adapters provided by Hono; the current implementation provides MQTT and HTTP REST adapters.

All these APIs are designed to allow multi-tenancy, so that with a single deployment it’s possible to handle channels for different tenants and none of them can see data or messages exchanged by the others.
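To give a feeling of how APIs defined “in terms of addresses” and multi-tenancy can fit together, here is a small Python sketch where each address is scoped by tenant. Take it only as an illustration : the <endpoint>/<tenant> layout, the endpoint names and the function names are my assumptions for this example, not the normative Hono address scheme.

```python
# Hypothetical endpoint names, one per API described above.
ENDPOINTS = ("registration", "telemetry", "event", "control")


def address(endpoint, tenant):
    """Build a tenant-scoped address such as 'telemetry/tenant-a'."""
    if endpoint not in ENDPOINTS:
        raise ValueError(f"unknown endpoint {endpoint!r}")
    if not tenant or "/" in tenant:
        raise ValueError(f"invalid tenant {tenant!r}")
    return f"{endpoint}/{tenant}"


def may_access(client_tenant, target_address):
    """A client may only attach to addresses that belong to its own tenant."""
    _, _, tenant = target_address.partition("/")
    return tenant == client_tenant
```

With such a layout, an authorisation check on the address is enough to keep the channels of different tenants separated within a single deployment.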

The Architecture

The main components which build the Eclipse Hono™ architecture are :

  1. Protocol Adapters : these components adapt a device protocol to the first-class protocol used in Hono, AMQP 1.0. Today, MQTT and HTTP REST adapters are provided out of the box but, thanks to the available interfaces, the user can develop a new adapter even for custom protocols;
  2. Hono Server : this is the main component, to which devices can connect directly through AMQP 1.0 or through the protocol adapters. It’s in charge of exposing the APIs in terms of endpoints and of handling the authentication and authorisation of devices;
  3. Qpid Dispatch Router : this is an AMQP 1.0 router, part of the Apache Qpid project, which provides the underlying infrastructure for handling millions of connections from devices in the field. The simplest deployment can use only one router but, in order to guarantee reliability and high volume ingestion, a router network should be used;
  4. ActiveMQ Artemis : this is the broker, mainly used for handling the command and control API, so for storing commands which have to be delivered to devices;

While the devices connect directly to the Hono Server (or through protocol adapters), the backend applications connect to the Qpid Dispatch Router, leveraging direct addressing for receiving telemetry data (if no application is online, no devices are able to send data) or for sending commands (the queues backed by the broker are reachable through the router).

The running environment

All the artifacts from the project are provided as Docker images for each of the above components that can run using Docker Compose (the Docker Swarm support will be available soon) or using a more focused Cloud platform like OpenShift (compatible with Kubernetes deployment as well).

Regarding the OpenShift deployment, the build process of the Hono project provides a bunch of YAML files which describe the objects to deploy, like deployment configurations, services, persistent volumes and related claims. All you need is an accessible OpenShift platform instance on which to deploy these objects in order to have Eclipse Hono™ running in a full featured Cloud environment with all the related scaling capabilities.

[Figure: Eclipse Hono pods running on OpenShift (left) with simulated MQTT and HTTP REST devices (right)]

The example deployment is based on four pods, one for each of the main components : the router pod, the Hono Server pod and one pod for each protocol adapter; of course, if you need the command & control path, the broker pod needs to be deployed as well.

In order to try it, an OpenShift Origin instance can be used on a local PC for deploying the entire Hono solution; for example, the above picture is related to my tests where you can see all the pods running on OpenShift (left side) with simulated devices that interact using MQTT and HTTP REST (on the right side).

The documentation which describes the steps for having such a deployment is available on the official web site here.

So what are you waiting for ? Give it a try !

Conclusion

In my mind every IoT solution should be made of three different layers (a little bit different from the Eclipse vision) : the devices/gateways, the connectivity layer and the service layer.

While the Eclipse Kura project fits in the gateways layer and Eclipse Kapua in the service layer, Eclipse Hono is the glue between them, handling the connections from devices and making their data available to the backend services (and vice versa, in the opposite direction, for command and control). Thanks to the API standardisation and the usage of a standard protocol like AMQP 1.0, Hono can be used for connecting any kind of device with any kind of service; of course, leveraging these three projects has the big advantage of having big companies working on them, mainly Red Hat, Bosch and Eurotech.

Finally, the solution is always the same …. open source and collaboration ! 😉

 

Designing MQTT over AMQP

Let’s think about what we can consider some weaknesses of the MQTT protocol …

  • it provides native publish/subscribe pattern only; having request/reply is quite cumbersome with “correlation injection” inside topics and payload messages;
  • it doesn’t provide flow control;
  • it doesn’t have metadata information inside messages (i.e. content type);
  • it’s fully broker centric;

but there are some unique features that we have to consider as strengths for this protocol as well …

  • retained message, also known as “last known good” message;
  • last will testament (LWT);
  • session handling;

Now, let’s think about the main features provided by AMQP that fill these gaps in the MQTT protocol …

  • native support for both publish/subscribe and request/reply patterns (correlation for free);
  • flow control at different levels (with max window size and message credits);
  • a full type system and metadata information on each message;
  • peer-to-peer model (the broker is just a peer);

but it lacks the above three MQTT features !
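Among the AMQP features listed above, flow control with message credits is probably the least familiar to MQTT developers, so here is a toy Python model of the idea : the receiver grants credits and the sender can deliver only one message per credit. The class and method names are hypothetical; a real AMQP 1.0 implementation handles this at the link level with much more detail.

```python
from collections import deque


class Link:
    """Toy model of credit-based flow control on a single link."""

    def __init__(self):
        self.credit = 0          # credits granted by the receiver
        self.pending = deque()   # messages the sender is holding back
        self.delivered = []      # messages that reached the receiver

    def grant(self, credit):
        """Receiver side: allow the sender to deliver `credit` more messages."""
        self.credit += credit
        self._drain()

    def send(self, message):
        """Sender side: deliver now if credit is available, otherwise wait."""
        self.pending.append(message)
        self._drain()

    def _drain(self):
        while self.credit > 0 and self.pending:
            self.delivered.append(self.pending.popleft())
            self.credit -= 1
```

A slow receiver simply grants fewer credits, so it’s never flooded by a fast sender : this is the kind of protection that MQTT doesn’t provide.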

So how much greater could the AMQP protocol be with such features on top of it ?

Under the open source EnMasse project, I have been working on a design (so a kind of “specification”) for having retained messages, last will testament and session handling over AMQP. At the same time, I have been developing an MQTT “gateway” in order to allow remote MQTT clients to connect to an AMQP based “Message as a Service” like EnMasse.

Having such a design means not only that an MQTT client can receive a retained message after subscribing to a topic, send its LWT on connection or receive the messages published on a topic while it was offline; it means having the above features for native AMQP clients as well.

Each feature is implemented by a well defined AMQP message using a specific subject, annotations and payload in order to carry all the MQTT related information, like the retain flag, will QoS, will topic and so on, but using AMQP semantics.
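As a rough idea of what “a well defined AMQP message” means here, this Python sketch maps the “will” information of an MQTT client onto a message made of subject, annotations and payload. The subject value and the annotation keys are invented for this example and don’t come from the actual design documents; the real names are in the documentation on GitHub.

```python
from dataclasses import dataclass, field


@dataclass
class AmqpMessage:
    """Minimal stand-in for an AMQP 1.0 message (not a real client class)."""
    subject: str
    annotations: dict = field(default_factory=dict)
    body: bytes = b""


def will_message(topic, qos, retain, payload):
    """Carry MQTT 'will' information using AMQP message sections.
    Subject and annotation keys below are hypothetical."""
    if qos not in (0, 1, 2):
        raise ValueError("MQTT QoS must be 0, 1 or 2")
    return AmqpMessage(
        subject="will",
        annotations={
            "x-example-will-topic": topic,
            "x-example-will-qos": qos,
            "x-example-retain": retain,
        },
        body=payload,
    )
```

A native AMQP client could build exactly the same message, which is why these features aren’t limited to clients connected through the MQTT gateway.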

This sort of “specification” doesn’t force a specific implementation; the EnMasse project leverages the Qpid Dispatch Router for connections and Apache Artemis brokers where state is needed (but other implementations could use something different, like a simple file system or a database). Of course, some additional services are needed in order to handle LWT and subscriptions (we just called them “Will Service” and “Subscription Service”).

If you are curious and want to give some feedback on that, you can find all the open source stuff on GitHub in the MQTT over AMQP documentation section.

Feel free to enjoy the related EnMasse implementation ! 😉

Internet of Things : reactive and asynchronous with Vert.x !

I have to admit … before joining Red Hat I didn’t know about the Eclipse Vert.x project, but it took me only a few days to fall in love with it !

For the other developers who don’t know what Vert.x is, the best definition is …

… a toolkit to build distributed and reactive systems on top of the JVM using an asynchronous non blocking development model

The first big thing is developing a reactive system with Vert.x, which means :

  • Responsive : the system responds in an acceptable time;
  • Elastic : the system can scale up and scale down;
  • Resilient : the system is designed to handle failures gracefully;
  • Asynchronous : the interaction with the system is achieved using asynchronous messages;

The other big thing is the asynchronous non blocking development model. It doesn’t mean multi-threading : thanks to non blocking I/O (e.g. for network, file system, …) and a callback system, it’s possible to handle a huge number of events per second using a single thread (aka the “event loop”).
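The event loop idea isn’t specific to Vert.x, so it can be shown with a small analogy in Python’s asyncio (this isn’t Vert.x code and the names are made up for the example). A thousand simulated “devices” are handled concurrently, each one waiting on non blocking I/O, and all of them are served by one single thread.

```python
import asyncio
import threading


async def handle_device(device_id, delay, seen_threads):
    """Simulate handling one device; the sleep stands in for network I/O
    and gives control back to the event loop while waiting."""
    await asyncio.sleep(delay)
    seen_threads.add(threading.get_ident())
    return f"device-{device_id} handled"


async def main(devices=1000):
    seen_threads = set()
    results = await asyncio.gather(
        *(handle_device(i, 0.01, seen_threads) for i in range(devices))
    )
    return results, seen_threads


results, threads = asyncio.run(main())
print(len(results), "events handled on", len(threads), "thread(s)")
```

The whole run takes roughly the time of a single wait, not a thousand of them, because no handler ever blocks the loop. The same rule applies to the Vert.x event loop : never block it.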

You can find a lot of material on the official web site in order to better understand what Vert.x is and all its main features; it’s not my objective to explain it in this very short article that is mostly … you guess … messaging and IoT oriented  🙂

In my opinion, all the above features make Vert.x a great toolkit for building Internet of Things applications where being reactive and asynchronous is a “must” in order to handle millions of connections from devices and all the messages ingested from them.

Vert.x and the Internet of Things

As a toolkit, so made of different components, what are the ones provided by Vert.x and useful to IoT ?

Starting from the Vert.x Core component, there is support for both versions of the HTTP protocol, 1.1 and 2, in order to develop an HTTP server which can expose a RESTful API to the devices. Today, a lot of web and mobile developers prefer to use this protocol for building their IoT solutions, leveraging the deep knowledge they have of HTTP.

Regarding more IoT oriented protocols, there is the Vert.x MQTT server component, which doesn’t provide a full broker but exposes an API that a developer can use to handle incoming connections and messages from remote MQTT clients and then build the business logic on top of it, for example developing a real broker or executing protocol translation (e.g. to/from plain TCP, to/from the Vert.x Event Bus, to/from HTTP, to/from AMQP and so on). The API raises events for the connection request from a remote MQTT client and for all subsequent incoming messages; at the same time, it provides the way to reply to the remote endpoint. The developer doesn’t need to know how MQTT works on the wire in terms of encoding/decoding messages.
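To appreciate what the component hides, consider just one wire-level detail : the “remaining length” field of the MQTT fixed header, which uses a variable length encoding (7 bits of data per byte plus a continuation bit, up to four bytes). This Python sketch implements the scheme as described in the MQTT specification; it’s here only as an example of the encoding/decoding work you don’t have to do yourself, and it isn’t taken from the Vert.x code base.

```python
MAX_REMAINING_LENGTH = 268_435_455  # largest value that fits in four bytes


def encode_remaining_length(length):
    """Encode the MQTT 'remaining length' as 1 to 4 bytes."""
    if not 0 <= length <= MAX_REMAINING_LENGTH:
        raise ValueError("remaining length out of range")
    encoded = bytearray()
    while True:
        digit = length % 128
        length //= 128
        if length > 0:
            digit |= 0x80  # continuation bit: more bytes follow
        encoded.append(digit)
        if length == 0:
            return bytes(encoded)


def decode_remaining_length(data):
    """Decode the field; return (value, number of bytes consumed)."""
    value = 0
    multiplier = 1
    for index, byte in enumerate(data[:4]):
        value += (byte & 0x7F) * multiplier
        if byte & 0x80 == 0:
            return value, index + 1
        multiplier *= 128
    raise ValueError("malformed remaining length")
```

And this is only the packet length : every packet type has its own variable header and payload layout on top of it.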

Related to the AMQP 1.0 protocol, there are the Vert.x Proton and the AMQP bridge components. The first one provides a thin wrapper around the Apache Qpid Proton engine and can be used for interacting with AMQP based messaging systems as a client (sender and receiver) but even for developing a server. The second one provides a bridge between the protocol and the Vert.x Event Bus, which is mostly used for communication between deployed Vert.x verticles. Thanks to this bridge, verticles can interact with AMQP components in a simple way.

Last but not least, there is the Vert.x Kafka client component, which provides access to Apache Kafka for sending and consuming messages from topics and related partitions. A lot of IoT scenarios leverage Apache Kafka in order to have an ingestion system capable of handling millions of messages per second.

Conclusion

The current Vert.x code base provides quite interesting components for developing IoT solutions which are already available in the current 3.3.3 version (see Vert.x Proton and AMQP bridge) and that will be available soon in the future 3.4.0 version (see MQTT server and Kafka client). Of course, you don’t need to wait for their official release because, even if under development, you can already adopt these components and provide your feedback to the community.

This ecosystem will grow in the future and Vert.x will be a leading actor in the IoT applications world based on a microservices architecture !

The power of collaboration and open source

Waiting for the flight coming back home … here in Stuttgart after a great Eclipse IoT Day during the EclipseCon conference in Ludwigsburg … thinking about the power of collaboration and open source …

Eclipse IoT : a big announcement

A few days ago, the Eclipse Foundation announced a collaboration between three big companies for developing the “Internet of Things” open source platform of the future under the Eclipse IoT umbrella : Red Hat, Bosch and Eurotech.

Yesterday, most of the Eclipse IoT Day sessions were focused on the projects that these companies are leading in order to build such a platform; three names that you should remember … Kura, Hono and Kapua.

From devices and gateways in the field … through IoT connectivity at scale … to cloud services for gathering insights from data and controlling devices. You can find a lot of information about these projects on the related official web sites and public repositories, so it’s not my intention to dig into them in this post but just to share my impressions about the power of collaboration and open source.

Let me just share a picture of the vision that Red Hat has about this IoT platform involving a broader ecosystem of open source projects not only from Eclipse Foundation but even from Apache Foundation.

[Figure: the Red Hat vision of the IoT platform]

For sure you have Eclipse IoT projects but even the Apache Qpid Dispatch Router for connectivity and messaging at scale thanks to an AMQP router network, Vert.x based microservices for handling devices connectivity and protocols translation, ActiveMQ Artemis broker for the “store and forward” needs, AMQP – Spark Streaming integration for real time analytics and finally a future Apache Kafka support.

All the best open source projects for a great IoT platform !

From closed source to … open sourcing and collaboration

For a long time I worked in a company with closed source products using … closed source products. In my spare time, I decided to start sharing my knowledge and improving it by developing open source software and … I have to say … my life has changed.

Since then I have given something to the community but I have received ten times what I gave. Thanks to open sourcing what I was doing, my knowledge has increased exponentially because developers all around the world were able to see my code, giving me suggestions and improvements … hence the power of collaboration.

Red Hat : the Open Organization

In the world … there is only one big company based on these pillars … Red Hat !

Today, I’m so proud to be part of such a company, where people can use their creativity and have fun during their work days building open source solutions and collaborating with the best developers you can ever imagine. Thanks to the “remotees”, this potential grows exponentially because talented developers can be all around the world and Red Hat searches for them every day. You can imagine that in such a context … collaboration is the pillar !

Periodically … for meetings or conferences … you have the chance to meet your colleagues in person, discussing the projects you are working on (together) but even …. drinking a beer and sharing experiences … once again … collaboration !

This week was even the “We Are Red Hat Week” (WARHW … a very complex acronym). Every year a lot of activities are organized in the Red Hat offices all around the world for team building and enjoying different experiences other than working. Here in Ludwigsburg we decided to take a picture at least !

[Figure: the “We Are Red Hat Week” picture taken in Ludwigsburg]

A lot of companies should adopt the Red Hat model … open source and collaboration have won !