Author: ppatierno

Apache Kafka consumer groups … don’t use them in the “wrong” way!

In this blog post I’d like to focus on how the “automatic” and “manual” partition assignment can interfere with each other, even breaking things. My advice is to use them the right way, avoiding mixing them in the same scenario, or at least being aware of what you are doing.

The consumer group experience

In Apache Kafka, the consumer group concept is a way of achieving two things:

  • having consumers as part of the same consumer group means providing the “competing consumers” pattern: the messages from topic partitions are spread across the members of the group. Each consumer receives messages from one or more partitions (“automatically” assigned to it) and the same messages won’t be received by the other consumers (assigned to different partitions). In this way we can scale the number of consumers up to the number of partitions (having one consumer reading only one partition); beyond that, a new consumer joining the group will be in an idle state, without being assigned to any partition.
  • having consumers as part of different consumer groups means providing the “publish/subscribe” pattern, where the messages from topic partitions are sent to all the consumers across the different groups. Inside the same consumer group the rules explained above still apply, but across different groups the consumers will receive the same messages. It’s useful when the messages in a topic are of interest to different applications which will process them in different ways; we want all the interested applications to receive all the messages from the topic.

Another great advantage of consumer grouping is the re-balancing feature. When a consumer joins a group, if there are still enough partitions available (i.e. we haven’t reached the limit of one consumer per partition), a re-balancing starts and the partitions are reassigned to the current consumers plus the new one. In the same way, if a consumer leaves a group, the partitions are reassigned to the remaining consumers.

Everything I have described so far applies when using the subscribe() method provided by the KafkaConsumer API. This method forces you to make the consumer part of a consumer group, setting the “group.id” property, because the group is needed for re-balancing. Moreover, it’s not the consumer that decides the partitions it wants to read from; in general, the first consumer which joins the group does the assignment, as the other consumers join the group.
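As a minimal sketch (broker address, topic and group names are made up for the example), this is how a consumer joins the group G1 with subscribe(), watching the re-balancing through a ConsumerRebalanceListener:

    import java.util.Collection;
    import java.util.Collections;
    import java.util.Properties;

    import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class SubscribeExample {

        public static void main(String[] args) {

            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "G1"); // mandatory for subscribe(): the group is needed for re-balancing
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);

            // "automatic" partition assignment: the group decides which partitions this consumer reads
            consumer.subscribe(Collections.singletonList("test"), new ConsumerRebalanceListener() {

                @Override
                public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                    System.out.println("revoked: " + partitions);
                }

                @Override
                public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                    System.out.println("assigned: " + partitions);
                }
            });

            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(100);
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition = %d, offset = %d, value = %s%n",
                            record.partition(), record.offset(), record.value());
                }
            }
        }
    }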

How things can be “broken”

Other than using the subscribe() method, there is another way for a consumer to read from topic partitions: using the assign() method. In this case, the consumer itself specifies the topic partitions it wants to read from.

This type of approach can be useful when you know exactly in which partition some specific messages will be written and you want to read directly from there. Of course, you lose the re-balancing feature in this case, and that is the first big difference from the subscribe way.
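The “manual” counterpart looks like the following sketch (same made-up properties as above, but without the group):

    import java.util.Arrays;
    import java.util.Properties;

    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class AssignExample {

        public static void main(String[] args) {

            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            // no "group.id" here: with assign() it can be left empty (see below)
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);

            // "manual" assignment: the consumer itself picks partitions P0 and P1 of the "test" topic,
            // bypassing the group coordination (so no re-balancing at all)
            consumer.assign(Arrays.asList(
                    new TopicPartition("test", 0),
                    new TopicPartition("test", 1)));
        }
    }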

Another difference is that with “manual” assignment you can avoid specifying a consumer group (i.e. the “group.id” property) for the consumer: it will be just empty. In any case, it’s better to specify it.

Most people use the subscribe way, leveraging the “automatic” assignment and the re-balancing feature; using both ways together can break something … let’s see what.

Imagine having a single “test” topic with only two partitions (P0 and P1) and a consumer C1 which subscribes to the topic as part of the consumer group G1. This consumer will be assigned both partitions, receiving messages from them. Now, let’s start a new consumer C2, configured to be part of the same consumer group G1, but … it uses the assign way to ask for partitions P0 and P1 explicitly.

Now we have broken something! What? Maybe you are asking …

Both consumers C1 and C2 will receive messages from the topic, so from both partitions P0 and P1, but … they are part of the same consumer group G1! So we have “broken” what we said in the previous paragraph about “competing consumers” within the same consumer group. We are experiencing a “publish/subscribe” pattern, but with consumers inside the same consumer group.

What about offset commits?

In general you should avoid a scenario like the one described above, also because it has a real side effect. Starting from version 0.8.2.0, the offsets committed by the consumers aren’t saved in ZooKeeper but in a partitioned and replicated topic named “__consumer_offsets”, hosted on the Kafka brokers in the cluster.

When a consumer commits some offsets (for different partitions), it actually sends a message to the broker on the “__consumer_offsets” topic, and such a message has the following structure:

  • key = [group, topic, partition]
  • value = offset
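In terms of the client API, a commit looks like the following sketch (continuing from the consumers above; the offset value is arbitrary):

    import java.util.Collections;

    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.common.TopicPartition;

    // commit offset 42 for partition P0 of the "test" topic; the broker stores it
    // in "__consumer_offsets" with key = [group, "test", P0] and value = 42
    consumer.commitSync(Collections.singletonMap(
            new TopicPartition("test", 0),
            new OffsetAndMetadata(42)));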

Coming back to the previous scenario, what does it mean?

Having C1 and C2 as part of the same consumer group but both able to receive from the same partitions (P0 and P1), something like the following could happen:

  • C1 commits offset X for partition P0, writing a message like this:
    • key = [G1, “test”, P0], value = X
  • C2 commits offset Y for partition P0, writing a message like this:
    • key = [G1, “test”, P0], value = Y

So consumer C2 has overwritten the committed offset for the same partition P0 of consumer C1, and maybe X was less than Y; if C1 crashes and restarts, it will lose messages, starting to read from Y (remember Y > X).

Something like that can’t happen with consumers which only use the subscribe way of being assigned partitions, because as part of the same consumer group they’ll receive different partitions, so the key of the offset commit message will always be different.

[Update 28/07/2017]

As confirmation that mixing subscribe and assign isn’t a good thing to do, after a discussion with one of my colleagues, Henryk Konsek, it turned out that if you try to call both methods on the same consumer, the client library throws the following exception:

java.lang.IllegalStateException: Subscription to topics, partitions and pattern are mutually exclusive
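A minimal snippet reproducing it (on a single consumer instance, as in the sketches above):

    // mixing both styles on the same consumer instance is rejected by the client library
    consumer.subscribe(Collections.singletonList("test"));
    consumer.assign(Collections.singletonList(new TopicPartition("test", 0)));
    // -> java.lang.IllegalStateException: Subscription to topics, partitions and pattern are mutually exclusive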

Conclusion

The consumer group mechanism in Apache Kafka works really well, and leveraging it for scaling consumers, with “automatic” partition assignment and re-balancing, is a great plus. There are cases where you need to assign partitions “manually”, but in that case pay attention to what could happen if you mix both solutions.

So … let’s consume from Apache Kafka, but with judgment and the awareness of what we are doing.

A lot of fun with … AMQP, Spark, Kafka, EnMasse, MQTT, Vert.x & IoT

When I tell someone that I work for Red Hat, they reply “Ah! Are you working on Linux?” … No, no, no and … no! I’m not a Linux guy, I’m not a fanboy, I’m just a daily user 🙂

Everybody knows that Red Hat is THE company which provides the best enterprise Linux distribution, well known as Red Hat Enterprise Linux (RHEL), but Red Hat is not only Linux today. Its portfolio is huge: the cloud and containers business with the OpenShift effort, the microservices offer with Vert.x, WildFly Swarm and Spring Boot, and the IoT world with the involvement in the main Eclipse Foundation projects.

The objective of this blog post is just to show briefly the projects I have worked (or I’m working) on since I was hired on March 1st last year. They are not “my” projects, they are projects I’m involved in because the entire team is working on them … collaboration, you know 🙂

You could be surprised but … there is no Linux! I’m on the messaging & IoT team, so you will see only projects about this stuff 🙂

AMQP – Apache Spark connector

This “little” component is strictly related to the “big” radanalytics.io project, which brings the power of Apache Spark for analytics (batch, real-time, machine learning, …) to OpenShift.

Because the messaging team works mainly on projects like ActiveMQ Artemis and the Qpid Dispatch Router, where the main protocol is AMQP 1.0, the idea was to develop a connector for Spark Streaming in order to ingest data through this protocol, either from queues/topics on a broker or through the router in a direct messaging fashion.

You can find the component here and even an IoT demo here, which shows how it’s possible to ingest data through AMQP 1.0 using the EnMasse project (see below) and then execute real-time streaming analytics with Spark Streaming, all running on Kubernetes and OpenShift.

AMQP – Apache Kafka bridge

Apache Kafka is one of the best technologies used today for ingesting data (e.g. in IoT related scenarios) with high throughput. Here too, the idea was to provide a way for AMQP 1.0 clients and JMS clients to push messages to Apache Kafka topics without knowing the related custom protocol.

In this way, if you have such clients because you are already using a broker technology, but then you need some specific Kafka features (e.g. re-reading streams), you can just switch the messaging system (from the broker to Kafka) and, using the bridge, you don’t need to update or modify the clients. I showed how this is possible at the Red Hat Summit as well, and the related demo is available here.

MQTT on EnMasse

EnMasse is an open source messaging platform, with a focus on scalability and performance. It can run on your own infrastructure (on premise) or in the cloud, and simplifies the deployment of messaging infrastructure.

It’s based on other open source projects like ActiveMQ Artemis and the Qpid Dispatch Router, supporting the AMQP 1.0 protocol natively.

In order to provide support for the MQTT protocol, we designed how to bring “MQTT over AMQP”, so having MQTT features on top of the AMQP protocol. From the design we moved on to developing two main components:

  • the MQTT gateway, which handles connections with remote MQTT clients, translating all messages from MQTT to AMQP and vice versa;
  • the MQTT LWT (Last Will and Testament) service, which provides a way of notifying all clients connected to EnMasse that another client has died suddenly, sending them its “will message”. The great thing about this service is that it works with pure AMQP 1.0 clients, bringing the LWT feature to AMQP as well: for this reason the team is thinking of renaming it to just the AMQP LWT service.

EnMasse is great for IoT scenarios, handling a huge number of connections and ingesting a lot of data using AMQP and MQTT as protocols. I used it in all my IoT demos to show how it’s possible to integrate it with streaming and analytics frameworks. It’s also the main choice as cloud messaging infrastructure for the Eclipse Hono project.

Vert.x and the IoT components

Vert.x is a great toolkit for developing reactive applications running on a JVM.

The reactive applications manifesto fits really well with IoT scenarios, where responsiveness, resiliency, elasticity and message-driven communication are the pillars of all IoT solutions.

Having started to work on the MQTT gateway for EnMasse using Vert.x, I decided to develop an MQTT server that is just able to handle communication with remote clients, providing an API for interacting with them: this component was used for bridging MQTT to AMQP (in EnMasse) but can be used in any scenario where some sort of protocol translation or integration is needed (e.g. MQTT to the Vert.x Event Bus, to Kafka, …). Pay attention: it’s not a full broker!
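Just to give an idea of the API, here is a minimal sketch of the server side (the handler bodies are made up; a real gateway would translate the received messages to AMQP or route them elsewhere):

    import io.vertx.core.Vertx;
    import io.vertx.mqtt.MqttServer;

    public class MqttServerExample {

        public static void main(String[] args) {

            Vertx vertx = Vertx.vertx();
            MqttServer mqttServer = MqttServer.create(vertx);

            mqttServer
                .endpointHandler(endpoint -> {
                    // a remote MQTT client connected: the application decides what to do with it
                    System.out.println("client connected: " + endpoint.clientIdentifier());

                    // published messages are exposed to the application, which can
                    // translate/route them (e.g. to AMQP, the event bus, Kafka, ...)
                    endpoint.publishHandler(message ->
                        System.out.println("message on topic " + message.topicName()));

                    endpoint.accept(false);
                })
                .listen(1883, ar -> {
                    if (ar.succeeded()) {
                        System.out.println("MQTT server listening on port " + ar.result().actualPort());
                    } else {
                        ar.cause().printStackTrace();
                    }
                });
        }
    }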

The other component is the Apache Kafka client, mainly developed by Julien Viet (lead on Vert.x) and then passed to me as maintainer, improving it and adding new features since the first release.

Finally, thanks to the Google Summer of Code, during the last 2 months I have been mentoring a student who is working on a Vert.x native MQTT client.

As you can see, the Vert.x toolkit is really growing from an IoT perspective, other than providing a lot of components useful for developing pure microservices-based solutions.

Eclipse Hono

Eclipse Hono is a project under the big Eclipse IoT umbrella in the Eclipse Foundation. It provides service interfaces for connecting large numbers of IoT devices to a back end and interacting with them in a uniform way, regardless of the device communication protocol.

It supports scalable and secure ingestion of large volumes of sensor data by means of its Telemetry API. The Command & Control API allows for sending commands (request messages) to devices and receiving a reply to such a command from a device asynchronously, in a reliable way.

This project is mainly developed by Red Hat and Bosch, and I gave my support on designing all the APIs other than implementing the MQTT adapter, in this case too using the Vert.x MQTT server component.

Because Eclipse Hono works on top of a messaging infrastructure for allowing message exchange, the main choice was using ActiveMQ Artemis and the Qpid Dispatch Router, even running them on Kubernetes and OpenShift with EnMasse.

Apache Kafka

Finally, I was involved in developing a PoC named “barnabas” (a messenger character from a Franz Kafka novel :-)) in order to get Apache Kafka running on OpenShift.

Considering the stateful nature of a project like Kafka, I started when Kubernetes didn’t offer the StatefulSets feature yet, doing something similar by myself. Today, the available deployment is based on StatefulSets, and it’s a work in progress on which I’ll continue to work, pushing the PoC to the next level.

Apache Kafka is a really great project which has its own use cases in the messaging world; today it’s even more powerful thanks to the new Streams API, which allows you to execute real-time streaming analytics using topics from your cluster, running simple applications. My next step is to move my EnMasse + Spark demo to an EnMasse + Kafka (and streaming) deployment. I’m also giving my support to the Apache Kafka code.

Conclusion

The variety and heterogeneity of all the above projects gives me a lot of fun in my day-by-day work, collaborating with different people with different knowledge. I like learning new stuff, and the great thing is that … the things to learn are endless! 🙂


How to learn Kubernetes? “Kubernetes in Action” is the answer!

A few months ago I started the Manning Early Access Program (MEAP) for one of the books that I think is the best resource for all the newbies who want to start studying Kubernetes, and for all the experts who want to dig into its details and internals.

It’s “Kubernetes in Action”, written by Marko Luksa, one of my colleagues at Red Hat, a software engineer in the Cloud Enablement Team.

First of all, I can vouch for the author! Marko has a really deep knowledge of Kubernetes and OpenShift and, luckily for us, he decided to write a book about them. He is a very nice person, always available to help you solve any problem you are facing with Kube. I was lucky to start working with Marko when the EnMasse project was born.

Speaking about the book, it’s awesome!

After the first part, introducing Docker and Kubernetes and the first steps with them, the book moves to the core concepts.

You can find all the information about what pods are, how you can deploy containers (so your applications) with them, and finally how you can replicate pods. Then, how to make applications accessible inside and outside the cluster using services, and how to use storage as a persistence layer for your data, shared between pods and always available across restarts. Do you want to know how your application can be configured, even with sensitive data (i.e. certificates and credentials)? You will find such information in this book!

After covering all these core concepts, the last big part aims to give you deeper information about Kubernetes. First of all, the new StatefulSets feature, which allows you to deploy stateful applications with a stable identity and data stored across restarts. Then a really interesting look at Kubernetes internals, covering all the components which make it up: etcd, the API server, the controller manager and all the other stuff. Managing resources is another interesting chapter, describing how you can request specific resources in terms of CPU and memory for the containers, even setting limits on them.

But one of the big advantages of using a container orchestrator like Kubernetes is the possibility of scaling your infrastructure based on the load on your applications. You will find information about auto-scaling on CPU utilization or custom metrics!

The final part is enriched by best practices for developing cloud-native applications which run on Kubernetes. In order to learn a new technology, knowing how it works is the main part, but it’s even more useful having examples and patterns provided by experts like Marko.

Finally, the book ends by explaining how it’s possible to extend Kubernetes, defining new components and custom API objects, showing a powerful feature: its extensibility.

In conclusion, I think this book deserves to be read and to be part of your book collection, even because after reading it you’ll be an expert on two technologies for developing cloud-native applications: not only Kubernetes but OpenShift as well! 🙂

No winner in the (Industrial) IoT protocols war!

Yesterday I read this article declaring MQTT the winner of the IIoT (Industrial IoT) protocols war, and I have a completely different opinion on that … I totally disagree with the author!

Don’t get me wrong, it’s not because I don’t like MQTT (those who know me know that I have done a lot of work around MQTT as well) but just because …

“There is NO winner in the (Industrial) IoT protocols war”

The IoT world is so rich in different use cases, scenarios, feature needs and so on that most of the time the better solution is a “hybrid” one which uses different protocols; even if you focus on the specific IIoT space, that’s true.

IoT has different communication patterns which come from the messaging land, and every protocol supports one or more of them in different ways; sometimes we have built-in support, sometimes we need to do more work at the application level.

MQTT for telemetry? But …

MQTT fits really well for telemetry because it’s mainly based on publish/subscribe, but at the same time it has no flow control: what happens when the broker is overwhelmed by tons of messages at a high rate and can’t dispatch them to the subscribers at the same pace? It’s true that most of the time MQTT devices are tiny sensors which send data at a slow rate (i.e. every second) because they are battery powered and use a mobile connection: they send a message, go to sleep for a few seconds, then wake up to send the new message. In this case you don’t have a high rate, but if you have thousands (millions?) of these devices the broker is overwhelmed anyway: a burst of messages comes in, and it has to handle all of them.

AMQP doesn’t declare any specific supported pattern and fits them all. Regarding telemetry (so publish/subscribe), it provides flow control (even at different levels), so that the receiver node can stop the sender, gaining more time for processing the messages received so far.

Why more complexity for Command & Control?

Moving to command and control, so speaking about a request/reply pattern, all the MQTT limitations come out. In this case you have to build something on top of the protocol infrastructure, defining specific topics for the requests and the related replies, and having each client act both as subscriber (for receiving commands) and publisher (for sending replies). There is no correlation between request and reply; it’s all defined at the topic level (and/or using payload information).

With AMQP, even this pattern is supported natively. The requester can specify a “reply-to” address inside the message, telling the responder that it expects to receive the reply on that address; even the correlation is supported at the protocol level, thanks to message and correlation identifiers.
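As a sketch of the requester side, here is how this looks using the JMS API as exposed over AMQP 1.0 by a client like Qpid JMS (connection setup omitted, destination names made up):

    import javax.jms.JMSException;
    import javax.jms.Message;
    import javax.jms.MessageConsumer;
    import javax.jms.Queue;
    import javax.jms.Session;
    import javax.jms.TemporaryQueue;
    import javax.jms.TextMessage;

    public class RequestReplySketch {

        // requester side: the reply address and the correlation travel at protocol level
        public static void sendCommand(Session session, Queue commandsQueue) throws JMSException {

            // the address where we expect the reply
            TemporaryQueue replyQueue = session.createTemporaryQueue();

            TextMessage request = session.createTextMessage("do-something");
            request.setJMSReplyTo(replyQueue);           // maps to the AMQP "reply-to" field
            request.setJMSCorrelationID("request-1234"); // maps to the AMQP correlation-id

            session.createProducer(commandsQueue).send(request);

            // wait for the reply on the reply-to address
            MessageConsumer replyConsumer = session.createConsumer(replyQueue);
            Message reply = replyConsumer.receive(5000);
            System.out.println("reply: " + ((TextMessage) reply).getText());
        }
    }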

The real feature which makes the difference between AMQP and MQTT is that the former has message metadata (header, annotations and so on) while the latter has just the payload (raw bytes), so all the features MQTT lacks for providing a pattern different from publish/subscribe need to be defined in terms of topic architecture and/or payload structure … the complexity is moved to the application level.

If you want to read more about these differences (even compared with the HTTP protocol), you may find my article “Strengths and weaknesses of IoT communication patterns” on DZone IoT a useful reference (it’s part of the latest DZone IoT Guide).

Let’s say things as they are

The mentioned article also says some wrong things.

“AMQP offers robust features like queuing” … to be precise, there is no mention of queues in the AMQP specification, but of containers, nodes, links and so on. This is because AMQP doesn’t specify the network architecture in terms of brokers: pay attention here, I’m speaking about AMQP 1.0 … the only OASIS and ISO/IEC standard (as opposed to AMQP 0.9, used in RabbitMQ). AMQP can be used for RPC without the “store and forward” mechanism (provided by brokers), just with “direct” messaging; AMQP is a peer-to-peer protocol!

About MQTT … “An example of this optimization is its use of 1 byte keep alive packets.” … not true! It’s 2 bytes! … I know, I’m a little bit pedantic here 🙂

Finally, it’s not true that only MQTT can work without high availability and with low bandwidth. That’s true for AMQP as well, considering the QoS (Quality of Service) levels it supports.

Speaking about message size and computational needs on the device side …

With MQTT, each message carries the topic information; that’s not true with AMQP, where the address is specified once, on attaching the link.

When security and encryption come in, the SSL/TLS overhead minimizes all these differences, so that even a 2-byte keep alive packet becomes a much bigger message. In this case it all depends on the computational resources you have on your tiny device, and the difference between protocols doesn’t matter.

Conclusion

So my conclusion is clear, and I have already stated it at the top of this article: there is no winner in the (Industrial) IoT protocols war. There are different use cases, scenarios, feature needs, limitations … they all drive the right choice, which sometimes means having multiple winners in one solution!

The good news about MQTT is that the latest v5 specification addresses a lot of the limitations of the current 3.1.1 version, adding some AMQP-like features 🙂

So stay tuned … the war is endless!


Today’s meetup … “Open sourcing the IoT: running EnMasse on Kubernetes”

Yes … I’m at the airport waiting for my flight back home, and I’d like to write something about the reason for my trip … as usual.


Today I had a meetup in Milan, hosted in the Microsoft office and organized by my friend Felice Pescatore, who leads the AgileIoT project; of course my session was about messaging and IoT … so no news there. The title? “Open sourcing the IoT: running EnMasse on Kubernetes”.

Other friends were there with their sessions: Felice himself, Valter Minute speaking about moving from an IoT prototype to a product, and Clemente Giorio and Matteo Valoriani with very interesting sessions about real HoloLens scenarios.

I started with an introduction about messaging and how it relates to the IoT, then moved to the EnMasse project, an open source “messaging as a service” platform that is well suited to being the messaging infrastructure of an IoT solution (for example, it’s applicable inside the Eclipse Hono project).

I showed the main EnMasse features, the new ones coming in the next weeks, and how EnMasse provides a messaging and IoT solution from an “on premise” deployment to the “cloud”, in a Kubernetes or OpenShift cluster. For this reason I said “open sourcing the IoT”: all the components in such a solution are open source!


To show that, I ran a demo with a Kubernetes cluster on Azure Container Service, deploying EnMasse and Apache Spark on it. The demo was made of an AMQP publisher sending simulated temperature values to a “temperature” address deployed in EnMasse (as a queue), and a Spark Streaming job reading such values in order to process them in real time, getting the max value over the latest 5 seconds and writing the result to the “max” address (another queue); finally, an AMQP receiver was running in order to read and show such values from “max”.

If you want to know more about that, you can find the following resources:

Yesterday’s DevDay meetup: “messaging” in Naples!


Yesterday evening I gave the session titled “Messaging as a Service: building a scalable messaging service” during a meetup here in Naples, speaking about the EnMasse project. The event was organized by the DevDay community, which is active in my region in order to get in touch with developers who work with different technologies. I was very pleased to share my experience (as a contributor) on developing a messaging service running “on premise” or in the cloud.


Below you can find the resources for this session:

  • the video published on the official DevDay YouTube channel
  • the slides and the demo code

Last but not least, I’d like to thank Davide Cerbo (from DevDay), who invited me to join the co-working space as a guest during the day and set up this meetup in the best way. Davide … keep up the great work for the next events! 😉

Let’s talk about EnMasse: the open source “Messaging as a Service”

After the Red Hat Summit session about JBoss AMQ and Apache Kafka using the EnMasse project, the coming weeks will be rich in sessions about this “Messaging as a Service” platform.

First of all, I’ll have a meetup on May 22nd in Naples, organized by the DevDay community. It will be all about messaging (and no, I’m not going to speak about WhatsApp, Hangouts, … :-)) and how we are developing a “Messaging as a Service” solution running on Kubernetes and OpenShift: its name is EnMasse.


The other session will be on June 5th in Milan, during an IoT meetup organized by the AgileIoT community in the Microsoft House. There I’ll speak about EnMasse again and how it “democratizes” the IoT, giving you a full open source solution: in this case I’ll show how this “Messaging as a Service” platform can run in the Azure cloud as well.


So … if you want to know more about EnMasse, just pick one of these events … or both! 🙂


The “impact of an individual”

Monday, May 1st, 2017 … the alarm is ringing … it’s 3:00 AM … the time has arrived … let’s wake up … the Red Hat Summit is finally about to start!

After submitting my proposal and having it accepted for a session with Christian Posta about JBoss AMQ and Apache Kafka … a few months have passed, and one of the most thrilling experiences of my work life is just less than 7000 km away from me.

It’s 7:00 AM … the plane takes off and my mind starts to think about last year when, as a new hire, I was reading a lot of emails related to the Red Hat Summit materials and demo preparation from other “veteran” employees … I couldn’t imagine that after just one year I would be one of the guys at the next summit!

Naples … Frankfurt … and finally Boston. I start to breathe the summit just outside the airport, with a lot of Red Hat advertising boards; the company has invested a lot of money and time to give the best experience to employees, partners and customers, letting them engage each other, find ways to collaborate and do business, but starting from something completely free … the freedom to think, to develop and to share ideas and projects with the community.


Yes! … because this is Red Hat … no NDAs … no restrictions … no limits … it’s all up to you … you are completely free … free to propose, to design, to collaborate and … finally to have an “impact as an individual” (cit.)


Yes! … every person as an individual has his freedom, and more individuals make a community developing ideas … becoming the biggest “company” in the world, with millions of developers. This is the way to innovation; no other way is the way to go. In such a model, every single person, with a different experience and a different background, can contribute to any open source project, improving and enriching it. Every single person has an “impact as an individual”, but to do that he needs to be in the right context, and Red Hat is the leading company in doing that.


After landing at Logan airport I set my compass to the Boston Convention & Exhibition Center with my new friend Bolek; I had never seen him before, and it’s the first time we meet in person, since he doesn’t come from my messaging team but from the Keycloak one. This is one of the great things about this kind of big conference: you meet a lot of colleagues, customers, partners or community members from all around the world, sometimes for the first time, even if you chat with them almost every day. During this summit I’ll be in a shared accommodation with two other guys from the Keycloak team, Sebastien and Bruno, other than one of the “gurus” of my team, Ted.

The venue is huge, people are coming for check-in or onsite registration, and the staff is building the partners pavilion; it must be ready for the coming day … and it will be!


Let the summit begin! A lot of speakers, from Red Hat and other companies, are bringing their knowledge and experience here with a common denominator … sharing what they are doing in the “open” way. Two keynotes every day, each one packed; all the rooms for the breakout sessions will be packed as well.

Announcements on JBoss AMQ 7, OpenShift, RHEL, microservices and all the other projects and related products make the attendees enthusiastic, because there are a lot of things to do. New projects will come, so let’s “start something” (cit.) in the right way … “try, learn & modify” (cit.).


New announcements with “old” partners like Amazon, consolidation with other partners like Microsoft, and experiences from big customers like Deutsche Bank. They trust us … they trust the way Red Hat is the only company able to make an open source project reliable and usable at the enterprise level.

Attending the David Ingham and Ted Ross sessions around JBoss AMQ 7 makes me so proud to be part of this team. They have been working so hard in the last years to bring customers a new, powerful messaging experience: the new broker, the new router component and the new clients. If you need to do messaging in your business there is no choice: from hybrid to the cloud, AMQ 7 is the answer. And let me say that it will even be the pillar of the coming “baby” that is on its way: trust me … “incredible” guys are working on it; EnMasse is its name … the “messaging as a service” platform of the future.


The IoT business is something that a lot of customers are exploring too, and the “IoT Codestarter” evening event, organized together with the Eclipse Foundation and Eurotech (our partner), was a great opportunity for hacking from the field, with sensors and gateways, to the cloud: Kura and Kapua are the involved projects, but I can say … pay attention to Hono as well (and not only because I’m working on it ;)).

On my side, on the last day, when it seems that all the people are tired and just want to go back home, the room is packed. Why use JBoss AMQ 7? When to use Apache Kafka? Can I use them together? Christian and I give the attendees the answers to these questions. Even in this case, a lot of interest … “playing” around Kafka is on our radar (while I’m writing, this session has raised a lot of discussion on Twitter as well).


Last but not least, during these days I have been falling in love with Boston day by day; it seems to be ten or more cities in one. Walking through the city and along the river is a great experience.


Three days have passed; all the Red Hatters and the other attendees are leaving Boston with a new freshness and the certainty that more great things are coming. What does it mean? The countdown to the next summit has already started. San Francisco will be waiting for us; let’s see how every person will have his “impact as an individual” during this year.

This is open source, this is Red Hat, this is the summit … something that you can breathe every day, which makes you part of a big community where the power, as I like to say, is always in the “collaboration”!

IoT developer survey: my 2 cents one year later …

As last year, I have decided to write a blog post about my point of view on the IoT developer survey from the Eclipse Foundation (IoT Working Group), together with IEEE, Agile IoT and the IoT Council.

From my point of view, the final report always gives interesting insights into where the IoT business is going, and Ian Skerrett (Vice President of Marketing at the Eclipse Foundation) has already analyzed the results, available here, writing a great blog post.

I just want to add 2 more cents on that …

Industry adoption …

It’s clear that industries are adopting IoT, and there is a big increase for industrial automation, smart cities, energy management, building automation, transportation, healthcare and so on. IoT is becoming “real” even if, as we will see in the next paragraphs, it seems that we are still at a prototyping stage. A lot of companies are investing in it, but few of them have real solutions running in the field. Finally, from my point of view, it would be great to add more information about countries, because I think there is a big difference in how and where each country is investing in IoT.

The concerns …

Security is always the big concern but, as Ian said, interoperability and connectivity are on a downward trend; I agree with him that all the available middleware solutions and IoT connectivity platforms are solving these problems. The great news is that all of them support different open and standard protocols (MQTT, AMQP, but even HTTP), which is the way to go for interoperability; at the same time we are able to connect a lot of different devices supporting different protocols, so the connectivity problem is addressed as well.

Coming back to security, the survey shows that many more software developers are involved in building IoT solutions, and the technologies they mostly use are SSL/TLS and data encryption, so at the software level. From my point of view, some security concerns should be addressed at the hardware level (using crypto chips, TPMs and so on), but this is an area where software developers lack knowledge. It’s not a surprise, because we know that IoT needs a lot of different knowledge from different people, but the survey shows that in some cases the “right” people are not involved in developing the IoT solution: too many web and mobile developers are working on it, too few embedded developers with real hardware knowledge.

Languages: finally a distinction!

Last year, in my 2 cents, I asked for a distinction regarding which side of an IoT solution the most used programming languages refer to. I’m happy to see that the Eclipse Foundation took this suggestion, so this year the survey asked about the languages used on constrained devices, on gateways and in the cloud.


The results don’t surprise me: C is the most used language on “real” constrained devices, and all the other languages, from Java to Python, are mostly used on gateways; JavaScript fits in the cloud, mainly with Node.js. In any case, Node.js is not a language, so my idea is that providing only JavaScript as a possible answer would have been enough, even because, other than using a server-side framework like Node.js, the other possibility is using JavaScript in “function as a service” platforms (i.e. Lambda from AWS, Azure Functions and so on), which are mostly based on Node.js. Of course, the most used language in the cloud is Java.

What about the OS?

Linux is the most used OS for both constrained devices and IoT gateways but … here a strange thing comes to my mind. On “real” constrained devices, based on MCUs (i.e. Cortex-M), you can run only a few specific Linux distros (i.e. uClinux), not a full Linux distro, so it’s strange that Linux wins on constrained devices; yet when the survey shows which distros are used, uClinux has a very low percentage. My guess is that a lot of software developers don’t know what a constrained device is 🙂

On constrained devices I would expect developers to use “no OS” (programming on bare metal) or a really tiny RTOS, but not something close to Linux.

On gateways I totally agree with Linux, but Windows has been growing since last year.

Regarding the most used distros, the Raspbian victory shows that we are still at a prototyping stage. I can’t believe that developers are using Raspbian, and so the related Raspberry Pi hardware, in production! If it’s true … I’m scared! If you know which planes, trains or building automation systems are using something like that, please tell me … I have to avoid them 🙂

Regarding the protocols …

From my point of view, the presence of TCP/IP in the connectivity protocols results is misleading. TCP/IP is a protocol suite used on top of Ethernet and Wi-Fi, which appear in the same results, so we can’t compare them.

Regarding communication protocols, the current know-how is still leading; this is the reason why HTTP 1.1 is still on top and HTTP 2.0 is growing. MQTT is there, followed by CoAP, which surprises me considering the necessity of having an HTTP proxy for exporting local traffic outside of a local device network. AMQP is finding its own way, and I think that in the medium/long term it will become a big player there.

Cloud services

In this area we should have a distinction, because the question is pretty general, but we know that you can use Amazon AWS or Microsoft Azure for IoT in two ways:

  • as IaaS, hosting your own IoT solution or an open source one (i.e. just using the provided virtual machines for running an IoT software stack)
  • as PaaS, using the managed IoT platforms (i.e. AWS IoT, Azure IoT Hub, …)

Having Amazon AWS on top doesn’t surprise me, but we could have more details on how it is used by IoT developers.

Conclusion

The IoT business is growing, and its adoption as well, but looking at these survey results, most of the companies are still at a prototyping stage and few of them have a real IoT solution in the field.

It means that there is a lot of room for everyone to be invited to the party! 😀


Being a “remotee” at Red Hat … one year later

When I started working at Red Hat last year (on March 1st), all my friends and relatives asked me a lot of questions about my new job … from home!

“What are your working hours?”, “How does your manager verify that you are working?”, “How do you share artifacts with your colleagues?”, “What’s your daily life like without getting in touch with your colleagues?” …

I understand that for people who don’t know what being a “remotee” means, it’s quite difficult to understand this way of spending your time at home but … working.

Other people, just kidding, say “You are at home, you can do whatever you want” … but it’s absolutely not true! It’s exactly the opposite!

I can say that every day I chat with my colleagues, in writing or speaking, depending on the stuff we have to discuss. Once a week I have a video call for syncing on the work each team member has done during the last week. Sometimes we have the chance to meet in person at conferences or on business travel (it was great for me being in Boston in January for the F2F meeting of the whole “messaging” team).

Even if I don’t meet my colleagues in person every day, I have started to develop feelings of friendship towards them. I think that if we lived in the same place, I would have a lot of activities with them outside of the office. Of course, it’s a matter of how people are … and I was very lucky with my colleagues. By the way, having such a feeling means that you are comfortable in the team and with your “remote” colleagues.

We are an open source company, so all my artifacts are available online (on GitHub), but in any case we also share documents regarding the stuff we are developing, in order to have a place for getting feedback and comments.

After one year, these are the main points I’d like to share with you:

  • You have to be the manager of yourself, and it’s not always simple.
  • You need a separation between being at “home” and being at the “home office”.
  • You have flexibility in your working hours, but my preference is to work a full day as if I were in a “real” office.
  • You have a manager who trusts you … in a dispersed team, trust is one of the main aspects.
  • You need to be an employee who is passionate about the job … you have to love it … and I’m lucky on that! 🙂

Quite often, working from home means working more, which is exactly the opposite of what a lot of people think. You are right there at the “office” (just a few seconds from the bed), you are right there at “home” (just a few steps from your desk) … but if you are passionate about what you do … it’s not a problem … at least for me 🙂

Of course, there are some perks to being a “remotee”:

  • I have more free time in the early morning and late afternoon because I avoid wasting my time in traffic jams! Now I’m a runner who starts his day at 6:00 AM with a workout, and even a father who can play with his children just “one minute” after ending the workday.
  • I have time to take my son to school and pick him up.
  • Last year I had a daughter, and since July I have been seeing her every hour of every day, watching how she grows.
  • When I have a break during the day, I can speak with my wife or play a little bit with my children.
  • There are few distractions, because when you are at your desk … you are alone 🙂 You can concentrate more on the problem you are trying to solve.
  • Two additional perks are … having a good Neapolitan coffee at lunch and watching a “The Big Bang Theory” episode after lunch 🙂

I think that in such a working environment you need two main things: being passionate and being professional.

This short post came to my mind after reading a blog post series written by a Red Hatter over the last few days, explaining how it’s possible to work in a “dispersed” team; I think you should read it to better understand how well “dispersed” teams work here at Red Hat.

So with this … I hope I have answered all the people and their questions! 🙂

And now … now I’m ready … ready for the Red Hat Summit, where I’ll meet in person some of my colleagues and other Red Hatters from all around the world!