Kafka is a distributed publish-subscribe messaging system. Processes called producers publish messages to categories called topics, and processes called consumers subscribe to topics and receive the stream of published messages. Messages are distributed across partitions in a fault-tolerant way, and Kafka runs as a cluster of servers called brokers that together form a scalable and durable messaging backbone.
How to use Kafka for storing intermediate data and as a pub/sub model, with a close look at producer, consumer, and topic configuration and at Kafka's internal workings.
KAFKA Quickstart
KAFKA
Introduction:
Kafka is a distributed publish-subscribe messaging system that is designed to be fast, scalable, and durable.
Kafka maintains feeds of messages in categories called topics.
Kafka messages are generated by processes called producers.
The processes that subscribe to topics and process the feed of published messages are called consumers.
Kafka is run as a cluster comprised of one or more servers, each of which is called a broker.
Quick Start:
1. Create a topic
/usr/bin/kafka-topics --create --zookeeper zookeeperIP:2181 --replication-factor 1 --partitions 1 --topic testTopic
2. Publish a message via the producer to a topic
/usr/bin/kafka-console-producer --broker-list producerIP:9092 --topic testTopic
This is first kafka message
3. Start a consumer
/usr/bin/kafka-console-consumer --zookeeper zookeeperIP:2181 --topic testTopic --from-beginning
If you start several consumers in different PuTTY sessions, you can see messages being delivered to all the consumers as soon as the producer publishes them.
A bit more detail:
A topic is a category or feed name to which messages are published. For each topic, the Kafka cluster maintains a partitioned log. (The original slide shows a diagram of a topic split into partitions here.)
Each partition is an ordered, immutable sequence of messages that is continually appended to, i.e. a commit log. The messages in the partitions are each assigned a sequential id number called the offset that uniquely identifies each message within the partition.
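The partitioned-log model above can be sketched as a toy data structure. This is illustrative only; a real Kafka partition is a set of segment files persisted on disk, not an in-memory list:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a single partition: an ordered, immutable, append-only
// sequence of messages, each identified by its offset.
public class PartitionLog {

    private final List<String> messages = new ArrayList<>();

    // Appending assigns the next sequential offset and returns it.
    public long append(String message) {
        messages.add(message);
        return messages.size() - 1;
    }

    // A consumer reads from a chosen offset onward; earlier messages
    // are never mutated or removed here.
    public List<String> readFrom(long offset) {
        return messages.subList((int) offset, messages.size());
    }

    public static void main(String[] args) {
        PartitionLog log = new PartitionLog();
        log.append("event-a");
        log.append("event-b");
        long offset = log.append("event-c");
        System.out.println(offset);              // 2
        System.out.println(log.readFrom(1));     // [event-b, event-c]
    }
}
```

This captures the two guarantees the slide describes: offsets are sequential within a partition, and reading never alters the log.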
The Kafka cluster retains all published messages, whether or not they have been consumed, for a configurable period of time. Log retention can be set in two ways: time-based retention or size-based retention. Kafka's performance is effectively constant with respect to data size, so retaining lots of data is not a problem.
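Both retention modes are configured on the broker in server.properties (and can also be overridden per topic). The property names below are standard Kafka broker settings:

```properties
# Time-based retention: delete log segments older than 7 days (the default)
log.retention.hours=168

# Size-based retention: keep at most ~1 GB of log per partition
log.retention.bytes=1073741824
```

Whichever limit is hit first triggers deletion of the oldest segments.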
QnA:
1. What type of messages can be sent from a producer?
The producer class takes two generic parameters, i.e.
Producer<K, V>
V: type of the message
K: type of the optional key associated with the message
So any kind of message can be sent, for example String, JSON, or Avro.
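The optional key K is what ties a message to a partition: the producer hashes the key modulo the partition count, so all messages with the same key land in the same partition and are ordered relative to each other. A conceptual sketch, not Kafka's actual hash function:

```java
// Conceptual sketch of key-based partition assignment.
public class KeyPartitioner {

    public static int partitionFor(String key, int numPartitions) {
        if (key == null) {
            // Real producers scatter keyless messages across partitions;
            // we just pin them to partition 0 for simplicity.
            return 0;
        }
        return Math.abs(key.hashCode() % numPartitions);
    }

    public static void main(String[] args) {
        // The same key always maps to the same partition, which is what
        // gives Kafka its per-key ordering guarantee.
        System.out.println(partitionFor("truck-2", 4));
        System.out.println(partitionFor("truck-2", 4));
    }
}
```

Choosing a key such as a truck id (as in the producer example below) therefore guarantees that all events for one truck arrive in order.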
2. How can a consumer start reading from a particular offset?
Kafka does not keep track of the offset up to which a particular consumer has already read. The consumer has to take care of the offset on its side: the information about the offset up to which it has consumed messages has to be stored elsewhere, e.g. in HDFS, a database, or HBase.
Out of the box, Kafka only provides two starting positions: from the beginning or from the latest message.
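The offset bookkeeping described above can be sketched like this, with a HashMap standing in for the external store (HDFS, HBase, a database):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of consumer-side offset checkpointing: the consumer commits
// its next offset to an external store and resumes from it on restart.
public class OffsetCheckpoint {

    private final Map<String, Long> store = new HashMap<>();

    // Record that everything before nextOffset has been processed.
    public void commit(String topicPartition, long nextOffset) {
        store.put(topicPartition, nextOffset);
    }

    // Resume from the last committed offset, or the beginning (offset 0)
    // if this partition has never been read.
    public long resumeFrom(String topicPartition) {
        return store.getOrDefault(topicPartition, 0L);
    }

    public static void main(String[] args) {
        OffsetCheckpoint checkpoint = new OffsetCheckpoint();
        System.out.println(checkpoint.resumeFrom("testTopic-0")); // 0
        checkpoint.commit("testTopic-0", 2L); // processed offsets 0 and 1
        System.out.println(checkpoint.resumeFrom("testTopic-0")); // 2
    }
}
```

The fetch loop in the consumer example later in this document updates readOffset in exactly this spirit, one message at a time.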
3. So when to use Kafka?
Cloudera recommends using Kafka if the data will be consumed by multiple applications.
API Examples:
A sample Producer
import java.sql.Timestamp;
import java.util.Date;
import java.util.Properties;
import java.util.Random;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

Properties props = new Properties();
props.put("metadata.broker.list", args[0]);
props.put("zk.connect", args[1]);
props.put("serializer.class", "kafka.serializer.StringEncoder");
props.put("request.required.acks", "1");

String TOPIC = "event";
ProducerConfig config = new ProducerConfig(props);
Producer<String, String> producer = new Producer<String, String>(config);

String[] events = {"Normal", "Normal", "Normal"}; // further event types elided in the original
String[] truckIds = {"1", "2", "3", "4"};
String[] driverIds = {"11", "12", "13", "14"};
Random random = new Random();
String message = new Timestamp(new Date().getTime()) + "|"
    + truckIds[2] + "|" + driverIds[2] + "|" + events[random.nextInt(events.length)];

try {
    KeyedMessage<String, String> data = new KeyedMessage<String, String>(TOPIC, message);
    producer.send(data);
    Thread.sleep(1000);
} catch (Exception e) {
    e.printStackTrace();
}
A sample Consumer
Kafka provides a SimpleConsumer which can be modified as per requirement.
Steps for using a SimpleConsumer:
Find an active broker and find out which broker is the leader for your topic and partition
Determine who the replica brokers are for your topic and partition
Build the request defining what data you are interested in
Fetch the data
Identify and recover from leader changes
Data fetch pseudo code

FetchRequest req = new FetchRequestBuilder()
    .clientId(clientName)
    .addFetch(a_topic, a_partition, readOffset, 100000)
    .build();
FetchResponse fetchResponse = consumer.fetch(req);
if (fetchResponse.hasError()) {
    // Error handling code here
}
for (MessageAndOffset messageAndOffset : fetchResponse.messageSet(a_topic, a_partition)) {
    long currentOffset = messageAndOffset.offset();
    if (currentOffset < readOffset) {
        // Proper logger here
        continue;
    }
    readOffset = messageAndOffset.nextOffset();
    ByteBuffer payload = messageAndOffset.message().payload();
    byte[] bytes = new byte[payload.limit()];
    payload.get(bytes);
    System.out.println(String.valueOf(messageAndOffset.offset()) + ": " + new String(bytes, "UTF-8"));
    numRead++;
    a_maxReads--;
}
Conclusion:
As you can see, Kafka has a unique design that makes it very useful for solving a wide range of architectural challenges. It is important to make sure you use the right approach for your use case, and use it correctly, to ensure high throughput, low latency, high availability, and no loss of data.