Kafka and Python

17/05/2016

Python Madrid · Python and Kafka

Who am I?
Software Engineer
@Paradigma Digital
@lvaroleon
aleonsan

Kafka Origin

Kafka What is it?
“If you think of Hadoop as long-term memory, the question then is how you get the memories in there to begin with. Apache Kafka is like the central nervous system, which collects all of these messages from the underlying systems and transmits them into the memory vault, or storage.”
- Eric Vishria

Kafka Motivation
To be able to act as a unified platform for handling all the real-time data feeds a large company might have.
[Diagram: data feeds flowing into Kafka]
● Event tracking
● Application logs
● Application messages
● Application monitoring data

Kafka How?
● Distributed (the core of the design)
● Scalable
● Efficient
● Durable, fault tolerant

Kafka Basics
[Diagram: producers (P) publish to the Kafka cluster; consumers (C) read from it]
● Producers
● Brokers
● Consumers

Kafka Cluster: Topics & Partitions
● Topics
● Partitions
● Message
[Diagram: Kafka cluster, messages numbered within each topic partition]
T1P0: 1 2 3 4 5
T1P1: 1 2 3 4
T2P0: 1 2 3
T2P1: 1 2 3
Kafka Partitions & Replication
● Replication factor
○ Leader
○ Followers
● ISR (in-sync replicas)
○ In-sync policies
[Diagram: Kafka cluster with the same partition (messages 1 2 3 4 5) replicated on Broker 1 and Broker 2, each replica labeled PM]
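
A rough sketch of the leader, follower, and ISR idea, in plain Python. The function names and the message-count lag policy are assumptions made for illustration; real brokers apply their own configurable in-sync policies.

```python
def in_sync_replicas(leader_id, replicated, max_lag):
    """Return broker ids whose log is within max_lag messages of the leader.

    replicated: dict of broker id -> number of messages in its replica.
    The lag threshold is an assumed, simplified in-sync policy.
    """
    leader_count = replicated[leader_id]
    return sorted(
        broker_id
        for broker_id, count in replicated.items()
        if leader_count - count <= max_lag
    )


def committed_count(replicated, isr):
    """Messages present on every in-sync replica are treated as committed."""
    return min(replicated[broker_id] for broker_id in isr)


replicated = {1: 5, 2: 5, 3: 2}  # broker 3 has fallen behind
isr = in_sync_replicas(leader_id=1, replicated=replicated, max_lag=1)
committed = committed_count(replicated, isr)
```

Here broker 3 drops out of the ISR, so it no longer holds back the committed count.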

Kafka Producers
● Publish Messages
● Choose partitions
○ policies
● Producer configuration
○ ACKs
○ Retries
○ Batch size
○ ...
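
As an illustration of the producer settings listed above, here is a minimal sketch using kafka-python's `KafkaProducer` keyword arguments `acks`, `retries`, and `batch_size`. The helper names, the broker address, and the chosen values are assumptions, not recommendations.

```python
def producer_config(bootstrap_servers, durable=True):
    """Build keyword arguments for kafka.KafkaProducer (illustrative values)."""
    return {
        "bootstrap_servers": bootstrap_servers,
        # "all" waits for the full in-sync replica set; 1 waits for the leader only.
        "acks": "all" if durable else 1,
        # Number of times a failed send is retried.
        "retries": 3 if durable else 0,
        # Upper bound, in bytes, for a batch of records per partition.
        "batch_size": 16384,
    }


def make_producer(bootstrap_servers, durable=True):
    """Create the producer; needs kafka-python installed and a reachable broker."""
    from kafka import KafkaProducer  # imported lazily on purpose

    return KafkaProducer(**producer_config(bootstrap_servers, durable))


config = producer_config("localhost:9092")  # hypothetical broker address
```

Stronger acknowledgement and retry settings generally trade latency for durability.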

Kafka Consumers
● “Subscribe” to a feed
● Consumer groups
○ Queue
○ Publish-subscribe
● Order guarantees
[Diagram: Kafka cluster (Broker 1, Broker 2) holding Partition 0 and Partition 1, read by consumers (C) organized in groups]
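
The queue and publish-subscribe behaviours of consumer groups can be sketched without a broker. The `assign` helper and the round-robin strategy are assumptions for illustration; in a real cluster the assignment is negotiated by the clients and the brokers.

```python
def assign(partitions, consumers):
    """Spread partitions round-robin over the consumers of ONE group."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = ["P0", "P1"]

# Queue semantics: consumers in the same group split the partitions.
group_a = assign(partitions, ["a1", "a2"])

# Publish-subscribe: another group independently receives every partition.
group_b = assign(partitions, ["b1"])

# More consumers than partitions: the extra consumer gets nothing.
group_c = assign(partitions, ["c1", "c2", "c3"])
```

Because each partition has a single owner within a group, ordering is preserved per partition rather than across the whole topic.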

Kafka Efficiency
● Small I/O problem
○ Message sets
● Message set compression
○ policies
● Standard binary message format
○ Transfer without modifications
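
To illustrate why message sets help with the small I/O problem and with compression, the sketch below compares compressing many small messages one by one against compressing them as a single set. `zlib` and the 4-byte length prefix are stand-ins chosen for this example; they are not Kafka's wire format or codecs.

```python
import zlib

# 100 small, similar messages, as an event stream might produce.
messages = [b'{"event": "page_view", "user": %d}' % i for i in range(100)]

# Compressing each message alone: every payload pays the codec overhead.
individually = sum(len(zlib.compress(m)) for m in messages)

# One message set: length-prefix each message, concatenate, compress once.
message_set = b"".join(len(m).to_bytes(4, "big") + m for m in messages)
batched = len(zlib.compress(message_set))

# The compressed set decompresses back to exactly the same bytes.
round_trip = zlib.decompress(zlib.compress(message_set)) == message_set
```

The batched form is smaller because the compressor can exploit redundancy across messages, not only within one.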

Kafka Python Clients
● kafka-python
○ Kafka 0.8+, 0.9 recommended
○ Python 2.6+
○ Python 3.3+
https://github.com/dpkp/kafka-python
● PyKafka
○ Kafka 0.8.2+
○ Python 2.7+
○ Python 3.4+
https://github.com/Parsely/pykafka

Kafka Python Clients kafka-python
● Producer
class kafka.KafkaProducer:
    def __init__(self, **configs)
    def send(self, topic, value=None, key=None, partition=None)
    def flush(self, timeout=None)
● class RecordAccumulator
● class Partitioner
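
A minimal usage sketch of the producer interface above. The topic name, broker address, and the `encode_event` and `publish` helpers are assumptions for illustration; `publish` needs kafka-python installed and a running broker.

```python
import json


def encode_event(event):
    """Serialize a dict to UTF-8 JSON bytes to use as the message value."""
    return json.dumps(event, sort_keys=True).encode("utf-8")


def publish(events, topic="demo-topic", servers="localhost:9092"):
    """Send every event, then block until all of them have been delivered."""
    from kafka import KafkaProducer  # imported lazily on purpose

    producer = KafkaProducer(bootstrap_servers=servers)
    for event in events:
        # With the default partitioner, equal keys map to the same partition.
        producer.send(
            topic,
            value=encode_event(event),
            key=event["user"].encode("utf-8"),
        )
    producer.flush()  # send() is asynchronous; flush() waits for delivery
    producer.close()


payload = encode_event({"user": "alice", "action": "login"})
```

Calling `publish([{"user": "alice", "action": "login"}])` would send one keyed message to the assumed topic.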

● Consumer
○ message iterator
class kafka.KafkaConsumer(six.Iterator):
    def __init__(self, *topics, **configs)
    def __next__(self)
    def subscribe(self, topics=(), pattern=None, listener=None)
    def unsubscribe(self)
    def assign(self, partitions)
    def seek(self, partition, offset)
    def commit(self, offsets=None)
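
A minimal sketch of the message-iterator style of the consumer above. The topic, group, and broker address are assumed defaults, and the `consume` helper is hypothetical; it needs kafka-python installed and a running broker.

```python
def consume(topic="demo-topic", servers="localhost:9092", group="demo-group", limit=10):
    """Return up to `limit` (partition, offset, value) tuples from the topic."""
    from kafka import KafkaConsumer  # imported lazily on purpose

    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=servers,
        group_id=group,
        auto_offset_reset="earliest",  # start from the oldest message if no offset is stored
        enable_auto_commit=False,      # offsets are committed explicitly below
        consumer_timeout_ms=5000,      # stop iterating after 5 s without messages
    )
    seen = []
    for message in consumer:  # the consumer itself is the iterator
        seen.append((message.partition, message.offset, message.value))
        if len(seen) >= limit:
            break
    consumer.commit()  # commit the offsets consumed so far
    consumer.close()
    return seen
```

Disabling auto-commit and calling `commit()` after processing is one possible choice; it makes the point at which progress is recorded explicit.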

● Cluster
○ client manages some cluster metadata
class kafka.ClusterMetadata:
    def __init__(self, **configs)
    def available_partitions_for_topic(self, topic)
    def leader_for_partition(self, partition)
    def partitions_for_broker(self, broker_id)
    …
    def update_metadata(self, metadata)
● ConsumerCoordinator

Kafka Python Clients PyKafka
● Producer
class pykafka.Producer:
    def __init__(self, . . .)
    def produce(self, message, partition_key=None)
● Consumer
class pykafka.SimpleConsumer:
    def __init__(self, . . .)
    def consume(self, block=True)

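● Example
import pykafka

client = pykafka.KafkaClient(. . .)
# Topics are looked up by name; "my.topic" is a placeholder name
topic = client.topics["my.topic"]
producer = topic.get_sync_producer()
. . .
consumer = topic.get_simple_consumer()
for message in consumer:
    if message is not None:
        print(message.offset, message.value)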
Kafka Python Clients Demo
Demo Time

Thanks for your attention

Questions?

Kafka The Cluster
