Real-time real estate listings with Apache Kafka
Ferran Galí i Reniu
What is LIFULL Connect?
The real-time listings system
What is a Real Estate Listing?
Now...
What is LIFULL Connect?
WE HELP PEOPLE FIND A HOME
LIFULL Connect mission
What is LIFULL Connect?
LIFULL Connect
MILLIONS OF PEOPLE TO HELP
What is LIFULL Connect?
The real-time listings system
Connections
To connect home-seekers with listings…
…we should connect them with their owners
Challenge #1: Listing Publishing
We created a unified back-office, so professionals can publish in a single place.
Unified Back-office
Challenge #2: Freshness
Yeah! But…
10 sites
5 teams
8 legacy batch systems
4h-24h listing publication delay
Real estate professionals don't care. They want listings published, now.
(And we want to offer them the best experience.)
Welcome the Real-time Listings System
(Internally we call it ReTiS)
Goals
● Real-time data propagation
● Strangle legacy systems
● Reduce complexity
Real-time Listings System
[Architecture diagram slides]
How did we do it?
What is LIFULL Connect?
The real-time listings system
How do we propagate data?
How do we process listings?
How do we serve listings?
How do we propagate data?
Domain Events + Event Log
Domain events
A domain event is something that happened in a specific domain that we want others to be aware of.
Event Log
An append-only sequence of events. It differs from a queue because items are not consumed.
Events are appended to the log as they happen:
ListingHasBeenPublished
ListingHasBeenModified
ListingHasBeenHidden
ListingHasBeenModified
ListingHasBeenModified
ListingHasBeenPublished
How did we implement it?
Apache Kafka + Apache Avro
Apache Avro
● Efficient engine to serialize data
● Libraries in multiple languages
● Schema is used to write and read data
Datum (MyData):
id: c4d729f3-3d20-4b72-8ffe-c6dd828b9a2c
price: 233243.2

Schema:
{
  "type": "record",
  "name": "MyData",
  "namespace": "com.lifullconnect",
  "doc": "This is the data",
  "fields": [
    { "name": "id", "type": "string" },
    { "name": "price", "type": "double", "default": 0.0 }
  ]
}

Serialized (schema-driven binary, no field names on the wire):
c4d729f3-3d20-4b72-8ffe-c6dd828b9a2c followed by the binary-encoded price
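Not from the deck: a minimal Kotlin sketch, assuming the Apache Avro Java library, of serializing the datum above with the schema-driven GenericRecord API.

import org.apache.avro.Schema
import org.apache.avro.generic.GenericData
import org.apache.avro.generic.GenericDatumWriter
import org.apache.avro.generic.GenericRecord
import org.apache.avro.io.EncoderFactory
import java.io.ByteArrayOutputStream

fun main() {
    // The MyData schema shown above.
    val schemaJson = """
        {
          "type": "record",
          "name": "MyData",
          "namespace": "com.lifullconnect",
          "doc": "This is the data",
          "fields": [
            {"name": "id", "type": "string"},
            {"name": "price", "type": "double", "default": 0.0}
          ]
        }
    """.trimIndent()
    val schema = Schema.Parser().parse(schemaJson)

    // Build the datum shown on the slide.
    val datum: GenericRecord = GenericData.Record(schema).apply {
        put("id", "c4d729f3-3d20-4b72-8ffe-c6dd828b9a2c")
        put("price", 233243.2)
    }

    // The schema drives the binary encoding: only values are written,
    // no field names, which is why the output is so compact.
    val out = ByteArrayOutputStream()
    val encoder = EncoderFactory.get().binaryEncoder(out, null)
    GenericDatumWriter<GenericRecord>(schema).write(datum, encoder)
    encoder.flush()
    println("Serialized ${out.size()} bytes")
}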
How did we implement it?
Each event type shares a common envelope:

ListingHasBeenPublished
event_id: UUID //a unique ID
produced_by: String //owner of the data
occurred_on: Date //when it happened
id: String //which ID in the domain it refers to
listing: …
…

ListingHasBeenModified
event_id: UUID //a unique ID
produced_by: String //owner of the data
occurred_on: Date //when it happened
id: String //which ID in the domain it refers to
modifications: …
…

ListingHasBeenBoosted
event_id: UUID //a unique ID
produced_by: String //owner of the data
occurred_on: Date //when it happened
id: String //which ID in the domain it refers to
boosting: …
…
Apache Kafka
● Implements a log
● Distributed in a cluster of broker nodes
● Fault tolerant through partitions & replicas
● Enables multiple consumers of data
● Enables real-time consumption of data
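A hedged sketch of the last two bullets (not from the deck; the broker address, group id and plain-string deserializers are assumptions): each consumer group reads the whole log independently, at its own pace, as events arrive.

import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.common.serialization.StringDeserializer
import java.time.Duration
import java.util.Properties

fun main() {
    val props = Properties().apply {
        put("bootstrap.servers", "localhost:9092")  // assumption: local broker
        put("group.id", "retis-example")            // hypothetical group id
        put("key.deserializer", StringDeserializer::class.java.name)
        put("value.deserializer", StringDeserializer::class.java.name)
    }
    KafkaConsumer<String, String>(props).use { consumer ->
        consumer.subscribe(listOf("listingEvents"))
        while (true) {
            // poll keeps delivering new events as they are appended to the log
            for (record in consumer.poll(Duration.ofMillis(500))) {
                println("${record.key()} -> ${record.value()}")
            }
        }
    }
}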
[Diagram slides: each topic is divided into partitions (Topic 1 has Partitions 1, 2 and 3; Topic 2 has Partitions 1 and 2), and the partitions together with their replicas are spread across Brokers 1, 2 and 3, which is what makes the cluster fault tolerant.]
How did we implement it?
Topic listingEvents is partitioned, and events are keyed by listing id, so all events for the same listing land in the same partition and keep their order:
Partition 1: ListingHasBeenPublished (id: 1), ListingHasBeenModified (id: 1), ListingHasBeenPublished (id: 3), ListingHasBeenModified (id: 1)
Partition 2: ListingHasBeenPublished (id: 2), ListingHasBeenModified (id: 2)
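A hedged sketch of that keying (not from the deck; plain-string serializers and the broker address are assumptions): the listing id is the record key, and Kafka hashes the key to pick the partition.

import org.apache.kafka.clients.producer.KafkaProducer
import org.apache.kafka.clients.producer.ProducerRecord
import org.apache.kafka.common.serialization.StringSerializer
import java.util.Properties

fun main() {
    val props = Properties().apply {
        put("bootstrap.servers", "localhost:9092")  // assumption: local broker
        put("key.serializer", StringSerializer::class.java.name)
        put("value.serializer", StringSerializer::class.java.name)
    }
    KafkaProducer<String, String>(props).use { producer ->
        // Same key ("1") => same partition => per-listing ordering is preserved
        producer.send(ProducerRecord("listingEvents", "1", "ListingHasBeenPublished"))
        producer.send(ProducerRecord("listingEvents", "1", "ListingHasBeenModified"))
    }
}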
Apache Kafka + Apache Avro
● Send Avro binary records to Kafka
● Every single record has a schema, registered in a schema registry
● The schema registry allows controlled schema evolution
[Diagram slides: how the producer, the Schema Registry and the consumer interact]
1. The producer wants to send Datum A, so it registers Schema A in the Schema Registry, which assigns it id 1.
2. Each record is written to the Kafka cluster as the schema id (1) followed by the Avro-encoded bytes.
3. A second datum, Datum B, registers Schema B and is assigned id 2; its records carry id 2.
4. The consumer reads a record, looks up the schema id in the registry, fetches that schema, and deserializes Datum A, Datum B, and so on.
5. The producer then evolves Schema A into Schema A'. Since it's an evolution, the registry checks compatibility and fails the registration if compatibility is broken; here it is compatible, so Schema A' is assigned id 3.
6. No compatibility is broken, so the consumer can read Datum A' without problems.
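A minimal sketch of wiring this up with Confluent's Avro serializer (not from the deck; the addresses are assumptions): the serializer registers the record's schema with the Schema Registry on first use and prefixes every message with the assigned schema id, which is exactly the flow sketched above.

import org.apache.avro.generic.GenericRecord
import org.apache.kafka.clients.producer.KafkaProducer
import java.util.Properties

fun avroProducer(): KafkaProducer<String, GenericRecord> {
    val props = Properties().apply {
        put("bootstrap.servers", "localhost:9092")           // assumption
        put("schema.registry.url", "http://localhost:8081")  // assumption
        put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
        // Registers schemas and writes the schema id before the Avro bytes
        put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer")
    }
    return KafkaProducer(props)
}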
How do we propagate data?
ListingHasBeenPublished
ListingHasBeenModified
ListingHasBeenHidden
ListingHasBeenModified
ListingHasBeenModified
ListingHasBeenPublished
What is LIFULL Connect?
The real-time listings system
How do we propagate data?
How do we process listings?
How do we serve listings?
Real-time Listings System
How do we process data?
Apache Kafka Streams
● Framework for transforming data stored in Kafka
● Easy-to-use functional API that runs on the JVM
● Processes data in a streaming fashion
● Supports exactly-once processing
● Distributed
● Fault tolerant
● Enables stateless and stateful processing of information
Apache Kafka Streams
Stateless Kafka Streams Application: reads topic events, writes topic eventsTransformed.

import org.apache.kafka.streams.KafkaStreams
import org.apache.kafka.streams.StreamsBuilder
import org.apache.kafka.streams.kstream.KStream
import java.util.Properties

fun main() {
    val streamsBuilder = StreamsBuilder()

    // Read from a topic
    val eventsStream: KStream<Key, Value> = streamsBuilder.stream("events")

    // Do a transformation
    val transformedStream: KStream<Key2, Value2> = eventsStream.map { key, value ->
        TODO("do a transformation")
    }

    // Write to a topic
    transformedStream.to("eventsTransformed")

    // Boilerplate: build the topology and start the application
    val topology = streamsBuilder.build()
    val kafkaStreams = KafkaStreams(topology, Properties())
    kafkaStreams.start()
}
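As a hypothetical filling of the TODO in the map step above (not from the deck), a transformation that keeps the key and upper-cases a string value:

import org.apache.kafka.streams.KeyValue
import org.apache.kafka.streams.kstream.KStream

// Each input record is mapped to a new KeyValue pair downstream.
fun transform(events: KStream<String, String>): KStream<String, String> =
    events.map { key, value -> KeyValue(key, value.uppercase()) }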
Apache Kafka Streams
Stateful Kafka Streams Application: reads topic events, writes topic eventsAggregated, keeping its running state in a local store.

import org.apache.kafka.streams.KafkaStreams
import org.apache.kafka.streams.StreamsBuilder
import org.apache.kafka.streams.kstream.KStream
import org.apache.kafka.streams.kstream.KTable
import java.util.Properties

fun main() {
    val streamsBuilder = StreamsBuilder()

    // Read a topic
    val eventsStream: KStream<Key, Value> = streamsBuilder.stream("events")

    // Do an aggregation; the running state is kept in a local store
    val aggregated: KTable<Key, Value2> = eventsStream
        .groupByKey()
        .aggregate(
            { TODO("provide an initial value") }
        ) { key, value, aggregatedValue ->
            TODO("do an aggregation")
        }

    // Write to a topic (the KTable is turned back into a stream first)
    aggregated.toStream().to("eventsAggregated")

    // Boilerplate: build the topology and start the application
    val topology = streamsBuilder.build()
    val kafkaStreams = KafkaStreams(topology, Properties())
    kafkaStreams.start()
}
How did we implement it?
A stateful Kafka Streams application reads Topic listingEvents, updates each listing's state in its store, and emits the new version of the listing to Topic listings:
ListingHasBeenPublished (id: 1) → Listing (id: 1)
ListingHasBeenModified (id: 1) → Listing' (id: 1)
ListingHasBeenPublished (id: 2) → Listing (id: 2)
ListingHasBeenModified (id: 1) → Listing'' (id: 1)
ListingHasBeenModified (id: 2) → Listing' (id: 2)
ListingHasBeenPublished (id: 3) → Listing (id: 3)
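A hypothetical, much-simplified sketch of that fold (not from the deck: events and listings are plain strings here and the "fold" is just concatenation, whereas the real ones are Avro records):

import org.apache.kafka.streams.StreamsBuilder
import org.apache.kafka.streams.kstream.KStream
import org.apache.kafka.streams.kstream.KTable

// Assumes default String serdes are configured in the application Properties.
fun buildListingsTopology(streamsBuilder: StreamsBuilder) {
    val listingEvents: KStream<String, String> = streamsBuilder.stream("listingEvents")
    val listings: KTable<String, String> = listingEvents
        .groupByKey()  // key = listing id
        .aggregate(
            { "" }  // initial state of a listing
        ) { _, event, listing ->
            listing + event  // fold each event into the current state
        }
    // Every state change is emitted downstream as the new version of the listing
    listings.toStream().to("listings")
}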
What is LIFULL Connect?
The real-time listings system
How do we propagate data?
How do we process listings?
How do we serve listings?
How do we serve listings?
How did we implement it?
Apache Kafka Connect
Kafka Connect
● Software to connect Kafka data to external systems
● Move data into Kafka with source connectors
● Move data out of Kafka with sink connectors
● Pluggable with third party or custom connectors
Kafka Connect
{
  "name": "source-connector",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:mysql://mysql:3306/test",
    "connection.user": "connect_user",
    "connection.password": "connect_password",
    "topic.prefix": "mysql-01-",
    "poll.interval.ms": 3600000,
    "table.whitelist": "table",
    "mode": "bulk"
  }
}
Source Connector
Kafka Connect
{
  "name": "sink-connector",
  "config": {
    "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
    "tasks.max": "10",
    "topics": "realEstateListings",
    "key.ignore": "false",
    "batch.size": 4000,
    "max.buffered.records": 20000,
    "behavior.on.malformed.documents": "warn",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": false,
    "connection.url": "https://elasticsearch-host:443"
  }
}
Sink Connector
How did we implement it?
A sink connector consumes Topic listings and writes every listing into the serving store, keyed by its id, so the store always holds the latest version of each listing:
Topic listings: Listing (id: 1), Listing' (id: 1), Listing (id: 2), Listing'' (id: 1), Listing' (id: 2), Listing (id: 3)
Serving store: Listing'' (id: 1), Listing' (id: 2), Listing (id: 3)
APIs on top of the store serve the listings to home-seekers around the world 🌎 🌍 🌏
What is LIFULL Connect?
The real-time listings system
How do we propagate data?
How do we process listings?
How do we serve listings?
Recap
To-do list
● Better observability
● Conventions for public vs private topics
● A proper data catalog
● Other divisions using Kafka, avoiding the caveats of a multi-tenant cluster
● …
Potential unlocked
We do more than just serve listings.
For example, we adapt a listing depending on its popularity, in real time!
WE HELP PEOPLE FIND A HOME
Keep following our mission
OK THANKS BYE
Want to know more?
See you at the LIFULL Connect stand
@ferrangali
@LifullConnEng
Questions?