Daniel Aragao (@dear_dr_dan) & Simon Hope (@mapbutcher)
REALESTATE.COM.AU
6B Market Cap
11M Australian Properties
55M Visits in September
4.7M App Downloads …and counting
3,500 PEOPLE
13 COUNTRIES
34 OFFICES
TECHNOLOGY & SOCIAL JUSTICE
• In the beginning…
• Organising our Data
• Implementation approaches
• Hipster Batches
• Reactify
• Bring Your Own Data
• Finding the Data
• What we have learned so far
THIS IS WHAT THE STORY IS ABOUT
SORRY… IT’S OK TO LEAVE NOW
• Nope, we didn’t create a new Hadoop
• No hardcore Data Science
• There are some implementation details
• REA embraced the Cloud. AWS everywhere
• Under construction
IN THE BEGINNING…
ORGANISING OUR DATA
“Increasingly, content is being distributed through search and social platforms... There’s less visiting of publishers as destinations.”
Jeff Weiner, CEO, LinkedIn
Data sources
Data warehouse
PROBLEM…
STRATEGY…
PROBLEM…
Data Warehouse
Staging, SSIS, Dim, Fact
Star schema
leaky details
STRATEGY…
No Data Warehouse
Staging, SSIS, Dim, Fact
STRATEGY…
Data Warehouse Facade
Staging, SSIS, Dim, Fact
???
WHAT’S IN THE BOX?
Good things come in small (packages) services
THE HIPSTER BATCH
• Small and short lived
• Decoupled via flat files on S3
• Single purpose
• Idempotent
• Polyglot
• Minimal runtime dependencies
• Discoverable
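The properties above (single purpose, idempotent, decoupled via flat files) can be sketched in a few lines. This is a minimal illustration, not the implementation from the talk: `FakeBucket` is a hypothetical in-memory stand-in for an S3 bucket, and the `status` column, the key layout, and the function names are assumptions made for the example.

```python
import csv
import hashlib
import io

# Hypothetical in-memory stand-in for an S3 bucket: key -> file contents.
# A real batch would read and write S3 objects instead.
FakeBucket = dict


def output_key(input_key: str, body: str) -> str:
    """Derive the output key from the input alone, so reruns overwrite rather than duplicate."""
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()[:12]
    return f"out/{input_key.rsplit('/', 1)[-1]}.{digest}.csv"


def run_batch(bucket: FakeBucket, input_key: str) -> str:
    """Single purpose: keep only the rows whose 'status' column is 'active'."""
    body = bucket[input_key]
    reader = csv.DictReader(io.StringIO(body))
    rows = list(reader)
    kept = [row for row in rows if row["status"] == "active"]

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames or [], lineterminator="\n")
    writer.writeheader()
    writer.writerows(kept)

    key = output_key(input_key, body)
    bucket[key] = out.getvalue()  # same input -> same key and same contents
    return key
```

Running `run_batch` twice on the same input file writes the same output key with the same contents, which is what makes a retry after a failure safe.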
A ‘TYPICAL’ IMPLEMENTATION
Hipster Batch
• SNS, SQS
• ASG, ECS, Lambda
• KMS
• CloudWatch
• Logs
• S3 buckets
• Data
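The slides list SNS, SQS and S3 as building blocks but do not show how they are wired together. One common arrangement, assumed here purely for illustration, is that S3 publishes an event notification to an SNS topic, which delivers it to an SQS queue that the batch polls. The sketch below unwraps such a message body into bucket and key pairs; the function name is hypothetical.

```python
import json
from urllib.parse import unquote_plus


def s3_objects_from_sqs_body(body: str) -> list[tuple[str, str]]:
    """Return (bucket, key) pairs from an SQS message body that carries an
    SNS notification whose Message is an S3 event notification."""
    envelope = json.loads(body)
    # Without raw message delivery, SNS wraps the payload in a JSON envelope
    # and puts the original message, as a string, under "Message".
    event = json.loads(envelope["Message"])
    pairs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys in S3 event notifications are URL-encoded.
        key = unquote_plus(record["s3"]["object"]["key"])
        pairs.append((bucket, key))
    return pairs
```

Each returned pair identifies one flat file for a batch to pick up, so the producer and the consumer only share the file and the notification, not code.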
HIPSTER BATCH DOES SCIENCE
• Behavioural models for targeted marketing
• Recommendation engine
• External channels
Hipster Batch
x 20
Stats models
API
Google Now
SCIENCE!
REACTIFY
From legacy to reactive
http://www.reactivemanifesto.org
• Manage data flow with messages
• Protect consumers and care about isolation
• Resilience is important, and data replication is just fine
• Demand is elastic, and your components should be too
Reactify
Listings
Data
coupling
No resilience or elasticity
Coupling
PROBLEM…
SOLUTION…
Reactify
Listings
Hipster Batch
Shielded consumers
Isolation
Decoupled
IMPLEMENTATION…
Reactify
Listings
REST API
Dynamo
Event Maker
Event Differ
Kinesis
REACTIFY REST API
• Exposes current state only
• Stream of change notifications
• Hypertext Application Language (HAL)
• Clear entity types
• Linking over embedding
• Cacheable and discoverable
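To make "linking over embedding" concrete, here is a small sketch of a HAL document. HAL reserves a `_links` object whose keys are link relations and whose values carry an `href`. The `self` URL follows the listing URL pattern shown in the deck; the `agency` relation, the `/agencies/...` path, and the `status` field are hypothetical and only illustrate linking to a related entity instead of nesting it.

```python
BASE = "https://feeds.listings.realestate.com.au"


def listing_resource(listing_id: str, agency_id: str, state: dict) -> dict:
    """Build a HAL document: plain state plus the reserved "_links" object."""
    return {
        **state,
        "_links": {
            "self": {"href": f"{BASE}/combined-listings/{listing_id}"},
            # Hypothetical relation and path: link to the agency, do not embed it.
            "agency": {"href": f"{BASE}/agencies/{agency_id}"},
        },
    }


def follow(resource: dict, rel: str) -> str:
    """Return the href for a link relation, as a HAL client would before fetching."""
    return resource["_links"][rel]["href"]


doc = listing_resource("120449689", "ABC123", {"status": "current"})
```

A client that only knows relation names can call `follow(doc, "agency")` and fetch the result, which keeps each entity type separately cacheable.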
REST API
https://feeds.listings.realestate.com.au/combined-listings/120449689
REST API
Event Maker
https://feeds.listings.realestate.com.au/combined-listings/-/changes
Reactify
Event Differ
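The deck names the Event Differ without describing its internals. One plausible reading, assumed here, is a component that compares the previous and current state of an entity and emits one change event per differing field. The `ChangeEvent` type, its fields, and the sample listing attributes are hypothetical.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ChangeEvent:
    entity_id: str
    field: str
    old: Any
    new: Any


def diff_states(entity_id: str, old: dict, new: dict) -> list[ChangeEvent]:
    """Emit one event per field whose value was added, removed, or changed.

    A missing field and a field set to None are treated the same here.
    """
    events = []
    for field in sorted(old.keys() | new.keys()):
        before, after = old.get(field), new.get(field)
        if before != after:
            events.append(ChangeEvent(entity_id, field, before, after))
    return events
```

Identical snapshots produce no events, so consumers downstream only see messages when something actually changed.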
The octopus in the box
— Did you use that data set?
— Errr… No, we have another one
BRING YOUR OWN DATA (BYOD)
• Allow data to flow freely
• Help the business to get what they need when they need it
• Self-service
BYOD
CSV
x 5
Smarts on datatypes
Tableau Server
Audit, auth, share…
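"Smarts on datatypes" is not spelled out in the slides. A minimal sketch of what it could mean for an uploaded CSV is column type inference: try progressively looser parsers per cell and pick the narrowest type that fits the whole column. The type names, the widening rule, and the sample columns are assumptions for illustration.

```python
import csv
import io
from datetime import date


def _kind(value: str) -> str:
    """Classify one cell as 'int', 'float', 'date' (ISO 8601) or 'str'."""
    for name, parse in (("int", int), ("float", float), ("date", date.fromisoformat)):
        try:
            parse(value)
            return name
        except ValueError:
            continue
    return "str"


def infer_column_types(text: str) -> dict[str, str]:
    """Infer one type per CSV column; ints mixed with floats widen to float,
    any other mix (or an empty column) falls back to str."""
    reader = csv.DictReader(io.StringIO(text))
    kinds: dict[str, set[str]] = {name: set() for name in reader.fieldnames or []}
    for row in reader:
        for name, value in row.items():
            if name in kinds and value not in (None, ""):  # skip empty cells
                kinds[name].add(_kind(value))
    result = {}
    for name, seen in kinds.items():
        if len(seen) == 1:
            result[name] = next(iter(seen))
        elif seen == {"int", "float"}:
            result[name] = "float"
        else:
            result[name] = "str"
    return result
```

Falling back to `str` on any ambiguity keeps a self-service upload from failing on a single odd cell, at the cost of looser typing for that column.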
These were the implementation approaches, now to…
FIND THE DATA
Meaningful, automated, and easy-to-search metadata
WE TRIED
MORE THAN DATA
Hipster Batch
• SNS, SQS
• ASG, ECS, Lambda
• KMS
• CloudWatch
• Logs
• Dataz
• Ancestry
• Metadata
METADATA PIPELINE
Hipster Batch
Producers
Scrapy
Ancestry
REST API
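The slides show "Ancestry" records flowing from producers towards a REST API, without defining their shape. Reading "Ancestry" as data lineage, which is an assumption, a minimal in-memory sketch of registering datasets, walking their upstream parents, and searching the metadata might look like this. `DatasetRecord`, `Catalogue`, and the dataset names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    name: str
    producer: str
    parents: list[str] = field(default_factory=list)  # upstream dataset names


class Catalogue:
    """Hypothetical metadata store keyed by dataset name."""

    def __init__(self) -> None:
        self._records: dict[str, DatasetRecord] = {}

    def register(self, record: DatasetRecord) -> None:
        self._records[record.name] = record

    def ancestors(self, name: str) -> list[str]:
        """All upstream datasets of `name`, nearest first, each listed once."""
        seen: list[str] = []
        queue = list(self._records[name].parents)
        while queue:
            current = queue.pop(0)
            if current in seen:
                continue
            seen.append(current)
            parent = self._records.get(current)
            if parent:
                queue.extend(parent.parents)
        return seen

    def search(self, term: str) -> list[str]:
        """Case-insensitive substring match on dataset or producer name."""
        term = term.lower()
        return sorted(
            name for name, record in self._records.items()
            if term in name.lower() or term in record.producer.lower()
        )
```

If every batch registers its output together with its inputs, "did you use that data set?" becomes a lookup rather than a conversation.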
WHAT WE HAVE LEARNED SO FAR
• Consumers create the last-mile data as needed
• We must work with external, independent delivery channels
• Push quality back to source/producer systems
• Data belongs to the entire organisation, not to a single team
I’ll give you my Data Warehouse when you can pry it from my cold dead hands.
THANK YOU

Building Data-Centric Businesses