Building a Graph-based Analytics Platform

(graphs)-[:are]->(everywhere)
© All Rights Reserved 2014 | Neo Technology, Inc.
@kennybastani
Neo4j

Using Meetup as an example use case
Meetup.com is a valuable source of data for
understanding trends around products or brands.
Understanding demand is key for delivering compelling
content at meetups.
It sounded like a great use case for Neo4j.

The Problem
Track meetup group growth over time.
Apply tags to meetup groups and report combined
growth of all groups over time.

Question #1
Given a start date and an end date, what is the time
series that plots the membership growth of a given
meetup group?

Question #2
Given a start date, an end date, and a combination of
tags, what is the time series that plots the combined
membership growth of all meetup groups with those
tags?

Question #3
How do you generate the JSON data of a time series
for a basic JS line chart plugin?
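
One possible answer to Question #3, as a minimal sketch: group the flat rows returned by a query into one series per meetup group. The row shape ({ group, date, members }) and the { name, data } series layout are assumptions made for illustration, not the project's actual format.

```javascript
// Hypothetical row shape: { group: string, date: "YYYY-MM-DD", members: number }.
// Groups flat query rows into one series per meetup group, sorted by date.
function toTimeSeries(rows) {
  const byGroup = new Map();
  for (const row of rows) {
    if (!byGroup.has(row.group)) byGroup.set(row.group, []);
    // Date.parse treats a date-only ISO string as UTC midnight.
    byGroup.get(row.group).push([Date.parse(row.date), row.members]);
  }
  return Array.from(byGroup, ([name, data]) => ({
    name,
    data: data.sort((a, b) => a[0] - b[0]),
  }));
}

// Made-up sample data, deliberately out of order.
const sampleRows = [
  { group: "Graph DB SF", date: "2014-02-01", members: 120 },
  { group: "Graph DB SF", date: "2014-01-01", members: 100 },
  { group: "Graph DB NYC", date: "2014-01-01", members: 80 },
];
console.log(JSON.stringify(toTimeSeries(sampleRows), null, 2));
```

Pairs of [timestamp in milliseconds, value] are a layout that many JS line chart plugins accept, which is why it was chosen here.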

The GraphGist Project
The GraphGist project is a way to quickly build a
graph-based proof of concept on Neo4j.
I started with a GraphGist.
Neo4j for Graph Analytics: Meetup.com Example

How are tags/topics connected?

Tackling Time in Neo4j
How do you implement a time series in Neo4j?
For any node that represents a unit of time, store a
timestamp. Traversals can be costly for selecting a time
series. Expose a REST API that accepts a normal date format
and converts it to an integer timestamp, which lets you select a
range of dates in your Neo4j Cypher query.
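
The date-to-integer step could look roughly like the sketch below. The Group and Stats labels, the HAS_STATS relationship, and the property names are hypothetical, since the deck does not show the data model. The query uses the $param syntax of current Neo4j releases; Neo4j versions from the time of this talk wrote parameters as {param}.

```javascript
// Converts "YYYY-MM-DD" to Unix seconds at UTC midnight.
function toTimestamp(isoDate) {
  const ms = Date.parse(isoDate + "T00:00:00Z");
  if (Number.isNaN(ms)) throw new Error("Invalid date: " + isoDate);
  return Math.floor(ms / 1000);
}

// Builds a parameterized Cypher range query.
// Hypothetical model: (:Group)-[:HAS_STATS]->(:Stats {timestamp, members})
function buildRangeQuery(groupName, startDate, endDate) {
  return {
    query:
      "MATCH (g:Group {name: $name})-[:HAS_STATS]->(s:Stats) " +
      "WHERE s.timestamp >= $from AND s.timestamp <= $to " +
      "RETURN s.timestamp AS timestamp, s.members AS members " +
      "ORDER BY s.timestamp",
    params: {
      name: groupName,
      from: toTimestamp(startDate),
      to: toTimestamp(endDate),
    },
  };
}

console.log(buildRangeQuery("Graph DB SF", "2014-01-01", "2014-02-01"));
```

Passing the dates as parameters rather than concatenating them into the query string keeps user input out of the Cypher text. Note that the end date is taken at midnight, so statistics recorded later on that day fall outside the range.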

Scale it up!
It started with a GraphGist, and then I said, “Why not?
Let’s build something cool using Neo4j.”

Challenges
I decided to take my GraphGist and make a full
platform.
There were some challenges.

Challenge #1
How do I get historical Meetup group statistics for all
groups?

Challenge #2
How do I handle the data import on a daily basis?

Challenge #3
What kind of reports do I want to create? What do I
want to know about Meetup groups?

Challenge #4
How do I safely expose Neo4j to a client-side charting
control?

Ask Questions
I decided to start asking some questions about my
data model.

What do I want to know?
Assuming I had as much historical Meetup data as I
pleased, what kind of questions would I want to ask
about that data?
How would I want to present it?

What’s the combined growth percent of Meetup
groups having a certain topic?
This line chart plots the time series for a meetup group topic on Meetup.com. Each group on Meetup.com has a set of topics associated with it. The chart shows the percent growth month over month.
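
As an illustration of month-over-month growth, the sketch below assumes growth is measured as the change in membership relative to the previous month. The deck does not spell out the formula, so treat this definition and the function name as assumptions.

```javascript
// counts: membership counts for consecutive months, oldest first.
// Returns the percent change for each month relative to the month before it.
// A previous count of 0 has no defined percent change, so null is returned.
function monthOverMonthGrowth(counts) {
  const growth = [];
  for (let i = 1; i < counts.length; i++) {
    const prev = counts[i - 1];
    growth.push(prev === 0 ? null : ((counts[i] - prev) * 100) / prev);
  }
  return growth;
}

console.log(monthOverMonthGrowth([100, 120, 150])); // [ 20, 25 ]
```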

What’s the cumulative growth of Meetup groups with
a specific topic?
This bar chart plots the cumulative growth of a meetup group topic on Meetup.com. Using the monthly growth time series from the Meetup Tag Growth % chart, the growth percentages over the period are
summed for each topic. The chart shows the total growth percentage over the period.
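
The aggregation described here is a plain sum per topic. A minimal sketch, assuming the monthly growth percentages are already available as one array per topic (the input shape and the sample numbers are made up):

```javascript
// monthlyGrowthByTopic: { topicName: [monthly growth percents] } (hypothetical shape).
// Returns the summed growth percent per topic over the period.
function cumulativeGrowth(monthlyGrowthByTopic) {
  const totals = {};
  for (const [topic, percents] of Object.entries(monthlyGrowthByTopic)) {
    totals[topic] = percents.reduce((sum, p) => sum + p, 0);
  }
  return totals;
}

console.log(cumulativeGrowth({ neo4j: [20, 25], mongodb: [10, 5] }));
// { neo4j: 45, mongodb: 15 }
```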

What’s the relative growth of Meetup groups with a
topic for a date range?
This donut chart plots the relative cumulative growth of meetup group topics on Meetup.com. Using the data from Cumulative Meetup Growth, the percentage growth of each topic over the period is compared with the others
as a share of 100.
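
Expressing each topic as a share of 100 can be sketched as a simple normalization. This assumes non-negative totals; how the project treats topics with negative growth is not stated in the deck.

```javascript
// totals: { topicName: cumulative growth percent } with non-negative values.
// Returns each topic's share of the combined total, scaled so shares sum to 100.
function relativeShares(totals) {
  const sum = Object.values(totals).reduce((a, b) => a + b, 0);
  const shares = {};
  for (const [topic, value] of Object.entries(totals)) {
    shares[topic] = sum === 0 ? 0 : (value * 100) / sum;
  }
  return shares;
}

console.log(relativeShares({ neo4j: 45, mongodb: 15 }));
// { neo4j: 75, mongodb: 25 }
```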

How many groups does a topic have relative to
others?
This donut chart plots the number of groups in the region during the period for each topic. Each topic's group count is compared with the others as a share of 100.

What’s the growth percent of all groups for a topic in
a location for a date range?
This report is a simple table that shows the growth percent of all groups for a topic, broken down by location. What do these high percentages tell us about Meetup? Within the last year there has been massive growth in meetup
groups that are focused on NoSQL database technology. If I imported a different topic, not related to technology, what would the data show?

How do I give users a clean set of controls to filter
and search?

Scaling it up
Designing a graph-based analytics platform using Node.js and Neo4j

Architecture
Front-end web-based dashboard in Node.js and
Bootstrap
REST API via Neo4j Swagger in Node.js
Data import services in Node.js
Data storage in Neo4j graph database

Applications
Analytics REST API (Node.js): web application
Dashboard (Node.js): web application
Analytics Data Import Scheduler (Node.js): console application

[Architecture diagram]
Neo4j (JVM): graph data storage
REST API (Node.js): web app; analytical queries; retrieves report data (Query)
Dashboard (Node.js): web app; presentation and filtering; visualizes report data (Filter)
Import Scheduler (Node.js): polls the Meetup API (Import)

REST API
The REST API is a fork of Neo4j Swagger. Swagger is
a specification and complete framework
implementation for describing, producing, consuming,
and visualizing RESTful web services.

Demo
http://meetup-analytics-api.herokuapp.com/

Swagger
The REST API module of this project is based on a
fork of Swagger.

The Neo4j Swagger Project
The Swagger project was modified to use Neo4j as its
data source. The REST API module of this project
extends the Neo4j Swagger project.

REST API Methods
Get Weekly Growth
Get Monthly Growth
Get Monthly Growth By Tag
Get Monthly Growth By Location
Get Cities
Get Countries
Get Group Count By Tag

Get Weekly Growth
Gets the weekly growth percent of meetup groups as
a time series. Returns a set of data points containing
the week of the year, the meetup group name, and
membership count.

Get Monthly Growth
Gets the monthly growth percent of meetup groups as
a time series. Returns a set of data points containing
the month of the year, the meetup group name, and
membership count.

Get Monthly Growth By Tag
Gets the monthly growth percent of meetup group
tags as a time series. Returns a set of data points
containing the month of the year, the meetup group
tag name, and membership count.

Get Monthly Growth By Location
Gets the monthly growth percent of meetup group
locations and tags as a time series. Returns a set of data
points containing the month of the year, the meetup
group tag name, the city, and membership count.

Get Cities
Gets a list of cities that meetup groups reside in.
Returns a distinct list of cities for typeahead.

Get Countries
Gets a list of countries that meetup groups reside in.
Returns a distinct list of countries for typeahead.
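
Get Cities and Get Countries both return distinct lists meant for typeahead controls. The sketch below shows the idea on the client side; the helper names and sample values are hypothetical, and the real API presumably deduplicates inside its Cypher query instead.

```javascript
// Removes duplicates and empty values, then sorts alphabetically.
function distinctSorted(values) {
  return Array.from(new Set(values.filter(Boolean))).sort((a, b) =>
    a.localeCompare(b)
  );
}

// Returns up to `limit` options that start with the typed prefix (case-insensitive).
function typeahead(options, prefix, limit = 10) {
  const needle = prefix.toLowerCase();
  return options
    .filter((option) => option.toLowerCase().startsWith(needle))
    .slice(0, limit);
}

const cities = distinctSorted(["Seattle", "San Francisco", "Berlin", "Seattle", ""]);
console.log(typeahead(cities, "s")); // [ 'San Francisco', 'Seattle' ]
```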

Get Group Count By Tag
Gets a count of groups by tag. Returns a list of tags
and the number of groups per tag.
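
To make the shape of this result concrete, here is a sketch of the same count done in plain JavaScript. The input shape ({ name, tags }) and the { tag, count } output are assumptions for illustration; the actual method presumably does the counting in a Cypher query.

```javascript
// groups: [{ name: string, tags: string[] }] (hypothetical shape).
// Returns [{ tag, count }] sorted by count descending, then by tag name.
function countGroupsByTag(groups) {
  const counts = new Map();
  for (const group of groups) {
    // A Set guards against a tag listed twice on the same group.
    for (const tag of new Set(group.tags)) {
      counts.set(tag, (counts.get(tag) || 0) + 1);
    }
  }
  return Array.from(counts, ([tag, count]) => ({ tag, count })).sort(
    (a, b) => b.count - a.count || a.tag.localeCompare(b.tag)
  );
}

console.log(
  countGroupsByTag([
    { name: "Graph DB SF", tags: ["neo4j", "nosql"] },
    { name: "NoSQL NYC", tags: ["nosql", "mongodb"] },
  ])
);
```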

Analytics Dashboard
The dashboard is a web application that uses client-
side JavaScript to communicate with the Neo4j
Swagger REST API to populate a series of interactive
chart controls with data. This web application uses
Bootstrap for the front-end styles and Highcharts for
the charting controls.
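
As a rough idea of what the dashboard hands to the charting control, the sketch below builds a Highcharts-style options object for a line chart. The function name, the axis label, and the sample series are assumptions; only the option keys (chart.type, title.text, xAxis.type, yAxis.title.text, series) are standard Highcharts options. Fetching data from the REST API and rendering the chart are left out.

```javascript
// series: [{ name: string, data: [[timestampMs, value], ...] }]
// Returns an options object for a datetime line chart.
function buildLineChartOptions(title, series) {
  return {
    chart: { type: "line" },
    title: { text: title },
    xAxis: { type: "datetime" },
    yAxis: { title: { text: "Growth %" } }, // label chosen for illustration
    series: series.map((s) => ({ name: s.name, data: s.data })),
  };
}

const chartOptions = buildLineChartOptions("Meetup Tag Growth %", [
  { name: "neo4j", data: [[1388534400000, 20], [1391212800000, 25]] },
]);
console.log(JSON.stringify(chartOptions, null, 2));
```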

Demo
http://meetup-analytics-dashboard.herokuapp.com/

Reports
Meetup Tag Growth %
Cumulative Meetup Growth
Category Growth %
Groups By Tag
Meetup Tag Growth By Location

Meetup Tag Growth %
https://github.com/kbastani/meetup-analytics/blob/master/docs/DOCS.md#meetup-tag-growth-
This line chart plots the time series for a meetup group topic on Meetup.com. Each group on Meetup.com has a set of topics associated with it. The chart shows the percent growth month over month.

Cumulative Meetup Growth
https://github.com/kbastani/meetup-analytics/blob/master/docs/DOCS.md#cumulative-meetup-growth
This bar chart plots the cumulative growth of a meetup group topic on Meetup.com. Using the monthly growth time series from the Meetup Tag Growth % chart, the growth percentages over the period are
summed for each topic. The chart shows the total growth percentage over the period.

Category Growth %
https://github.com/kbastani/meetup-analytics/blob/master/docs/DOCS.md#category-growth-
This donut chart plots the relative cumulative growth of meetup group topics on Meetup.com. Using the data from Cumulative Meetup Growth, the percentage growth of each topic over the period is compared with the others
as a share of 100.

Groups By Tag
https://github.com/kbastani/meetup-analytics/blob/master/docs/DOCS.md#group-count-by-tag
This donut chart plots the number of groups in the region during the period for each topic. Each topic's group count is compared with the others as a share of 100.

Meetup Tag Growth By Location
This report is a simple table that shows the growth percent of all groups for a topic, broken down by location. What do these high percentages tell us about Meetup? Within the last year there has been massive growth in meetup
groups that are focused on NoSQL database technology. If I imported a different topic, not related to technology, what would the data show?
