Visualizing 4G Experience
by country, across networks
James Robinson, CTO OpenSignal
We collect data.

3.5M devices

• We’ve built the world’s largest global database of cellular data
• Our data comes via Android and iOS apps (a sensor network)
• SmartUK: UK’s Most Innovative Mobile Co.

And visualize it.

9.0M map tiles

OpenSignal

• Market reports for mobile network operators
• Independent coverage maps for consumers
• One-off reports for everybody

4G is in focus now.
#bigdatashow

@jamesCRR
How and why we made these
• coverage
• rollout
• speeds
opensignal.com/jr/big-data-show

The Lifecycle of a Data point
1. Acquisition

2. Storage

3. Visualization

Acquisition
Automated crowd-sourcing: a sensor network

Smartphones are ideal probes of
network performance.
We’re re-purposing sensors.
Consumers come first,
data comes second.

Acquisition
Other sensor networks

• Google Maps
• Waze (traffic)
Similar to OpenSignal:
• Rootmetrics
• Sensorly

Sensor creep: Galaxy S (S4 has 3 more!)
Storage
Automated crowd-sourcing: a sensor network

Going via MySQL:
• Analysis on the datastream
• Quick temperature check on the app
• S3 not ideal for appending data to files
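
The "temperature check" on the incoming stream could be as simple as comparing the latest per-minute row count in the buffer against a recent baseline. This is a minimal sketch under assumptions: the function name `check_stream`, the thresholds, and the sample counts are all hypothetical and are not taken from OpenSignal's monitoring.

```python
from statistics import median

def check_stream(counts_per_minute, latest, low_factor=0.2, high_factor=5.0):
    """Classify the latest per-minute row count against a recent baseline.

    counts_per_minute: recent per-minute counts of rows arriving in the buffer.
    low_factor / high_factor: hypothetical thresholds, to be tuned on real data.
    """
    baseline = median(counts_per_minute)
    if latest == 0:
        return "silent"   # nothing arriving: uploads or servers may be broken
    if latest < baseline * low_factor:
        return "low"
    if latest > baseline * high_factor:
        return "flood"    # possibly a client bug sending far too much data
    return "ok"

history = [980, 1010, 1005, 990, 1020]   # made-up sample counts
print(check_stream(history, 1000))
print(check_stream(history, 0))
print(check_stream(history, 9000))
```

A median baseline is used here so that a single odd minute in the history does not move the thresholds much; a real check would probably also account for time-of-day patterns.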

Compressed in app → MySQL buffer → Amazon S3

Relatively easy to bring files from S3 into Hadoop running on EC2
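
Because S3 is not suited to appending, one way to picture the buffer step is to collect rows first and then write each batch out as a single compressed object. The sketch below is an assumption-laden illustration: a Python list stands in for the MySQL buffer, a dict stands in for an S3 bucket, and the class name `BatchBuffer`, the batch size, and the key layout are hypothetical.

```python
import gzip
import json

class BatchBuffer:
    """Collect readings, then write each batch out as one immutable object."""

    def __init__(self, upload, max_rows=3):
        self.upload = upload      # callable(key, data_bytes); stand-in for an S3 write
        self.max_rows = max_rows  # hypothetical batch size
        self.rows = []            # stand-in for the MySQL buffer table
        self.batch_no = 0

    def add(self, reading):
        self.rows.append(reading)
        if len(self.rows) >= self.max_rows:
            self.flush()

    def flush(self):
        if not self.rows:
            return
        body = "\n".join(json.dumps(r) for r in self.rows).encode("utf-8")
        key = f"readings/batch-{self.batch_no:06d}.json.gz"  # hypothetical key layout
        self.upload(key, gzip.compress(body))
        self.rows = []
        self.batch_no += 1

store = {}  # in-memory stand-in for a bucket
buf = BatchBuffer(lambda key, data: store.__setitem__(key, data))
for i in range(7):
    buf.add({"device": i, "rssi": -90 + i})   # made-up readings
buf.flush()
print(sorted(store))
```

Each batch becomes its own object, so nothing ever needs to be appended to an existing file, and newline-delimited records are easy to read back in a Hadoop job.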
Visualization: themes

A varied tool box.
Use of open-source &
web technologies.

Visualization 1: 4G Coverage
We wanted to create a resource where
people could compare networks for
areas that matter to them.

Google Maps
was the natural choice.

• Familiar - to designers and consumers
• Scalable - and battle tested
• Flexible – Bayeux tapestry

Visualization 1: 4G Coverage
What we didn’t want.

• We’re mapping user experience, not modelling cellular propagation
• Coverage feels organic, maps should reflect that
• Everything should be in one place and easily filterable

Visualization 1: 4G Coverage
How we do it.

MTS (network)
3G (10 poss)
Zoom 10
Novosibirsk (x,y)

Pull data into Hadoop
Pre-aggregate for different zoom levels
Output MySQL tables
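
The pre-aggregation step can be sketched as grouping readings by a key of network, network type, zoom level, and tile (x, y). The slide describes doing this in Hadoop; the plain-Python version below only illustrates the grouping. It assumes the standard Web Mercator tile scheme used by web maps, and the field names, function names, and sample readings are hypothetical.

```python
import math
from collections import defaultdict

def tile_xy(lat, lon, zoom):
    """Standard Web Mercator tile indices for a point (valid for |lat| < ~85)."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y

def pre_aggregate(readings, zooms):
    """Average signal per (network, network type, zoom, x, y) key."""
    sums = defaultdict(lambda: [0.0, 0])   # key -> [running total, count]
    for r in readings:
        for z in zooms:
            x, y = tile_xy(r["lat"], r["lon"], z)
            cell = sums[(r["network"], r["type"], z, x, y)]
            cell[0] += r["signal"]
            cell[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}

readings = [  # made-up sample readings near Novosibirsk
    {"network": "MTS", "type": "3G", "lat": 55.03, "lon": 82.92, "signal": -85.0},
    {"network": "MTS", "type": "3G", "lat": 55.03, "lon": 82.92, "signal": -95.0},
]
table = pre_aggregate(readings, zooms=[5, 10])
for key, mean_signal in sorted(table.items()):
    print(key, mean_signal)
```

Each row of the resulting table corresponds to one tile at one zoom level, which is the shape a tile renderer would query.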

Generate tiles when needed
- When users scroll to an area on the map, query the server
- Check if a tile already exists
- Tiles generated in PHP (i.e. on server)
- Could move to HTML5 or a JavaScript language (D3!), client based
- Store new tiles on server
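
The generate-when-needed flow above amounts to a check-then-render cache. The slide says the real tiles are generated in PHP; the sketch below uses Python only to show the control flow. The directory layout, the function `get_tile`, the stand-in renderer, and the tile coordinates are all hypothetical.

```python
import os
import tempfile

def get_tile(cache_dir, network, net_type, zoom, x, y, render):
    """Return tile bytes, rendering and storing the tile only on a cache miss."""
    path = os.path.join(cache_dir, network, net_type, str(zoom), str(x), f"{y}.png")
    if os.path.exists(path):                      # tile already exists: serve it
        with open(path, "rb") as f:
            return f.read()
    data = render(network, net_type, zoom, x, y)  # hypothetical renderer
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:                   # store the new tile for next time
        f.write(data)
    return data

calls = []

def fake_render(network, net_type, zoom, x, y):
    """Stand-in renderer; a real one would draw from the pre-aggregated tables."""
    calls.append((zoom, x, y))
    return f"{network}/{net_type}/{zoom}/{x}/{y}".encode("ascii")

cache = tempfile.mkdtemp()
first = get_tile(cache, "MTS", "3G", 10, 747, 323, fake_render)   # sample coordinates
second = get_tile(cache, "MTS", "3G", 10, 747, 323, fake_render)
print(first == second, len(calls))
```

Only tiles that someone actually views are ever rendered, which keeps the stored tile count proportional to usage rather than to the size of the map.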

Visualization 2: 4G Rollout

We wanted to show:
• Countries with LTE
• When it was deployed
• Planned deployments
• Individual networks
& it had to look good.

D3: Data-Driven Documents
• Created by Mike Bostock of NYT.
• Open source.
• JS & SVG based.
• Engineers should love it.
Visualization 2: 4G Rollout
What we could have done.
• Don’t use pins for country level data!
• Better & simpler: Google Fusion Tables, or Google Viz (but no time dimension)
• Custom tiles (time dimension but hard to make interactive)
Pins are OK for cities.

Fusion Tables: shallow learning curve, more flexible than you initially think, but less flexible than you’d like.
Visualization 2: 4G Rollout
How we do it.
• Countries defined by GeoJSON (various sources available)
• Data on rollout also in JSON

• The result of a graphic designer/front-end coder, working with a data analyst and a copywriter
• One data analyst with knowledge of JavaScript could get similar results
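
Combining the two JSON sources might look like the sketch below: rollout records are attached to country features so that each shape carries the values the map needs for a given year. This is an illustration under assumptions, not the code behind the report. The property names (`iso`, `launched`, `networks`), the country codes, and the data are all made up.

```python
import json

# Tiny hand-made stand-ins for the two files; properties and values are hypothetical.
countries_geojson = json.loads("""
{"type": "FeatureCollection", "features": [
  {"type": "Feature", "properties": {"iso": "AAA"},
   "geometry": {"type": "Polygon", "coordinates": [[[0,0],[1,0],[1,1],[0,0]]]}},
  {"type": "Feature", "properties": {"iso": "BBB"},
   "geometry": {"type": "Polygon", "coordinates": [[[2,2],[3,2],[3,3],[2,2]]]}}
]}
""")
rollout = json.loads('{"AAA": {"launched": 2011, "networks": ["Net1", "Net2"]}}')

def attach_rollout(geojson, rollout_by_country, year):
    """Copy rollout status for the given year onto each country feature."""
    for feature in geojson["features"]:
        info = rollout_by_country.get(feature["properties"]["iso"])
        live = info is not None and info["launched"] <= year
        feature["properties"]["lte_live"] = live
        feature["properties"]["networks"] = info["networks"] if live else []
    return geojson

merged = attach_rollout(countries_geojson, rollout, year=2012)
for f in merged["features"]:
    print(f["properties"])
```

Passing the year as a parameter is what gives the map its time dimension: re-running the join for another year changes which countries are marked as live.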

Visualization 3: 4G Speed

• We had 11 countries and 22 networks with good data on 3G speed.
• We could have just put everything in one chart (33 bars) or two charts (11 and 22)
• But it wouldn’t be extensible or so easily navigable.

Visualization 3: 4G Speed
How we do it.

• Use interactivity as a way of hiding data
• Give hints that the data can be explored
• Re-scaling axes can be confusing
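
One way to act on the points above is to show only the selected country's networks while keeping a single axis maximum computed over all the data, so bars stay comparable as the selection changes. This is a sketch of that one option and is not confirmed as the approach used in the report. The function name `bars_for` and all numbers are made up.

```python
def bars_for(speeds, country, axis_max=None):
    """Return the bars to draw for one country plus the axis maximum to use.

    speeds: {country: {network: Mbps}}. Using one axis maximum computed over
    all countries keeps bar lengths comparable when the selection changes.
    """
    if axis_max is None:
        axis_max = max(v for nets in speeds.values() for v in nets.values())
    bars = sorted(speeds[country].items(), key=lambda kv: kv[1], reverse=True)
    return bars, axis_max

speeds = {  # made-up numbers, not OpenSignal results
    "Country A": {"Net 1": 20.0, "Net 2": 12.5},
    "Country B": {"Net 3": 8.0},
}
bars, axis_max = bars_for(speeds, "Country B")
print(bars, axis_max)
```

The trade-off is that slow countries get short bars; re-scaling per country would fill the space but makes bars from different selections no longer comparable.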

Final Thoughts 1
D3 is powerful for
• Transitioning between data sets / visualization types
• Your company already has people who’d love to use it (they just don’t know it yet)
But … it takes more time to set up each visualization than Excel/Tableau/R

When starting to analyse, don’t have one tool or visualization in mind
But know what’s out there
Final thoughts 2
You use open-source tools for analysis – why not visualization?
Excel could make a comeback – but unlikely to be cutting edge
A visualization is great when everyone can understand it
4G rollouts are a very mixed bag

Thank you
OpenSignal.com
@jamesCRR


Editor's Notes

  • #3 My background (Physics & Philosophy, Tesco optimization of ordering strategies based on sales data)
  • #4 We’re going to look at 3 particular ways we’ve gone about visualizing consumer experience of 4G.
  • #5 Since the visualizations are all interactive, get them from here if you can.
  • #6 Before we can get to the very tasty stuff of visualization we need to spend some time seeing where all the data comes from.
  • #8 Google Maps: the first app to get 1 billion users; collecting cell and Wi-Fi info. Waze: c. 50m downloads; collecting traffic data. I like to mention our competitors sometimes, give the guys a chance … Other possibilities: the Met Office is exploring using energy from grid-tied solar panels to measure sunlight hours. Fitbit. Withings. Connected devices, the internet of things. S4 has 3 more: hygrometer, ambient temperature, infra-red.
  • #9 Particularly important for sensor networks (or any collection of data in real time, or even any continuous data collection that’s not real time) is to have a way of watching over the stream of incoming data. If it gets too large you may have a bug in your client, your servers may fail, the users of the client might get angry, and pretty soon you’ll have a mob with pitchforks. On the other hand, something might happen that prevents data coming in, and then … you have nothing to play with.
  • #14 4G rollout: There are close to 100 countries that either have LTE or are planning it, generally with multiple networks. The data is simple, but it’s quite a lot to visualize. Our aim was very much to give an overview of LTE. We released this report under the name “The State of LTE”; we thought it was important that people understood a little bit about LTE and its global context.
  • #15 Some people do use pins; in fact another popular 4G rollout map does. So while this may seem like a total straw man, it’s not.
  • #17 We collected the speeds experienced by users, not the claims made by operators or the possibilities of the network technology being used.
  • #18 Read Tufte! People spend a lot of time learning Excel, and learning how to master pivot tables is indeed time well spent: you learn some valuable skills, some of which can be transferred to other forms of analysis. But … read Tufte, or encourage your employees to. Note the absence of chartjunk.
  • #19 Stack = toolbox for those without a web background 
  • #20 If you’re using Hadoop you’re already using open source. If the non-techy team members understand it, then the argument will be even stronger, and the tech guys will understand it faster. So the context of a visualization is not as important as you think: your aim should be to make it as clear as possible.