1
Charlottesville Open Data
Challenge
Team DSB
Matt Miller, Nikhil Shetty
2
OBSERVATIONS IN NUMBER OF CLIENTS DATA
• High variance from April to June may indicate special events (a holiday, festival, or other event on the Downtown Mall),
good weather drawing visitors to the Downtown Mall, and/or unexpected inclement weather forcing visitors
indoors and onto Wi-Fi
• No observable increasing or decreasing trend in overall time series; the slope of the plotted trendline is not
statistically significant
Monticello Wine Trail Festival
Tom Tom Founder’s Festival
Pride Festival
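The trend check described on this slide can be sketched in code. This is a minimal illustration, not necessarily the team's method: it fits an ordinary least-squares line of daily client counts against the day index and tests the slope with a large-sample normal approximation to the usual t-test. The function name `slope_test` and the synthetic series are hypothetical stand-ins for the real clients data.

```python
import math
from statistics import NormalDist, fmean


def slope_test(y):
    """Return (slope, p_value) for an OLS fit of y against its index.

    The two-sided p-value uses a normal approximation to the t-test,
    which is an assumption that is only reasonable for long series.
    """
    n = len(y)
    x = list(range(n))
    x_mean, y_mean = fmean(x), fmean(y)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Residual sum of squares and the standard error of the slope
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(sse / (n - 2) / sxx)
    z = slope / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return slope, p_value


# Synthetic, trend-free series standing in for daily client counts
flat_series = [100 + 20 * math.sin(i) for i in range(200)]
slope, p_value = slope_test(flat_series)
print(f"slope={slope:.4f}, p={p_value:.3f}")
```

A large p-value here means the data give no evidence of a non-zero slope, which is the sense in which the slide calls the trendline slope not statistically significant.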
3
NUMBER OF CLIENTS & WEATHER DATA
• The monthly trend in the number of clients
correlates with weather data: the number of
clients rises and falls with temperature
• April-August: High
• Sept-Oct: Medium
• Nov-Mar: Low
• Precipitation, observed at a daily level, does
not seem to have a consistent effect on the
number of clients. More granular, hourly data
may be more predictive
• The number of clients is highest in the
months of April to August, a time when most
UVa students are out of town. This suggests
that UVa students are not a large share of
Wi-Fi clients at the Downtown Mall
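One way to quantify the relationship between temperature and clients is a Pearson correlation coefficient over monthly aggregates. The sketch below is an assumption about how such a check could be done, not the team's code. The helper `pearson_r` is hypothetical, and the monthly values are placeholders, not figures from the deck.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Placeholder monthly values (Jan..Dec), for illustration only
avg_temp_f = [36, 39, 47, 57, 65, 73, 77, 76, 69, 58, 48, 39]
avg_clients = [210, 220, 260, 340, 380, 400, 410, 395, 330, 300, 240, 215]

print(f"r = {pearson_r(avg_temp_f, avg_clients):.2f}")
```

A value near +1 would support the claim that clients rise and fall with temperature. The same helper can be applied to daily precipitation to check whether that relationship is as weak as the slide suggests.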
4
STRONG WEEKLY SEASONALITY OBSERVED IN
THE CLIENTS DATA
• The number of clients exhibits strong weekly seasonality: it increases steadily through the week starting on
Sunday, peaks on Friday, and settles down at the end of the week
• Fridays are the most popular days on the Downtown Mall, particularly from April to September, during
“Fridays After Five”
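Weekly seasonality of this kind can be checked by averaging daily client counts by day of week. The following is a minimal sketch under stated assumptions: `mean_by_weekday` is a hypothetical helper, and the input is a synthetic four-week series rather than the real clients data.

```python
from collections import defaultdict
from datetime import date, timedelta

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def mean_by_weekday(daily_counts):
    """Average a {date: count} mapping by day of week."""
    buckets = defaultdict(list)
    for day, count in daily_counts.items():
        # date.weekday() returns 0 for Monday through 6 for Sunday
        buckets[WEEKDAYS[day.weekday()]].append(count)
    return {name: sum(values) / len(values) for name, values in buckets.items()}


# Synthetic series: four weeks with an artificial Friday peak
start = date(2017, 1, 2)
synthetic = {}
for offset in range(28):
    day = start + timedelta(days=offset)
    synthetic[day] = 50 if day.weekday() == 4 else 10

averages = mean_by_weekday(synthetic)
print(max(averages, key=averages.get))
```

Restricting the input to April through September before averaging would isolate the "Fridays After Five" season mentioned above.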
5
SESSIONS DATA CLOSELY FOLLOWS CLIENTS
DATA
• Number of sessions is highly correlated with number of clients
• The histogram of sessions per client follows a near-normal distribution, suggesting that no additional factors affect the
number of sessions beyond those captured in the number of clients
Note: The data for number of sessions is missing for the months of Jan and half of Feb.
Therefore the # sessions values in Jan & Feb are low.
6
OBSERVATIONS IN USAGE DATA
• Usage data is inconsistent with clients data. Usage is highest in Oct-Nov while clients are highest in Apr-Aug,
indicating that the drivers of usage differ from drivers of clients
• No global trend observed in usage data
• Downloads are roughly 85% of total data usage, with uploads comprising the remainder. This ratio shifts slightly
towards uploads on Friday, Saturday, and Sunday
7
NO WEEKLY SEASONALITY IN USAGE
• The number of clients is highest on Fridays and Saturdays, but data usage does not peak on those days. Thus,
weekend visitors drive up the number of clients but are light consumers of Wi-Fi data
• Therefore, clients can be broken down into two segments:
• Segment 1 – Weekend visitors, large in number but light users of data
• Segment 2 – Likely local residents/businesses, small in number but heavy users of data
8
DAILY SEASONALITY IN USAGE DATA
Total usage follows a daily seasonality, peaking between 10am and 6pm EST (9am to 5pm during daylight saving time) each
day. Since these are non-peak hours for visitors, this reinforces the hypothesis that local residents and/or
businesses (Segment 2) are the biggest consumers of Wi-Fi data
Note: The time on the x-axis is UTC time zone
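Because the chart's x-axis is in UTC, reading local hours off it requires a conversion. The sketch below shows one way to do that with fixed UTC offsets from the standard library. The helper `utc_hour_to_local` is hypothetical, and the hour passed in is a placeholder rather than a value read from the chart.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets: Eastern Standard Time is UTC-5, Eastern Daylight Time is UTC-4
EST = timezone(timedelta(hours=-5), "EST")
EDT = timezone(timedelta(hours=-4), "EDT")


def utc_hour_to_local(hour_utc, tz):
    """Convert an hour of day in UTC to the hour of day in the given zone."""
    # The calendar date is arbitrary; only the hour matters here
    stamp = datetime(2017, 1, 15, hour_utc, tzinfo=timezone.utc)
    return stamp.astimezone(tz).hour


# Placeholder hour, not taken from the chart
print(utc_hour_to_local(15, EST), utc_hour_to_local(15, EDT))
```

Fixed offsets keep the example free of any time zone database dependency. For timestamps spanning a full year, a zone-aware conversion that applies the daylight saving rules per date would be the more robust choice.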
9
PARKING TICKET DATA ACTS AS A PROXY
FOR DOWNTOWN MALL ACTIVITY
Heatmap of Parking Tickets Issued 2017
• Parking tickets are issued almost exclusively Mon-Fri
• The data set is publicly available through the
City of Charlottesville Open Data Portal
Figure: Parking Tickets by Hour of Day and Day of Week (x-axis: hour of day, 1-20; y-axis: tickets, 0-500; series: Mon, Tue, Wed, Thu, Fri)
10
COMPARISON OF WEEKLY SEASONALITY IS
INCONCLUSIVE
On a daily level, parking tickets
track more closely with data
usage than with sessions or
clients, but still the relationship
is weak
Note: Weekends excluded because very few parking tickets are issued on weekends
Figure: Average Parking Tickets Versus Wi-Fi Clients, Sessions, and Usage, by weekday (Mon-Fri). Axes: Data Usage (MB); Tickets, Clients, Sessions. Series: Tickets, Clients, Sessions (x10^-1), Data Usage
11
PARKING TICKETS SHOW A MEANINGFUL
CORRELATION TO DATA USAGE AT 4-HOUR
GRANULARITY
Note: Weekends excluded because very few parking tickets are issued on weekends
Figure: 4-Hour Data Usage vs Parking Tickets (x-axis: Parking Tickets; y-axis: Data Usage (B)). Linear fit: y = 9982.5x + 533600, R² = 0.0297
Figure: Log-Log Transform, 4-Hour Data Usage vs Parking Tickets (x-axis: LN(Parking Tickets + 1); y-axis: LN(Data Usage)). Linear fit: y = 0.3945x + 11.469, R² = 0.0362
• Parking tickets partially explain visitors to the Downtown Mall, and therefore data usage, although the fitted lines
account for only about 3-4% of the variance in 4-hour data usage (R² = 0.0297 linear, R² = 0.0362 log-log)
• If client and session data were available with 4-hour granularity, we could more rigorously test this claim
and tease out the relationship between tickets and data usage versus tickets and clients
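The two fits on this slide, a plain linear fit and a fit after the transform LN(Data Usage) against LN(Parking Tickets + 1), can be reproduced along the following lines. This is a sketch under stated assumptions rather than the team's actual code: `fit_line` and `log_log` are hypothetical helpers, and the sample values are placeholders, not the 4-hour data from the deck.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


def log_log(tickets, usage_bytes):
    """Apply ln(tickets + 1) and ln(usage); usage values must be positive."""
    return (
        [math.log(t + 1) for t in tickets],
        [math.log(u) for u in usage_bytes],
    )


# Placeholder 4-hour observations, for illustration only
tickets = [0, 3, 10, 25, 40, 8, 0, 15]
usage_bytes = [4.0e5, 9.0e5, 6.0e5, 1.2e6, 8.0e5, 2.0e6, 5.0e5, 7.0e5]

print(fit_line(tickets, usage_bytes))
print(fit_line(*log_log(tickets, usage_bytes)))
```

The `+ 1` inside the logarithm keeps 4-hour windows with zero tickets in the sample, since the logarithm of zero is undefined. R² is the share of variance in the dependent variable that the fitted line explains.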
12
13
NO OBSERVED SEASONALITY IN CLIENTS
ACROSS DAYS OF MONTH
14
BREAK-UP OF USAGE DATA

2018 Charlottesville Open Data Challenge - Team DSB
