Big Data Conference 2013:
Analytics and Applications for Federal Big Data

Data Tactics Corp: A Blended Approach to Big Data Analytics

Richard Heimann,
Data Scientist at Data Tactics Corporation

Data Tactics Analytics Practice
The Team:
(Nathan D., Shrayes R., David P., Adam VE., Geoffrey B., Rich H.)

Graduates from top universities...

Advanced degrees include: mathematics, computer science, astrophysics, electrical engineering, mechanical engineering, statistics, social sciences.

Base competencies (horizontals): clustering, association rules, regression, naive Bayesian classifier, decision trees, time-series, text analysis.
Going beyond the base (verticals)...
[Figure: vertical specialties drawn as rotated labels above the base competencies. The label text did not survive extraction; legible fragments suggest entries such as PCA, ICA, SVM, IRT, and DLISA.]

Horizontals & Verticals

Clustering || Regression || Decision Trees || Text Analysis

Association Rules || Naive Bayesian Classifier || Time Series Analysis
Data Tactics Analytics Practice
Hierarchy of Data Scientists
Why Analytics [Business]???
Why are analytics important? 

(Business, Analytics, Practical)


"We need to stop reinventing the cloud
and start using it!"
(Dave Boyd)
Why Analytics [Analytics]???
Why are analytics important? 

(Business, Analytics, Practical)
No Free Lunch (NFL): no algorithm performs better than
any other when their performance is averaged uniformly
over all possible problems of a particular type. Algorithms
must therefore be designed for a particular domain or style
of problem; there is no such thing as a general-purpose
algorithm.
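
The NFL point can be made concrete with a toy comparison. The sketch below is only an illustration under stated assumptions, not a proof of the theorem: the two "problems" (a linear signal and a flat signal), the two "algorithms" (least squares and a mean-only predictor), and all names are hypothetical and the data are simulated.

```r
# Toy illustration only: hypothetical learners and problems, simulated data.
set.seed(1)
x <- seq(0, 1, length.out = 20)

# Two "problems": the true signal is either linear in x or flat (no signal).
truth <- list(linear = 2 * x, flat = rep(0, length(x)))

# Two "algorithms": ordinary least squares and a constant (mean-only) predictor.
fit_lm   <- function(x, y) fitted(lm(y ~ x))
fit_mean <- function(x, y) rep(mean(y), length(y))
learners <- list(lm = fit_lm, mean = fit_mean)

# For each problem, draw one noisy sample, fit both learners to it,
# and score each against the noise-free truth (mean squared error).
err <- t(sapply(truth, function(f) {
  y <- f + rnorm(length(x), sd = 0.1)
  sapply(learners, function(learn) mean((learn(x, y) - f)^2))
}))

# Rows are problems, columns are learners.
print(round(err, 4))
```

Each learner is expected to come out ahead on the problem that matches its assumptions, which is the practical reading of NFL: the choice of algorithm has to follow the domain.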

Why Analytics [Practical]???
[Figure: axes N and t, marking Academic Publications Scale, Web Scales, and IC Scales]

If this guy doesn’t scale - none of us do.
algo to users > algo to data
Development | Deployment
Machine: Parallel, Distributed, M/R, HDFS, MPP, SOA, GPU
User: Objective, Subjective, Valid, Useful, Nontrivial, Novel, Accurate, Comprehensible
Shiny
Open-sourced by RStudio in November 2012

Not the first to wrap R in the browser, but perhaps the easiest for R developers

No need to know HTML, CSS, or JavaScript to get started

Reactive programming model

WebSockets for communication
server.R
library(shiny)

# Define server logic required to generate and plot a random
# distribution
shinyServer(function(input, output) {

  # Expression that generates a plot of the distribution.
  # renderPlot:
  #
  # 1: Is "reactive" and will therefore be automatically
  #    re-executed when inputs change.
  # 2: Its output type is a plot.

  output$distPlot <- renderPlot({

    # generate an rnorm distribution and plot it
    dist <- rnorm(input$obs)
    hist(dist)
  })
})
ui.R
library(shiny)

# Define UI for application that plots random distributions
shinyUI(pageWithSidebar(

  # Application title:
  headerPanel("My Shiny App!"),

  # Sidebar with a slider input for number of observations:
  sidebarPanel(
    sliderInput("obs",
                "Number of observations:",
                min = 0,
                max = 1000,
                value = 500)
  ),

  # Show a plot of the generated distribution:
  mainPanel(
    plotOutput("distPlot")
  )
))
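
For readers on a more recent Shiny release, the same app can be written as a single file. This is a minimal sketch rather than the presenter's code: `fluidPage()`, `sidebarLayout()`, `reactive()`, and `shinyApp()` are standard Shiny functions, while the object names `ui`, `server`, `draws`, and `app` are illustrative, and `min = 1` is an assumption made here so the histogram always has at least one draw.

```r
library(shiny)

# Hypothetical single-file equivalent of the server.R / ui.R pair.
ui <- fluidPage(
  titlePanel("My Shiny App!"),
  sidebarLayout(
    sidebarPanel(
      # Slider "knob" for the number of observations (min = 1 is an assumption).
      sliderInput("obs", "Number of observations:",
                  min = 1, max = 1000, value = 500)
    ),
    mainPanel(plotOutput("distPlot"))
  )
)

server <- function(input, output) {
  # reactive(): a cached expression that is re-evaluated when input$obs changes.
  draws <- reactive(rnorm(input$obs))

  # renderPlot() depends on draws(), so the plot redraws when the slider moves.
  output$distPlot <- renderPlot(hist(draws(), main = "Simulated normal draws"))
}

# Build the app object; call runApp(app) in an interactive session to launch it.
app <- shinyApp(ui = ui, server = server)
```

Splitting the random draw into its own `reactive()` is a design choice: any further outputs (a summary table, say) could reuse the same draws instead of regenerating them.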
ui.R
headerPanel()

sidebarPanel()

mainPanel()
server.R + ui.R = microscope
adjustable parameters (knobs): 0 < knobs < small k
knobs = lighting, varying objectives, focusing (fine and coarse)

knobs:

fine and coarse filtering:

geography

time

variable of interest

observations of interest

promote significant (objective) patterns

change model parameters
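
The knobs listed above map naturally onto function arguments that a Shiny input would feed. The sketch below is illustrative only: the data frame `obs`, its columns (`region`, `day`, `speed`, `volume`), and the helper `apply_knobs()` are all hypothetical, and the data are simulated.

```r
# Hypothetical data: column names and values are illustrative, not from the talk.
set.seed(42)
obs <- data.frame(
  region = rep(c("north", "south"), each = 50),
  day    = rep(1:50, times = 2),
  speed  = rnorm(100, mean = 50, sd = 5),
  volume = rpois(100, lambda = 200)
)

# Each argument is one "knob": geography, time, variable of interest,
# and a model parameter (polynomial degree of the time trend).
apply_knobs <- function(data, region, day_range, variable, degree = 1) {
  keep <- data$region == region &
    data$day >= day_range[1] & data$day <= day_range[2]
  subset_df <- data[keep, c("day", variable)]
  names(subset_df) <- c("day", "y")
  fit <- lm(y ~ poly(day, degree), data = subset_df)
  list(data = subset_df, fit = fit)
}

# Turn the knobs: northern region, days 10 to 30, speed, quadratic trend.
view <- apply_knobs(obs, region = "north", day_range = c(10, 30),
                    variable = "speed", degree = 2)
nrow(view$data)
```

Keeping the number of arguments small mirrors the slide's constraint that the knobs stay below some small k.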
BDE + Shiny
Overlapping Solutions
Multiple models allow more nuanced learning from data.

Convergent results serve as cross-validation.

Points of divergence provide additional insights and allow models to be calibrated further.

Different models can provide answers to different questions, or answers to the same question for different analysts.

Multi-method approaches suit diverse teams with mutable missions.

smooth + rough = data

New paradigm in which the question "Are there multiple, overlapping ways to solve this problem?" dominates.

[Figure: Latent Spatial Traffic Patterns (panels 1, 2, 3)]
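
One way to picture convergence and divergence between two analytics is to fit two different models to the same data and compare their fitted values. This is a sketch under assumptions, not the method used for the traffic example: the data are simulated, and the pairing of a linear model with a loess smoother (and the names `fit_a`, `fit_b`) is chosen purely for illustration.

```r
# Simulated data, illustrative only: linear trend plus a periodic component.
set.seed(7)
x <- seq(0, 10, length.out = 200)
y <- 1 + 0.5 * x + sin(x) + rnorm(200, sd = 0.3)

fit_a <- lm(y ~ x)                 # "Analytic A": global linear trend
fit_b <- loess(y ~ x, span = 0.3)  # "Analytic B": local smoother

smooth_a <- unname(fitted(fit_a))
smooth_b <- unname(predict(fit_b))

# smooth + rough = data: fitted values plus residuals recover the observations.
rough_a <- unname(residuals(fit_a))
recovered <- isTRUE(all.equal(smooth_a + rough_a, y))

# Convergence: how strongly the two smooths agree overall.
agreement <- cor(smooth_a, smooth_b)

# Divergence: where the two analytics disagree most, a candidate for a closer look.
divergence <- abs(smooth_a - smooth_b)
x[which.max(divergence)]
```

High agreement plays the role of the cross-check, while the location of the largest divergence points to structure that only one of the two models captures.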
Overlapping Solutions
Are there multiple, overlapping ways to solve this problem?

[Figure: Venn diagram of Analytic A, Analytic B, and Analytic C, with overlapping regions A+B, B+C, A+C, and A+B+C]
Summary:

# our blended approach
dt.philosophy <- lm(analytics ~ bigdata + smalldata + objective +
                    subjective:overlapping.solutions,
                    data = data)
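
The summary formula is tongue-in-cheek, but it is valid R. The sketch below makes it runnable on simulated data, purely to show how the terms are read: the data frame and its coefficients are invented for illustration, and the point is that `subjective:overlapping.solutions` contributes only the interaction term, with no main effects for either variable.

```r
# Simulated stand-in data; values and coefficients are arbitrary.
set.seed(2013)
n <- 100
data <- data.frame(
  bigdata = rnorm(n),
  smalldata = rnorm(n),
  objective = rnorm(n),
  subjective = rnorm(n),
  overlapping.solutions = rnorm(n)
)
data$analytics <- with(data, 1 + bigdata + smalldata + objective +
                         0.5 * subjective * overlapping.solutions) +
  rnorm(n, sd = 0.1)

# Same formula as on the slide.
dt.philosophy <- lm(analytics ~ bigdata + smalldata + objective +
                      subjective:overlapping.solutions,
                    data = data)

# Intercept, three main effects, and one interaction term.
names(coef(dt.philosophy))
```

Read this way, the formula says that the subjective side of analysis enters only in combination with overlapping solutions, never on its own.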
Overlapping Solutions
Data Science for Government (DS4G)
About (DS4G):

1: Improve on definitions of analytics.
2: Outline optimal interactions with Data Scientists.
3: Provide a life-cycle for Data Science.
4: Most importantly, share a taxonomy to identify analytical questions one could ask of data (Causal Effects, Classification, Outlier Detection, Big Data and Analytics, Measurement Models, & Text Analysis)

Presented by Data Tactics Analytics Team
Location: TBD
Time: 1Q 2014
Duration: ~5 hrs.
Cost: FREE
Audience: Government managers and Data Tactics partners with their customers.
LUBAP goes wild!
421 attending!

http://www.meetup.com/Data-Science-DC/events/146953142/
Thank you...	

Questions?
Homepage: http://www.data-tactics.com
Blog: http://datatactics.blogspot.com
Twitter: @DataTactics
Slideshare: http://www.slideshare.net/DataTactics/presentations
Or, me (Rich Heimann): rheimann@data-tactics-corp.com
