Data Science is More Than Just Statistics
Please visit us at stand N310 opposite!
Data science = Statistics?
2 MoreThanJustStats.nb
Data science = Computation with data
Computation ⊃ {Statistics, modeling, visualization, machine learning,
signal processing, geometry, image processing, maths, semantics, networks,
queues, geodesy, random processes, audio, survival analysis, …}
Use computation
– to find things to count
Example: Text
Out[ ]=
Word frequency in Lord of the Flies
In[ ]:= MaximalBy[TextSentences[lotf], Classify["Sentiment", #, {"Probability", "Positive"}] &]
Out[ ]= {We are going to have fun on this island!}
Example: Images
In[1]:= image = [input image not preserved in transcript]
In[2]:= TabView[{
  "Blobs" → (i = Image[Rasterize[Graphics[{Disk[], Disk[{0.7, 0}, 1]}]]]),
  "Distances" → (i2 = ImageAdjust@DistanceTransform[ColorNegate@i]),
  "Maxima" → Dilation[MaxDetect[i2], 2]
}]
Out[2]= [Tabbed view: Blobs, Distances, Maxima]
In[4]:= Showimage, GraphicsText[Style["×", FontColor → White], #] & /@
idata = DeleteDuplicates[Last /@ ComponentMeasurements[
MaxDetect[DistanceTransform[DistanceTransform[DeleteSmallComponents[
ColorNegate[DeleteSmallComponents[Binarize[image]]], 100]]]],
"Centroid"], EuclideanDistance[##] < 6 &]
Out[4]= [Figure: input image with each detected centroid marked ×]
In[5]:= SmoothHistogram3D[idata, BoxRatios → {3, 2, 1}]
Out[5]= [Figure: smooth 3D histogram of idata]
Use computation
– to inject context
Example: London bikes
In[15]:= currentBikeData = Import["http://api.citybik.es/barclays-cycle-hire.json"];
In[16]:= currentBikeData[[1]]
Out[16]= {bikes → 10, name → 000989 - Murray Grove , Hoxton, idx → 0, lat → 51530890,
timestamp → 2018-04-25T12:11:42.632000Z, lng → -89782, id → 0, free → 17, number → 63}
In[13]:= Dataset[Association /@ currentBikeData]
Out[13]=
bikes | name | idx
10 | 000989 - Murray Grove , Hoxton | 0
8 | 200069 - Knaresborough Place, Earl's Court | 1
0 | 300057 - Westbourne Park Road, Portobello | 2
29 | 000981 - British Museum, Bloomsbury | 3
8 | 001083 - Commercial Street, Shoreditch | 4
8 | 001027 - Warwick Avenue Station, Maida Vale | 5
25 | 000971 - Godliman Street, St. Paul's | 6
27 | 000974 - Guilford Street , Bloomsbury | 7
6 | 001060 - Torrens Street, Angel | 8
9 | 001038 - Harrington Square 1, Camden Town | 9
1 | 001070 - Bricklayers Arms, Borough | 10
12 | 001047 - Falkirk Street, Hoxton | 11
0 | 001041 - Westbourne Grove, Bayswater | 12
18 | 001042 - Woodstock Street, Mayfair | 13
14 | 001049 - Finsbury Leisure Centre, St. Luke's | 14
4 | 001037 - Park Lane , Hyde Park | 15
15 | 001050 - Park Road (Baker Street), The Regent's Park | 16
7 | 000973 - Bethnal Green Road, Shoreditch | 17
17 | 001053 - Clerkenwell Green, Clerkenwell | 18
9 | 001078 - Lambeth Road, Vauxhall | 19
showing 1–20 of 784
In[14]:= Legended[
  GeoGraphics[{AbsolutePointSize[10], {ColorData["DarkRainbow"][
      #1[[2]]/(0.001 + #1[[2]] + #1[[3]])], Point[#[[1]]]} & /@
    Quiet[{GeoPosition[{"lat", "lng"}/1000000.], "bikes", "free"} /.
      currentBikeData]}, PlotLabel → "Availability of bicycles in London",
   ImageSize → 600], BarLegend[{"DarkRainbow", {0, 35}}]]
Out[14]= [Map: Availability of bicycles in London]
Example: Accidents
[Tabbed view: Data, Local density, Local points, All density]
[Plot: Relative Danger Near Fitch Learning]
[Plot: Accidents By Time Of Day]
Use computation
– to change the viewpoint
Example: Supersonic car
[Plot: recorded data against time]
Apply some calculus
[Plot: the same data after applying calculus]
Apply some signal processing
[Plot: Load on front left wheel]
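The two steps named in this example, calculus and signal processing, can be sketched on synthetic data. Everything here is an assumption for illustration: the car's recorded data is not in the transcript, so `signal` is a made-up noisy sine wave, and the low-pass filter and finite-difference derivative stand in for whatever processing was applied in the talk.

```wolfram
(* Hypothetical data: a noisy sine wave sampled every dt seconds *)
SeedRandom[2];
dt = 0.01;
t = Range[0, 10, dt];
signal = Sin[t] + 0.05 RandomVariate[NormalDistribution[], Length[t]];

(* Signal processing: low-pass filter to suppress high-frequency noise.
   The cutoff 0.1 is in radians per sample and was picked by eye. *)
smoothed = LowpassFilter[signal, 0.1];

(* Calculus: finite-difference estimate of the rate of change *)
rate = Differences[smoothed]/dt;

ListLinePlot[{signal, smoothed}, PlotLegends -> {"raw", "low-pass filtered"}]
ListLinePlot[rate, PlotLabel -> "Rate of change"]
```

Filtering before differentiating matters in this sketch because differencing amplifies sample-to-sample noise.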
Use computation
– to inject a new viewpoint
Example: Finance
Correlation Threshold: 0.1 | 0.3 | 0.45 | 0.5 | 0.55
Portfolio Correlation
ATHX PFBX NKTR PRGX THLD REGN IDXX WBMD TSRA CSII
ATHX 1. -0.155746 0.192554 0.0346706 0.199476 0.215293 0.055311 0.0598668 0.0667703 0.256818
PFBX -0.155746 1. 0.0551318 -0.0238231 0.0864465 -0.0371775 0.0372411 0.0297157 0.0501491 -0.0263306
NKTR 0.192554 0.0551318 1. 0.173956 0.408224 0.430739 0.252799 0.244477 0.240393 0.340634
PRGX 0.0346706 -0.0238231 0.173956 1. 0.107611 0.152259 0.187019 0.147743 0.224447 0.233454
THLD 0.199476 0.0864465 0.408224 0.107611 1. 0.340108 0.194287 0.231985 0.209756 0.280469
REGN 0.215293 -0.0371775 0.430739 0.152259 0.340108 1. 0.328254 0.21232 0.195491 0.282689
IDXX 0.055311 0.0372411 0.252799 0.187019 0.194287 0.328254 1. 0.21433 0.186771 0.287849
WBMD 0.0598668 0.0297157 0.244477 0.147743 0.231985 0.21232 0.21433 1. 0.203065 0.202096
TSRA 0.0667703 0.0501491 0.240393 0.224447 0.209756 0.195491 0.186771 0.203065 1. 0.252917
CSII 0.256818 -0.0263306 0.340634 0.233454 0.280469 0.282689 0.287849 0.202096 0.252917 1.
CFBK -0.0202622 0.047074 0.045182 -0.0116863 -0.00255542 -0.000900992 -0.0169211 -0.00205736 0.0692481 -0.0932367
SHOO 0.0666881 0.0877096 0.137083 0.246094 0.0914212 0.17177 0.172684 0.146134 0.156682 0.224934
NCMI -0.00881864 0.0229376 0.0943068 0.26504 0.117862 -0.0183203 0.120724 0.118175 0.205415 0.199415
BMRN 0.210411 0.11726 0.427514 0.222912 0.377895 0.555568 0.42669 0.2383 0.222794 0.395739
MGLN 0.0861668 0.0244271 0.269405 0.228394 0.217646 0.253427 0.435732 0.241649 0.322608 0.290727
CRDC -0.0246006 0.0999991 0.0603159 0.09501 0.00876257 -0.0257783 0.00368514 0.0795676 0.00799325 -0.0202939
SURG 0.0793505 -0.0290028 0.0249819 -0.0184576 0.0949218 0.120998 0.0330482 0.165451 0.0105265 0.128889
HFFC 0.0159068 0.0453208 0.056407 -0.133485 0.0857123 0.0728086 0.0179226 -0.0582327 0.0575192 0.0567152
LTRX 0.160061 0.0175897 0.11774 0.0961482 0.22694 0.0688794 -0.0311349 0.197049 0.211728 0.134419
MFNC -0.102578 0.00566642 0.070634 0.0481224 0.0240564 0.0164561 -0.0230522 -0.0500249 0.0625739 0.0602153
SHLO 0.137258 0.00834566 0.259109 0.255858 0.22768 0.170004 0.184693 0.262615 0.283098 0.19992
LABC 0.0582621 -0.0770103 0.0336639 -0.0219564 -0.0555295 0.0168567 -0.035154 -0.000979898 -0.0312983 -0.0374193
SNFCA 0.106298 -0.0796818 0.0353886 -0.0379301 0.102008 0.0133965 -0.0160458 0.064443 0.00624401 0.0744798
ASTI 0.0296936 -0.10231 0.0579198 0.130839 0.170478 0.159669 0.0242666 0.060238 0.103079 0.0802336
STRM 0.103553 -0.027044 0.015168 0.0813282 0.0710839 0.0916784 0.11691 0.0622404 0.10743 0.03938
RLOC 0.0673226 0.12551 0.174601 0.0897312 0.204838 0.117319 0.115966 0.214765 0.200785 0.178144
HSKA 0.0696611 0.0310147 0.0391807 0.0775681 0.140841 0.185131 -0.0176538 0.0500578 0.0610371 0.10533
NWFL -0.145537 0.12346 -0.0149798 -0.0737182 -0.0392499 -0.0887732 0.0110006 0.00857444 -0.0703103 -0.0690263
IFON 0.173712 0.0506925 0.112228 0.118179 0.168008 0.183346 0.146735 0.103064 0.0700051 0.1546
GENC -0.00454872 0.111557 0.0929517 0.118871 0.0893299 0.0951799 0.0683046 0.162016 -0.0174889 0.0333547
VSCI 0.0934438 -0.122724 -0.0337841 -0.11932 -0.0100928 -0.0106619 0.0262221 0.0221221 -0.0462289 0.00243774
HNSN 0.0728787 -0.0202281 0.0217612 0.0479466 0.039112 0.0492618 0.0121075 -0.00590073 0.0782787 0.177993
AMGN 0.144263 0.0524144 0.434697 0.164497 0.363856 0.567489 0.345919 0.270955 0.339446 0.383243
TOPS 0.0165096 0.0629182 0.0856118 -0.0616743 0.0295842 -0.006062 -0.0335838 0.143903 0.0738433 -0.0262043
UTSI 0.0574989 0.0159024 0.0864641 0.0859226 0.117787 0.0371759 -0.0538848 0.137577 0.0161543 0.0243469
INFN 0.136798 -0.172368 0.304939 0.257152 0.275845 0.267789 0.20102 0.283399 0.254023 0.283986
SAAS 0.100494 0.0334147 0.284865 0.292893 0.343925 0.250811 0.263846 0.298446 0.235286 0.290863
KRNY 0.00778643 0.0275327 0.261778 0.327411 0.227934 0.16828 0.237698 0.192625 0.336821 0.250728
ACUR 0.0599088 0.0654299 -0.0287304 0.0594885 0.0324952 0.0809222 0.0524753 0.065247 -0.0453231 0.107133
BERK -0.0263423 0.0857371 -0.0267503 -0.0363498 0.0123773 0.0174475 0.0648471 -0.0242965 0.079019 -0.045243
QCCO -0.036335 -0.0247951 0.125918 0.0854495 0.0777864 0.0406812 0.113156 0.0767831 0.0439639 0.0665833
LINC 0.109821 0.0471629 0.0797274 0.0278627 0.0350469 0.0664748 0.0578671 0.00775897 0.11751 0.0977395
KOPN 0.185468 0.00644997 0.398103 0.277598 0.350097 0.290742 0.260754 0.312968 0.341296 0.394899
ICCC 0.152975 -0.14512 0.00441828 0.0278111 0.0303676 -0.0192398 -0.0681873 0.0840796 0.0132102 0.0659621
[Tabbed view: Data, Visualized, Graph, Communities]
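One way to go from a correlation matrix to graph communities is to link every pair of assets whose correlation exceeds a threshold and then cluster the resulting graph. The sketch below assumes that approach; it is not taken from the talk. The tickers and returns are synthetic (`AAA` to `FFF` are hypothetical names), and the threshold 0.45 is just one of the values offered on the slide.

```wolfram
(* Hypothetical data: 200 days of returns for six made-up tickers,
   built so that the first three and the last three move together *)
SeedRandom[1];
tickers = {"AAA", "BBB", "CCC", "DDD", "EEE", "FFF"};
common1 = RandomVariate[NormalDistribution[], 200];
common2 = RandomVariate[NormalDistribution[], 200];
returns = Transpose[Join[
    Table[common1 + 0.5 RandomVariate[NormalDistribution[], 200], 3],
    Table[common2 + 0.5 RandomVariate[NormalDistribution[], 200], 3]]];

(* Pairwise correlation matrix, one row and one column per ticker *)
corr = Correlation[returns];

(* Link two tickers when |correlation| exceeds the threshold; no self-loops *)
threshold = 0.45;
adjacency = Table[
   If[i != j && Abs[corr[[i, j]]] > threshold, 1, 0],
   {i, Length[tickers]}, {j, Length[tickers]}];
g = AdjacencyGraph[tickers, adjacency, VertexLabels -> "Name"];

(* Group the linked tickers into communities and draw them *)
FindGraphCommunities[g]
CommunityGraphPlot[g]
```

Raising the threshold removes weaker links, so communities split apart; lowering it merges them.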
Use computation
– to separate signal from noise
[Audio: 4.91 s | 8192]
[Tabbed view: Sound, Frequency shifts, Model, Result]
So why is most data science just counting?
Computation can be difficult
You have to know what’s possible
The role of automation
– making computation easier
Example: Titanic
In[ ]:= Short[tData = ExampleData[{"MachineLearning", "Titanic"}, "Data"]]
Out[ ]//Short= {{1st, 29., female} → survived,
{1st, 0.9167, male} → survived, 1306, {3rd, 29., male} → died}
In[ ]:= titanicSurvival = Classify[tData]
Out[ ]= ClassifierFunction
Input type: {Nominal, Numerical, Nominal}
Classes: died, survived

In[ ]:= titanicSurvival[{"1st", 46, "male"}]
Out[ ]= died
In[ ]:= Plot[titanicSurvival[{"1st", age, "male"}, {"Probability", "died"}], {age, 0, 60}]
Out[ ]=
10 20 30 40 50 60
0.4
0.5
0.6
0.7
MoreThanJustStats.nb 21
Example: Day & night
In[ ]:= daynight = Classify
 → "Night", → "Day", → "Night", → "Night", → "Day",
→ "Night", → "Day", → "Day", → "Night", → "Night",
→ "Day", → "Night", → "Night", → "Day", → "Night",
→ "Night", → "Day", → "Day", → "Day", → "Day",
→ "Night", → "Night", → "Day", → "Night", → "Night",
→ "Day", → "Day", → "Day", → "Night", → "Day"
Out[ ]= ClassifierFunction
Input type: Image
Classes: Day, Night

In[ ]:= daynight , , ,
, , 
Out[ ]= {Day, Night, Day, Night, Night, Night}
The role of automation
– automating insights
Example: Image identification
Example: Supervising the computer
[Interactive demo with controls: Data = 0, Reset, Capture (Rock, Paper, Scissors), Watch, Stop, Train]
Example: No supervision - “Hands off the wheel”
Dogs
In[ ]:= Dataset[dogs]
Out[ ]=
24 MoreThanJustStats.nb
In[ ]:= FeatureSpacePlot[Take[dogs, 60], LabelingSize → 70]
Out[ ]=
In[ ]:= nearestDog = FeatureNearest[dogs]
Out[ ]= NearestFunction
Input type: Image
Output property: Element
Unable to store data in notebook.

In[ ]:= Grid[{testDogs, First /@ nearestDog[testDogs]}]
Out[ ]=
MoreThanJustStats.nb 25
Financial Assets
[Figure labels: AMEX:MSN, MSADX, MSAIX, MSBNX, MSBYX, MSCCUX, MSCDX, MSCFX,
MSDIX, MSDVX, MSEDX, MSEEX, MSEGX, MSELX, MSENX, MSEPX, MSFAX, MSFIX, MSFYX,
MSGFX, MSGIX, MSGOX, MSGVX, MSHEX, MSHYX, MSIAX, MSIDX, MSIIX, MSIJX, MSIRX,
MSIZX, MSJLX, MSMBX, MSMUX, MSNEOX, MSOCX, MSPDX, MSPMX, MSPRX, MSRBX, MSSAX,
MSSFX, MSSIX, MSTCX, MSTIX, MSTLX, MSTZX, MSUAX, MSUGPX, MSULX, MSVCX, MSVNX,
NASDAQ:MSEX, NASDAQ:MSON, NYSE:MSB, NYSE:MSPE, NYSE:MSPF, NYSE:MSPG, NYSE:MSPI,
NYSE:MS-PK]
The role of automation
– after the computation
Deployment
Firewall
App deployment
bikeApp = DynamicModule[
  {url = "http://api.citybik.es/barclays-cycle-hire.json", city = "London"},
  Column[{ActionMenu["Choose city", SortBy[("city" ⧴ (city = "city";
        url = "url")) /.
      Import["http://api.citybik.es/networks.json", "JSON"], First]],
    Dynamic[Dataset[Association /@ Import[url, "JSON"]][
      Legended[GeoGraphics[#, ImageSize → 600,
        PlotLabel → "Availability of bicycles in " <> city],
       BarLegend[{"DarkRainbow", {0, 100}}]] &,
      {AbsolutePointSize[10], ColorData["DarkRainbow"][#bikes/(#bikes + #free)],
       Point[GeoPosition[{#lat, #lng}/1000000.]]} &],
     SynchronousUpdating → False, TrackedSymbols ⧴ {url, city}]}]];
CloudDeploy[bikeApp, Permissions → "Public"]
API deployment
In[ ]:= CloudDeploy[
APIFunction[{"class" → "String", "age" → "Number", "sex" → "String"},
Function[titanicSurvival[{#class, #age, #sex}]],
AllowedCloudExtraParameters → All],
"TitanicPredictor",
Permissions → "Public"
]
Out[ ]= CloudObjecthttps://www.wolframcloud.com/objects/jonm/TitanicPredictor
In[ ]:= EmbedCode[%, "Java"]
Out[ ]=
Embeddable Code
Use the code below to call the Wolfram Cloud function from Java:
Code
Copy to Clipboard
        // ... (the start of the generated class is cut off on the slide)
        if (_conn.getResponseCode() != 200) {
            throw new IOException(_conn.getResponseMessage());
        }
        BufferedReader _rdr = new BufferedReader(
            new InputStreamReader(_conn.getInputStream()));
        StringBuilder _sb = new StringBuilder();
        String _line;
        while ((_line = _rdr.readLine()) != null) {
            _sb.append(_line);
        }
        _rdr.close();
        _conn.disconnect();
        return _sb.toString();
    }
}
Breaking the Boundaries of Traditional Data Science
The toolset is HUGE - use more of it
Automation makes the toolset accessible
The human’s role is to ask deeper questions of the data
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architectures
 

Data Science Is More Than Just Statistics

  • 1. Data Science is More Than Just Statistics Please visit us at stand N310 opposite!
  • 2. Data science = Statistics? 2 MoreThanJustStats.nb
  • 3. Data science = Computation with data
  • 4. Computation ⊃ {Statistics, modeling, visualization, machine learning, signal processing, geometry, image processing, maths, semantics, networks, queues, geodesy, random processes, audio, survival analysis, …}
  • 5. Use computation – to find things to count
      Example: Text
      [Figure: word frequency in Lord of the Flies]
      In[ ]:= MaximalBy[TextSentences[lotf], Classify["Sentiment", #, {"Probability", "Positive"}] &]
      Out[ ]= {We are going to have fun on this island!}
  • 6. Out[ ]= Example: Images In[1]:= image = 6 MoreThanJustStats.nb
  • 7. In[2]:= TabView[{
        "Blobs" → (i = Image[Rasterize[Graphics[{Disk[], Disk[{0.7, 0}, 1]}]]]),
        "Distances" → (i2 = ImageAdjust@DistanceTransform[ColorNegate@i]),
        "Maxima" → Dilation[MaxDetect[i2], 2]}]
      Out[2]= [Tab view: Blobs, Distances, Maxima]
      In[4]:= Show[image, Graphics[Text[Style["×", FontColor → White], #] & /@
        (idata = DeleteDuplicates[Last /@ ComponentMeasurements[
          MaxDetect[DistanceTransform[DistanceTransform[DeleteSmallComponents[
            ColorNegate[DeleteSmallComponents[Binarize[image]]], 100]]]],
          "Centroid"], EuclideanDistance[##] < 6 &])]]
      Out[4]= [Image with each detected centroid marked "×"]
  • 8. In[5]:= SmoothHistogram3D[idata, BoxRatios → {3, 2, 1}]
      Out[5]= [3D smooth histogram]
  • 9. Use computation – to inject context
      Example: London bikes
      In[15]:= currentBikeData = Import["http://api.citybik.es/barclays-cycle-hire.json"];
      In[16]:= currentBikeData[[1]]
      Out[16]= {bikes → 10, name → 000989 - Murray Grove , Hoxton, idx → 0, lat → 51530890,
        timestamp → 2018-04-25T12:11:42.632000Z, lng → -89782, id → 0, free → 17, number → 63}
      In[13]:= Dataset[Association /@ currentBikeData]
      Out[13]=
        bikes  name                                                  idx
        10     000989 - Murray Grove , Hoxton                        0
        8      200069 - Knaresborough Place, Earl's Court            1
        0      300057 - Westbourne Park Road, Portobello             2
        29     000981 - British Museum, Bloomsbury                   3
        8      001083 - Commercial Street, Shoreditch                4
        8      001027 - Warwick Avenue Station, Maida Vale           5
        25     000971 - Godliman Street, St. Paul's                  6
        27     000974 - Guilford Street , Bloomsbury                 7
        6      001060 - Torrens Street, Angel                        8
        9      001038 - Harrington Square 1, Camden Town             9
        1      001070 - Bricklayers Arms, Borough                    10
        12     001047 - Falkirk Street, Hoxton                       11
        0      001041 - Westbourne Grove, Bayswater                  12
        18     001042 - Woodstock Street, Mayfair                    13
        14     001049 - Finsbury Leisure Centre, St. Luke's          14
        4      001037 - Park Lane , Hyde Park                        15
        15     001050 - Park Road (Baker Street), The Regent's Park  16
        7      000973 - Bethnal Green Road, Shoreditch               17
        17     001053 - Clerkenwell Green, Clerkenwell               18
        9      001078 - Lambeth Road, Vauxhall                       19
        (showing 1–20 of 784)
  • 10. In[14]:= Legended[
        GeoGraphics[{AbsolutePointSize[10],
          {ColorData["DarkRainbow"][#1[[2]]/(0.001 + #1[[2]] + #1[[3]])], Point[#[[1]]]} & /@
            Quiet[{GeoPosition[{"lat", "lng"}/1000000.], "bikes", "free"} /. currentBikeData]},
          PlotLabel → "Availability of bicycles in London", ImageSize → 600],
        BarLegend[{"DarkRainbow", {0, 35}}]]
      Out[14]= [Map: Availability of bicycles in London]
      Example: Accidents
      [Tabs: Data, Local density, Local points, All density]
  • 12. [Plots: Relative Danger Near Fitch Learning; Accidents By Time Of Day]
  • 13. Use computation – to change the viewpoint
      Example: Supersonic car
      [Plot: data series against time]
  • 14. Apply some calculus [plot]
      Apply some signal processing [plot]
      Load on front left wheel
  • 15. Use computation – to inject a new viewpoint
      Example: Finance
      [Control: Correlation Threshold, ticks 0.1, 0.3, 0.45, 0.5, 0.55]
      Portfolio Correlation
              ATHX  PFBX  NKTR  PRGX  THLD  REGN  IDXX  WBMD  TSRA  CSII
        ATHX  1.  -0.155746  0.192554  0.0346706  0.199476  0.215293  0.055311  0.0598668  0.0667703  0.256818
        PFBX  -0.155746  1.  0.0551318  -0.0238231  0.0864465  -0.0371775  0.0372411  0.0297157  0.0501491  -0.0263306
        NKTR  0.192554  0.0551318  1.  0.173956  0.408224  0.430739  0.252799  0.244477  0.240393  0.340634
        PRGX  0.0346706  -0.0238231  0.173956  1.  0.107611  0.152259  0.187019  0.147743  0.224447  0.233454
        THLD  0.199476  0.0864465  0.408224  0.107611  1.  0.340108  0.194287  0.231985  0.209756  0.280469
        REGN  0.215293  -0.0371775  0.430739  0.152259  0.340108  1.  0.328254  0.21232  0.195491  0.282689
        IDXX  0.055311  0.0372411  0.252799  0.187019  0.194287  0.328254  1.  0.21433  0.186771  0.287849
        WBMD  0.0598668  0.0297157  0.244477  0.147743  0.231985  0.21232  0.21433  1.  0.203065  0.202096
        TSRA  0.0667703  0.0501491  0.240393  0.224447  0.209756  0.195491  0.186771  0.203065  1.  0.252917
        CSII  0.256818  -0.0263306  0.340634  0.233454  0.280469  0.282689  0.287849  0.202096  0.252917  1.
        CFBK  -0.0202622  0.047074  0.045182  -0.0116863  -0.00255542  -0.000900992  -0.0169211  -0.00205736  0.0692481  -0.0932367
        SHOO  0.0666881  0.0877096  0.137083  0.246094  0.0914212  0.17177  0.172684  0.146134  0.156682  0.224934
        NCMI  -0.00881864  0.0229376  0.0943068  0.26504  0.117862  -0.0183203  0.120724  0.118175  0.205415  0.199415
        BMRN  0.210411  0.11726  0.427514  0.222912  0.377895  0.555568  0.42669  0.2383  0.222794  0.395739
        MGLN  0.0861668  0.0244271  0.269405  0.228394  0.217646  0.253427  0.435732  0.241649  0.322608  0.290727
        CRDC  -0.0246006  0.0999991  0.0603159  0.09501  0.00876257  -0.0257783  0.00368514  0.0795676  0.00799325  -0.0202939
        SURG  0.0793505  -0.0290028  0.0249819  -0.0184576  0.0949218  0.120998  0.0330482  0.165451  0.0105265  0.128889
        HFFC  0.0159068  0.0453208  0.056407  -0.133485  0.0857123  0.0728086  0.0179226  -0.0582327  0.0575192  0.0567152
        LTRX  0.160061  0.0175897  0.11774  0.0961482  0.22694  0.0688794  -0.0311349  0.197049  0.211728  0.134419
        MFNC  -0.102578  0.00566642  0.070634  0.0481224  0.0240564  0.0164561  -0.0230522  -0.0500249  0.0625739  0.0602153
        SHLO  0.137258  0.00834566  0.259109  0.255858  0.22768  0.170004  0.184693  0.262615  0.283098  0.19992
        LABC  0.0582621  -0.0770103  0.0336639  -0.0219564  -0.0555295  0.0168567  -0.035154  -0.000979898  -0.0312983  -0.0374193
        SNFCA  0.106298  -0.0796818  0.0353886  -0.0379301  0.102008  0.0133965  -0.0160458  0.064443  0.00624401  0.0744798
        ASTI  0.0296936  -0.10231  0.0579198  0.130839  0.170478  0.159669  0.0242666  0.060238  0.103079  0.0802336
        STRM  0.103553  -0.027044  0.015168  0.0813282  0.0710839  0.0916784  0.11691  0.0622404  0.10743  0.03938
        RLOC  0.0673226  0.12551  0.174601  0.0897312  0.204838  0.117319  0.115966  0.214765  0.200785  0.178144
        HSKA  0.0696611  0.0310147  0.0391807  0.0775681  0.140841  0.185131  -0.0176538  0.0500578  0.0610371  0.10533
        NWFL  -0.145537  0.12346  -0.0149798  -0.0737182  -0.0392499  -0.0887732  0.0110006  0.00857444  -0.0703103  -0.0690263
        IFON  0.173712  0.0506925  0.112228  0.118179  0.168008  0.183346  0.146735  0.103064  0.0700051  0.1546
        GENC  -0.00454872  0.111557  0.0929517  0.118871  0.0893299  0.0951799  0.0683046  0.162016  -0.0174889  0.0333547
        VSCI  0.0934438  -0.122724  -0.0337841  -0.11932  -0.0100928  -0.0106619  0.0262221  0.0221221  -0.0462289  0.00243774
        HNSN  0.0728787  -0.0202281  0.0217612  0.0479466  0.039112  0.0492618  0.0121075  -0.00590073  0.0782787  0.177993
        AMGN  0.144263  0.0524144  0.434697  0.164497  0.363856  0.567489  0.345919  0.270955  0.339446  0.383243
        TOPS  0.0165096  0.0629182  0.0856118  -0.0616743  0.0295842  -0.006062  -0.0335838  0.143903  0.0738433  -0.0262043
        UTSI  0.0574989  0.0159024  0.0864641  0.0859226  0.117787  0.0371759  -0.0538848  0.137577  0.0161543  0.0243469
        INFN  0.136798  -0.172368  0.304939  0.257152  0.275845  0.267789  0.20102  0.283399  0.254023  0.283986
        SAAS  0.100494  0.0334147  0.284865  0.292893  0.343925  0.250811  0.263846  0.298446  0.235286  0.290863
        KRNY  0.00778643  0.0275327  0.261778  0.327411  0.227934  0.16828  0.237698  0.192625  0.336821  0.250728
        ACUR  0.0599088  0.0654299  -0.0287304  0.0594885  0.0324952  0.0809222  0.0524753  0.065247  -0.0453231  0.107133
        BERK  -0.0263423  0.0857371  -0.0267503  -0.0363498  0.0123773  0.0174475  0.0648471  -0.0242965  0.079019  -0.045243
        QCCO  -0.036335  -0.0247951  0.125918  0.0854495  0.0777864  0.0406812  0.113156  0.0767831  0.0439639  0.0665833
        LINC  0.109821  0.0471629  0.0797274  0.0278627  0.0350469  0.0664748  0.0578671  0.00775897  0.11751  0.0977395
        KOPN  0.185468  0.00644997  0.398103  0.277598  0.350097  0.290742  0.260754  0.312968  0.341296  0.394899
        ICCC  0.152975  -0.14512  0.00441828  0.0278111  0.0303676  -0.0192398  -0.0681873  0.0840796  0.0132102  0.0659621
      [Tabs: Data, Visualized, Graph, Communities]
  • 17. Use computation – to separate signal from noise
  • 18. [Audio player: 4.91 s | 8192] Panels: Sound, Frequency shifts, Model, Result
  • 19. So why is most data science just counting?
  • 20. So why is most data science just counting? Computation can be difficult. You have to know what’s possible.
  • 21. The role of automation – making computation easier
      Example: Titanic
      In[ ]:= Short[tData = ExampleData[{"MachineLearning", "Titanic"}, "Data"]]
      Out[ ]//Short= {{1st, 29., female} → survived, {1st, 0.9167, male} → survived, «1306», {3rd, 29., male} → died}
      In[ ]:= titanicSurvival = Classify[tData]
      Out[ ]= ClassifierFunction[Input type: {Nominal, Numerical, Nominal}; Classes: died, survived]
      In[ ]:= titanicSurvival[{"1st", 46, "male"}]
      Out[ ]= died
      In[ ]:= Plot[titanicSurvival[{"1st", age, "male"}, {"Probability", "died"}], {age, 0, 60}]
      Out[ ]= [Plot: probability of "died" against age, 0 to 60]
  • 22. Example: Day & night
      In[ ]:= daynight = Classify[{[image] → "Night", [image] → "Day", [image] → "Night", [image] → "Night", [image] → "Day", [image] → "Night", [image] → "Day", [image] → "Day", [image] → "Night", [image] → "Night", [image] → "Day", [image] → "Night", [image] → "Night", [image] → "Day", [image] → "Night", [image] → "Night", [image] → "Day", [image] → "Day", [image] → "Day", [image] → "Day", [image] → "Night", [image] → "Night", [image] → "Day", [image] → "Night", [image] → "Night", [image] → "Day", [image] → "Day", [image] → "Day", [image] → "Night", [image] → "Day"}]
      Out[ ]= ClassifierFunction[Input type: Image; Classes: Day, Night]
      In[ ]:= daynight[{[image], [image], [image], [image], [image], [image]}]
      Out[ ]= {Day, Night, Day, Night, Night, Night}
  • 23. The role of automation – automating insights
      Example: Image identification
      Example: Supervising the computer
      [Interactive demo: Data = 0; controls Reset, Capture (Rock, Paper, Scissors), Watch, Stop, Train]
  • 24. Example: No supervision - “Hands off the wheel”
      Dogs
      In[ ]:= Dataset[dogs]
      Out[ ]= [dataset display]
  • 25. In[ ]:= FeatureSpacePlot[Take[dogs, 60], LabelingSize → 70] Out[ ]= In[ ]:= nearestDog = FeatureNearest[dogs] Out[ ]= NearestFunction Input type: Image Output property: Element Unable to store data in notebook.  In[ ]:= Grid[{testDogs, First /@ nearestDog[testDogs]}] Out[ ]= MoreThanJustStats.nb 25
  • 28. The role of automation – after the computation
      Deployment
      Firewall
      App deployment
      bikeApp = DynamicModule[
        {url = "http://api.citybik.es/barclays-cycle-hire.json", city = "London"},
        Column[{
          ActionMenu["Choose city",
            SortBy[("city" ⧴ (city = "city"; url = "url")) /.
              Import["http://api.citybik.es/networks.json", "JSON"], First]],
          Dynamic[
            Dataset[Association /@ Import[url, "JSON"]][
              Legended[GeoGraphics[#, ImageSize → 600,
                PlotLabel → "Availability of bicycles in " <> city],
                BarLegend[{"DarkRainbow", {0, 100}}]] &,
              {AbsolutePointSize[10], ColorData["DarkRainbow"][#bikes/(#bikes + #free)],
                Point[GeoPosition[{#lat, #lng}/1000000.]]} &],
            SynchronousUpdating → False, TrackedSymbols ⧴ {url, city}]}]]
      CloudDeploy[bikeApp, Permissions → "Public"]
  • 29. API deployment
      In[ ]:= CloudDeploy[
        APIFunction[{"class" → "String", "age" → "Number", "sex" → "String"},
          Function[titanicSurvival[{#class, #age, #sex}]],
          AllowedCloudExtraParameters → All],
        "TitanicPredictor", Permissions → "Public"]
      Out[ ]= CloudObject[https://www.wolframcloud.com/objects/jonm/TitanicPredictor]
      In[ ]:= EmbedCode[%, "Java"]
      Out[ ]= Embeddable Code. Use the code below to call the Wolfram Cloud function from Java:
          if (_conn.getResponseCode() != 200) {
            throw new IOException(_conn.getResponseMessage());
          }
          BufferedReader _rdr = new BufferedReader(new InputStreamReader(_conn.getInputStream()));
          StringBuilder _sb = new StringBuilder();
          String _line;
          while ((_line = _rdr.readLine()) != null) {
            _sb.append(_line);
          }
          _rdr.close();
          _conn.disconnect();
          return _sb.toString();
        }
      }
  • 30. Breaking the Boundaries of Traditional Data Science
      The toolset is HUGE - use more of it.
      Automation makes the toolset accessible.
      The human’s role is to ask deeper questions of the data.
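
Slide 15 shows a correlation matrix, a threshold control, and "Graph" and "Communities" tabs, but no code. The following is a minimal sketch of how such a view might be built. It is not the presenter's code. The four tickers and their correlations are copied from the slide's table, the threshold of 0.3 is one of the slider's tick values, and the names tickers, corr, threshold, adj and g are hypothetical.

```wolfram
(* Hypothetical sketch: a 4 x 4 excerpt of the slide's correlation table *)
tickers = {"PFBX", "NKTR", "THLD", "REGN"};
corr = {
   {1., 0.0551318, 0.0864465, -0.0371775},
   {0.0551318, 1., 0.408224, 0.430739},
   {0.0864465, 0.408224, 1., 0.340108},
   {-0.0371775, 0.430739, 0.340108, 1.}};

(* Assumed threshold, taken from the slider ticks on the slide *)
threshold = 0.3;

(* 1 where the correlation reaches the threshold, 0 elsewhere;
   subtracting the identity matrix removes the self-loops on the diagonal *)
adj = UnitStep[corr - threshold] - IdentityMatrix[Length[tickers]];

(* Undirected graph with one vertex per ticker *)
g = AdjacencyGraph[tickers, adj, VertexLabels -> "Name"];

(* Group the tickers into communities of the thresholded graph *)
FindGraphCommunities[g]
```

At this threshold only the NKTR–THLD, NKTR–REGN and THLD–REGN pairs are kept as edges, so PFBX is left unconnected. Raising or lowering the threshold changes which edges survive, which is presumably what the slide's slider controls.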