What is Data?
An Existential [but useful] Exploration
“Russell Foltz-Smith”
• Self Appointed Artist
• Green Eyes, Red Head
• 5’10”, 243lbs
• 22 tattoos
• Married, 2 kids
• @UChicago Math BA
• 43 Theater Shows
• Son
• 42 moves, 5 states
Finance
Developer
Product, BI
VP Product
VP Tech, Content
CTO
Biz Dev
President
SVP, Data
In place of facts, we now live in a world of data. Instead of trusted measures and methodologies being used to produce numbers, a dizzying array of numbers is produced by default, to be mined, visualised, analysed and interpreted however we wish. If risk modelling (using notions of statistical normality) was the defining research technique of the 19th and 20th centuries, sentiment analysis is the defining one of the emerging digital era. We no longer have stable, ‘factual’ representations of the world, but unprecedented new capacities to sense and monitor what is bubbling up where, who’s feeling what, what’s the general vibe.
Financial markets are themselves far more like tools of sentiment analysis (representing the mood of investors) than producers of ‘facts’. This is why it was so absurd to look to currency markets and spread-betters for the truth of what would happen in the referendum: they could only give a sense of what certain people felt would happen in the referendum at certain times. Given the absence of any trustworthy facts (in the form of polls), they could then only provide a sense of how investors felt about Britain’s national mood: a sentiment regarding a sentiment. As the 23rd June turned into 24th June, it became manifestly clear that prediction markets are little more than an aggregative representation of the same feelings and moods that one might otherwise detect via Twitter. They’re not in the business of truth-telling, but of mood-tracking.
-Will Davies (http://www.perc.org.uk/project_posts/thoughts-on-the-sociology-of-brexit/)
Data is.
There exists data about
everything and anything.
Including data.
Data is relational residue.
Data About Other
Specific Data Types, Specific Detectors
Data As Other.
Organs – data rendered by 3D printers
Surrealism turned real.
Old Master Painters reanimated as New Master Data Scientists
Interiors augmented.
Weapons are just printed data.
Data As Detector
Generalized Detectors, Generalized Data
Networks are DATA and DETECTOR.
Data As Complex Recursion
(w. errors) Gene replication.
Error Data is the animating aspect of evolution and learning.
Printers Printing Printers Printing Printers…
&
All complex processes produce mutations (errors, residue).
(halting problem)
Data As Marks in a substrate.
Differences in a medium. Transmittable Marks.
The pages of words and
music and math symbols
are just ordered marks.
They are data in a huge number of
different ways.
• Linear codex of left-to-right English writing
• Musical Notation
• Mathematical Symbols
• Music and Math Theories
• The words’ definitions
• …
A detector needs context/previous
exposure to this data as well as use
of the data through other detectors
to understand/interpret this data.
(play music on a musical
instrument...)
Data As Maps.
(Projective Geometry)
Data maps mediums
between mediums.
The world’s resources
are controlled by
projection maps.
High dimensional data
(complex real world stuff)
requires topological
approaches for any
sense.
Data Becomes Through Experimentation.
Data isn’t data until something
happens....
Emmer wheat… mutated into big seeds
that found reasonable ground… got
noticed by some Neolithic folks… and
then civilization happened.
Gravitational waves aren’t
data until something really
big happens, like a couple
of black holes getting
together.
… and there’s a detector to detect it.
Data is:
About Others and Itself
Of Relations
Relates Others To Others
Becomes Through
Experiment
Learning is Noticing and Experimenting with Difference.
(What was that glitch? Is that glitch a glitch in other mediums?)
Data Science is Experimentation
Simulation
Art
Transmitting Phenomena From One Medium to Another
and testing to see what maintains coherence, meaning, use.
Data that is really data is ROBUST through media.
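One way to make "robust through media" concrete is a round-trip test: push a message through several encodings and check what comes back. The sketch below is only an illustration under assumptions. The function names (`through_media`, `lossy_medium`) and the choice of UTF-8, Base64, JSON and ASCII as "media" are hypothetical and not from the talk.

```python
import base64
import json


def through_media(message: str) -> str:
    """Carry a message through three encodings and back again."""
    as_bytes = message.encode("utf-8")        # medium 1: raw bytes
    as_base64 = base64.b64encode(as_bytes)    # medium 2: ASCII-safe text
    as_json = json.dumps({"payload": as_base64.decode("ascii")})  # medium 3: JSON document
    # Reverse the trip: JSON -> Base64 -> bytes -> text.
    recovered = base64.b64decode(json.loads(as_json)["payload"])
    return recovered.decode("utf-8")


def lossy_medium(message: str) -> str:
    """A medium that cannot hold everything: ASCII silently drops other characters."""
    return message.encode("ascii", errors="ignore").decode("ascii")


original = "Emmer wheat → civilization"
survived = through_media(original) == original       # coherence kept
survived_lossy = lossy_medium(original) == original  # the arrow is lost
print(survived, survived_lossy)
```

The first trip keeps the message intact; the second loses the arrow, which is the kind of difference worth noticing.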
Art is The Future of Data Science (always has been)
Art is about making something happen.
Noticing what’s noticed.
Recursing noticing & happening.
1203 paintings
2 art shows
13 hrs of video
155 sales
+38% single day revenue increase for show locations
14,000,000+ marks
Analyze all my marks.
Analyze all show behavior
Do people want to learn about the Self or the Truth?
But! Truth had pull! 90% of humans are right
handed/footed (ants are left first biased too!).
73% / 27%
All Businesses Are Art
Business as Art As Data Science
If a business produces
something robust, it will
transmit across mediums.
And this exercise is getting
really complicated.
It really is about noticing...
… the ever changing relationships
between things.
Another Surprising Example of
Noticing
...a new product in video/tv content is to ACCELERATE the viewing
speed.
Dilemma.
How do you QA
and do data in
an infinite
virtual world?
How do you QA
and ensure safety
in infinite live
video?
“What might just happen is the proliferation of architectural clones
around the globe, of transparent, interactive, mobile, fun buildings
modeled on networks and virtual realities—by which a whole society
basically gives itself the comedy of culture, the comedy of communication,
the comedy of the virtual (just as it gives itself the comedy of politics, for
that matter).”
Architecture: Truth or Radicalism?
by Jean Baudrillard
Data Science is Simulation is Art.
We will assess CONSEQUENCES through
integrated simulations of new relationships
deployed in virtual reality with complex adaptive
avatars.
• Unsupervised learning driven avatars in virtual universes trying out our ideas
• Instigating and Tracking behavior… “Virtual Skinner Boxes”
• Assessments of Acceptable Risk Networks lead to adoption and integration of
policy/tech/software from the virtual to the real
Insight from noticing what’s
noticed.
We make data detectors. We detect data. We publish data to be detected.
http://datalooksdope.com/
Hadoop is essential.
Hadoop (the ecosystem) remains
the most generalized collection of data detectors.
Machine Learners
SOME EXPERIENCE
Thank
You
Business:
www.fabricinteractive.com/casezero
Displeasure:
@un1crom (on instagram and twitter)
Criticism:
www.worksonbecoming.com

More Related Content

Viewers also liked

Analysis of Major Trends in Big Data Analytics
Analysis of Major Trends in Big Data AnalyticsAnalysis of Major Trends in Big Data Analytics
Analysis of Major Trends in Big Data Analytics
DataWorks Summit/Hadoop Summit
 
Typography and Its Importance in Information Hierarchy
Typography and Its Importance in Information HierarchyTypography and Its Importance in Information Hierarchy
Typography and Its Importance in Information Hierarchy
UXPA Boston
 
Hdfs 2016-hadoop-summit-san-jose-v4
Hdfs 2016-hadoop-summit-san-jose-v4Hdfs 2016-hadoop-summit-san-jose-v4
Hdfs 2016-hadoop-summit-san-jose-v4
Chris Nauroth
 
Simplified Cluster Operation & Troubleshooting
Simplified Cluster Operation & TroubleshootingSimplified Cluster Operation & Troubleshooting
Simplified Cluster Operation & Troubleshooting
DataWorks Summit/Hadoop Summit
 
Building a Graph Database in Neo4j with Spark & Spark SQL to gain new insight...
Building a Graph Database in Neo4j with Spark & Spark SQL to gain new insight...Building a Graph Database in Neo4j with Spark & Spark SQL to gain new insight...
Building a Graph Database in Neo4j with Spark & Spark SQL to gain new insight...
DataWorks Summit/Hadoop Summit
 
Extreme Analytics @ eBay
Extreme Analytics @ eBayExtreme Analytics @ eBay
Extreme Analytics @ eBay
DataWorks Summit/Hadoop Summit
 
HDFS Analysis for Small Files
HDFS Analysis for Small FilesHDFS Analysis for Small Files
HDFS Analysis for Small Files
DataWorks Summit/Hadoop Summit
 
Understanding Data
Understanding Data Understanding Data
Understanding Data
Kingsley Uyi Idehen
 
End-to-End Security and Auditing in a Big Data as a Service Deployment
End-to-End Security and Auditing in a Big Data as a Service DeploymentEnd-to-End Security and Auditing in a Big Data as a Service Deployment
End-to-End Security and Auditing in a Big Data as a Service Deployment
DataWorks Summit/Hadoop Summit
 
How to build a successful Data Lake
How to build a successful Data LakeHow to build a successful Data Lake
How to build a successful Data Lake
DataWorks Summit/Hadoop Summit
 
Knowledge Management Models
Knowledge Management ModelsKnowledge Management Models
Knowledge Management Models
Tilahun Teffera
 
A Multi Colored YARN
A Multi Colored YARNA Multi Colored YARN
A Multi Colored YARN
DataWorks Summit/Hadoop Summit
 

Viewers also liked (12)

Analysis of Major Trends in Big Data Analytics
Analysis of Major Trends in Big Data AnalyticsAnalysis of Major Trends in Big Data Analytics
Analysis of Major Trends in Big Data Analytics
 
Typography and Its Importance in Information Hierarchy
Typography and Its Importance in Information HierarchyTypography and Its Importance in Information Hierarchy
Typography and Its Importance in Information Hierarchy
 
Hdfs 2016-hadoop-summit-san-jose-v4
Hdfs 2016-hadoop-summit-san-jose-v4Hdfs 2016-hadoop-summit-san-jose-v4
Hdfs 2016-hadoop-summit-san-jose-v4
 
Simplified Cluster Operation & Troubleshooting
Simplified Cluster Operation & TroubleshootingSimplified Cluster Operation & Troubleshooting
Simplified Cluster Operation & Troubleshooting
 
Building a Graph Database in Neo4j with Spark & Spark SQL to gain new insight...
Building a Graph Database in Neo4j with Spark & Spark SQL to gain new insight...Building a Graph Database in Neo4j with Spark & Spark SQL to gain new insight...
Building a Graph Database in Neo4j with Spark & Spark SQL to gain new insight...
 
Extreme Analytics @ eBay
Extreme Analytics @ eBayExtreme Analytics @ eBay
Extreme Analytics @ eBay
 
HDFS Analysis for Small Files
HDFS Analysis for Small FilesHDFS Analysis for Small Files
HDFS Analysis for Small Files
 
Understanding Data
Understanding Data Understanding Data
Understanding Data
 
End-to-End Security and Auditing in a Big Data as a Service Deployment
End-to-End Security and Auditing in a Big Data as a Service DeploymentEnd-to-End Security and Auditing in a Big Data as a Service Deployment
End-to-End Security and Auditing in a Big Data as a Service Deployment
 
How to build a successful Data Lake
How to build a successful Data LakeHow to build a successful Data Lake
How to build a successful Data Lake
 
Knowledge Management Models
Knowledge Management ModelsKnowledge Management Models
Knowledge Management Models
 
A Multi Colored YARN
A Multi Colored YARNA Multi Colored YARN
A Multi Colored YARN
 

Similar to What is Data?

Human-machine Inter-agencies
Human-machine Inter-agenciesHuman-machine Inter-agencies
Human-machine Inter-agencies
mo-seph
 
Making Decisions in a World Awash in Data: We’re going to need a different bo...
Making Decisions in a World Awash in Data: We’re going to need a different bo...Making Decisions in a World Awash in Data: We’re going to need a different bo...
Making Decisions in a World Awash in Data: We’re going to need a different bo...
Micah Altman
 
Big data, human agency, critical realism and the future of the social sciences
Big data, human agency, critical realism and the future of the social sciencesBig data, human agency, critical realism and the future of the social sciences
Big data, human agency, critical realism and the future of the social sciences
Mark Carrigan
 
How to follow actors through their traces. Exploiting digital traceability
How to follow actors through their traces. Exploiting digital traceabilityHow to follow actors through their traces. Exploiting digital traceability
How to follow actors through their traces. Exploiting digital traceability
INRIA - ENS Lyon
 
How the Net Betrays Discourse
How the Net Betrays DiscourseHow the Net Betrays Discourse
How the Net Betrays Discourse
David Brin
 
Visualization as a Digital Humanities ______ ?
Visualization as a Digital Humanities ______ ?Visualization as a Digital Humanities ______ ?
Visualization as a Digital Humanities ______ ?
Tara Zepel
 
Computing, cognition and the future of knowing,. by IBM
Computing, cognition and the future of knowing,. by IBMComputing, cognition and the future of knowing,. by IBM
Computing, cognition and the future of knowing,. by IBM
Virginia Fernandez
 
Learning to trust artificial intelligence systems accountability, compliance ...
Learning to trust artificial intelligence systems accountability, compliance ...Learning to trust artificial intelligence systems accountability, compliance ...
Learning to trust artificial intelligence systems accountability, compliance ...
Diego Alberto Tamayo
 
How to Fail Interdisciplinarily
How to Fail InterdisciplinarilyHow to Fail Interdisciplinarily
How to Fail Interdisciplinarily
David Newbury
 
ENP_Dutch_Infoday_PHuijnen
ENP_Dutch_Infoday_PHuijnen ENP_Dutch_Infoday_PHuijnen
ENP_Dutch_Infoday_PHuijnen
Europeana Newspapers
 
Citizen Science overview for ASU HSD598 graduate course, "Citizen Science"
Citizen Science overview for ASU HSD598 graduate course, "Citizen Science"Citizen Science overview for ASU HSD598 graduate course, "Citizen Science"
Citizen Science overview for ASU HSD598 graduate course, "Citizen Science"
Darlene Cavalier
 
Data socialscienceprogramme
Data socialscienceprogrammeData socialscienceprogramme
Data socialscienceprogramme
dan mcquillan
 
Information Cartilage: Context, Intelligent Systems, and IA
Information Cartilage: Context, Intelligent Systems, and IA Information Cartilage: Context, Intelligent Systems, and IA
Information Cartilage: Context, Intelligent Systems, and IA
Thomas Wendt
 
TED Wiley Visualizing .docx
TED  Wiley Visualizing .docxTED  Wiley Visualizing .docx
TED Wiley Visualizing .docx
ssuserf9c51d
 
Bi(G) data: opportunities for BI Professionals
Bi(G) data: opportunities for BI ProfessionalsBi(G) data: opportunities for BI Professionals
Bi(G) data: opportunities for BI Professionals
Albert Besselse
 
From data visualization to data humanism
From data visualization to data humanismFrom data visualization to data humanism
From data visualization to data humanism
Raquel Herrera Ferrer
 
Corso pisa-2 dh-2017
Corso pisa-2 dh-2017Corso pisa-2 dh-2017
Corso pisa-2 dh-2017
Luca De Biase
 
Data, Science, Society - Claudio Gutierrez, University of Chile
Data, Science, Society - Claudio Gutierrez, University of ChileData, Science, Society - Claudio Gutierrez, University of Chile
Data, Science, Society - Claudio Gutierrez, University of Chile
LEARN Project
 
Escaping greatdivide coimbra
Escaping greatdivide coimbraEscaping greatdivide coimbra
Escaping greatdivide coimbra
INRIA - ENS Lyon
 
eROI + Portland Design Week: Unflattering Data
eROI + Portland Design Week: Unflattering DataeROI + Portland Design Week: Unflattering Data
eROI + Portland Design Week: Unflattering Data
eROI
 

Similar to What is Data? (20)

Human-machine Inter-agencies
Human-machine Inter-agenciesHuman-machine Inter-agencies
Human-machine Inter-agencies
 
Making Decisions in a World Awash in Data: We’re going to need a different bo...
Making Decisions in a World Awash in Data: We’re going to need a different bo...Making Decisions in a World Awash in Data: We’re going to need a different bo...
Making Decisions in a World Awash in Data: We’re going to need a different bo...
 
Big data, human agency, critical realism and the future of the social sciences
Big data, human agency, critical realism and the future of the social sciencesBig data, human agency, critical realism and the future of the social sciences
Big data, human agency, critical realism and the future of the social sciences
 
How to follow actors through their traces. Exploiting digital traceability
How to follow actors through their traces. Exploiting digital traceabilityHow to follow actors through their traces. Exploiting digital traceability
How to follow actors through their traces. Exploiting digital traceability
 
How the Net Betrays Discourse
How the Net Betrays DiscourseHow the Net Betrays Discourse
How the Net Betrays Discourse
 
Visualization as a Digital Humanities ______ ?
Visualization as a Digital Humanities ______ ?Visualization as a Digital Humanities ______ ?
Visualization as a Digital Humanities ______ ?
 
Computing, cognition and the future of knowing,. by IBM
Computing, cognition and the future of knowing,. by IBMComputing, cognition and the future of knowing,. by IBM
Computing, cognition and the future of knowing,. by IBM
 
Learning to trust artificial intelligence systems accountability, compliance ...
Learning to trust artificial intelligence systems accountability, compliance ...Learning to trust artificial intelligence systems accountability, compliance ...
Learning to trust artificial intelligence systems accountability, compliance ...
 
How to Fail Interdisciplinarily
How to Fail InterdisciplinarilyHow to Fail Interdisciplinarily
How to Fail Interdisciplinarily
 
ENP_Dutch_Infoday_PHuijnen
ENP_Dutch_Infoday_PHuijnen ENP_Dutch_Infoday_PHuijnen
ENP_Dutch_Infoday_PHuijnen
 
Citizen Science overview for ASU HSD598 graduate course, "Citizen Science"
Citizen Science overview for ASU HSD598 graduate course, "Citizen Science"Citizen Science overview for ASU HSD598 graduate course, "Citizen Science"
Citizen Science overview for ASU HSD598 graduate course, "Citizen Science"
 
Data socialscienceprogramme
Data socialscienceprogrammeData socialscienceprogramme
Data socialscienceprogramme
 
Information Cartilage: Context, Intelligent Systems, and IA
Information Cartilage: Context, Intelligent Systems, and IA Information Cartilage: Context, Intelligent Systems, and IA
Information Cartilage: Context, Intelligent Systems, and IA
 
TED Wiley Visualizing .docx
TED  Wiley Visualizing .docxTED  Wiley Visualizing .docx
TED Wiley Visualizing .docx
 
Bi(G) data: opportunities for BI Professionals
Bi(G) data: opportunities for BI ProfessionalsBi(G) data: opportunities for BI Professionals
Bi(G) data: opportunities for BI Professionals
 
From data visualization to data humanism
From data visualization to data humanismFrom data visualization to data humanism
From data visualization to data humanism
 
Corso pisa-2 dh-2017
Corso pisa-2 dh-2017Corso pisa-2 dh-2017
Corso pisa-2 dh-2017
 
Data, Science, Society - Claudio Gutierrez, University of Chile
Data, Science, Society - Claudio Gutierrez, University of ChileData, Science, Society - Claudio Gutierrez, University of Chile
Data, Science, Society - Claudio Gutierrez, University of Chile
 
Escaping greatdivide coimbra
Escaping greatdivide coimbraEscaping greatdivide coimbra
Escaping greatdivide coimbra
 
eROI + Portland Design Week: Unflattering Data
eROI + Portland Design Week: Unflattering DataeROI + Portland Design Week: Unflattering Data
eROI + Portland Design Week: Unflattering Data
 

More from DataWorks Summit/Hadoop Summit

Running Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in ProductionRunning Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in Production
DataWorks Summit/Hadoop Summit
 
State of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache ZeppelinState of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache Zeppelin
DataWorks Summit/Hadoop Summit
 
Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache Ranger
DataWorks Summit/Hadoop Summit
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science Platform
DataWorks Summit/Hadoop Summit
 
Revolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and ZeppelinRevolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and Zeppelin
DataWorks Summit/Hadoop Summit
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSense
DataWorks Summit/Hadoop Summit
 
Hadoop Crash Course
Hadoop Crash CourseHadoop Crash Course
Hadoop Crash Course
DataWorks Summit/Hadoop Summit
 
Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
DataWorks Summit/Hadoop Summit
 
Apache Spark Crash Course
Apache Spark Crash CourseApache Spark Crash Course
Apache Spark Crash Course
DataWorks Summit/Hadoop Summit
 
Dataflow with Apache NiFi
Dataflow with Apache NiFiDataflow with Apache NiFi
Dataflow with Apache NiFi
DataWorks Summit/Hadoop Summit
 
Schema Registry - Set you Data Free
Schema Registry - Set you Data FreeSchema Registry - Set you Data Free
Schema Registry - Set you Data Free
DataWorks Summit/Hadoop Summit
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
DataWorks Summit/Hadoop Summit
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
DataWorks Summit/Hadoop Summit
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and ML
DataWorks Summit/Hadoop Summit
 
How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient
DataWorks Summit/Hadoop Summit
 
HBase in Practice
HBase in Practice HBase in Practice
HBase in Practice
DataWorks Summit/Hadoop Summit
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)
DataWorks Summit/Hadoop Summit
 
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS HadoopBreaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
DataWorks Summit/Hadoop Summit
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
DataWorks Summit/Hadoop Summit
 
Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop
DataWorks Summit/Hadoop Summit
 

More from DataWorks Summit/Hadoop Summit (20)

Running Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in ProductionRunning Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in Production
 
State of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache ZeppelinState of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache Zeppelin
 
Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache Ranger
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science Platform
 
Revolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and ZeppelinRevolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and Zeppelin
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSense
 
Hadoop Crash Course
Hadoop Crash CourseHadoop Crash Course
Hadoop Crash Course
 
Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Apache Spark Crash Course
Apache Spark Crash CourseApache Spark Crash Course
Apache Spark Crash Course
 
Dataflow with Apache NiFi
Dataflow with Apache NiFiDataflow with Apache NiFi
Dataflow with Apache NiFi
 
Schema Registry - Set you Data Free
Schema Registry - Set you Data FreeSchema Registry - Set you Data Free
Schema Registry - Set you Data Free
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and ML
 
How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient
 
HBase in Practice
HBase in Practice HBase in Practice
HBase in Practice
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)
 
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS HadoopBreaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
 
Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop
 

Recently uploaded

HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
panagenda
 
Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)
Jakub Marek
 
Azure API Management to expose backend services securely
Azure API Management to expose backend services securelyAzure API Management to expose backend services securely
Azure API Management to expose backend services securely
Dinusha Kumarasiri
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
Zilliz
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
SitimaJohn
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
Tomaz Bratanic
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
Octavian Nadolu
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Tosin Akinosho
 
AWS Cloud Cost Optimization Presentation.pptx
What is Data?

  • 1. What is Data? An Existential [but useful] Exploration
  • 2. “Russell Foltz-Smith” • Self Appointed Artist • Green Eyes, Red Head • 5’10”, 243lbs • 22 tattoos • Married, 2 kids • @UChicago Math BA • 43 Theater Shows • Son • 42 moves, 5 states • Finance • Developer • Product, BI • VP Product • VP Tech, Content • CTO • Biz Dev • President • SVP, Data
  • 3. In place of facts, we now live in a world of data. Instead of trusted measures and methodologies being used to produce numbers, a dizzying array of numbers is produced by default, to be mined, visualised, analysed and interpreted however we wish. If risk modelling (using notions of statistical normality) was the defining research technique of the 19th and 20th centuries, sentiment analysis is the defining one of the emerging digital era. We no longer have stable, ‘factual’ representations of the world, but unprecedented new capacities to sense and monitor what is bubbling up where, who’s feeling what, what’s the general vibe. Financial markets are themselves far more like tools of sentiment analysis (representing the mood of investors) than producers of ‘facts’. This is why it was so absurd to look to currency markets and spread-betters for the truth of what would happen in the referendum: they could only give a sense of what certain people felt would happen in the referendum at certain times. Given the absence of any trustworthy facts (in the form of polls), they could then only provide a sense of how investors felt about Britain’s national mood: a sentiment regarding a sentiment. As the 23rd June turned into 24th June, it became manifestly clear that prediction markets are little more than an aggregative representation of the same feelings and moods that one might otherwise detect via twitter. They’re not in the business of truth-telling, but of mood-tracking. -Will Davies (http://www.perc.org.uk/project_posts/thoughts-on-the-sociology-of-brexit/)
  • 4. Data is. There exists data about everything and anything. Including data. Data is relational residue.
  • 5. Data About Other. Specific Data Types. Specific Detectors.
  • 6. Data As Other. Organs – rendered data by 3D printers. Surrealism turned real. Old Master Painters reanimated as New Master Data Scientists. Interiors augmented. Weapons are just printed data.
  • 7. Data As Detector. Generalized Detectors. Generalized Data. Networks are DATA and DETECTOR.
  • 8. Data As Complex Recursion (w. errors). Gene replication. Error Data is the animating aspect of evolution and learning. Printers Printing Printers Printing Printers… And all complex processes produce mutations (errors, residue). (halting problem)
  • 9. Data As Marks in a substrate. Differences in a medium. Transmittable Marks. The pages of words and music and math symbols are just ordered marks. They are data in a huge number of different ways. • Linear codex of left to right English writing • Musical Notation • Mathematical Symbols • Music and Math Theories • The words’ definitions • … A detector needs context/previous exposure to this data as well as use of the data through other detectors to understand/interpret this data. (play music on a musical instrument...)
  • 10. Data As Maps. (Projective Geometry) Data maps mediums between mediums. The world’s resources are controlled by projection maps. High dimensional data (complex real world stuff) requires topological approaches for any sense.
  • 11. Data Becomes Through Experimentation. Data isn’t data until something happens… Emmer wheat… mutated into big seeds that found reasonable ground… got noticed by some Neolithic folks… and then civilization happened. Gravitational waves aren’t data until something really big happens, like a couple of black holes getting together. … and there’s a detector to detect it.
  • 12. Data is: • About Others and Itself • Of Relations • Relates Others To Others • Becomes Through Experiment. Learning is Noticing and Experimenting with Difference. (What was that glitch? Is that glitch a glitch in other mediums?) Data Science is: • Experimentation • Simulation • Art • Transmitting Phenomena From One Medium to Another and testing to see what maintains coherence, meaning, use. Data that is really data is ROBUST through media.
  • 13. Art is The Future of Data Science (always has been). Art is about making something happen. Noticing what’s noticed. Recursing noticing & happening. • 1203 paintings • 2 art shows • 13 hrs of video • 155 sales • +38% single day revenue increase for show locations • 14,000,000+ marks. Analyze all my marks. Analyze all show behavior.
  • 14. Do people want to learn about the Self or the Truth? But! Truth had pull! 90% of humans are right-handed/footed (ants are left first biased too!). 73% 27%
  • 15. All Businesses Are Art. Business as Art As Data Science. If a business produces something robust, it will transmit across mediums. And this exercise is getting really complicated.
  • 16. It really is about noticing... … the ever changing relationships between things.
  • 17. Another Surprising Example of Noticing ...a new product in video/tv content is to ACCELERATE the viewing speed.
  • 18. Dilemma. How do you QA and do data in an infinite virtual world? How do you QA and ensure safety in infinite live video? “What might just happen is the proliferation of architectural clones around the globe, of transparent, interactive, mobile, fun buildings modeled on networks and virtual realities—by which a whole society basically gives itself the comedy of culture, the comedy of communication, the comedy of the virtual (just as it gives itself the comedy of politics, for that matter).” Architecture: Truth or Radicalism? by Jean Baudrillard
  • 19. Data Science is Simulation is Art. We will assess CONSEQUENCES through integrated simulations of new relationships deployed in virtual reality with complex adaptive avatars. • Unsupervised learning driven avatars in virtual universes trying out our ideas • Instigating and Tracking behavior… “Virtual Skinner Boxes” • Assessments of Acceptable Risk Networks lead to adoption and integration of policy/tech/software from the virtual to the real
  • 20.
  • 21. Insight from noticing what’s noticed. We make data detectors. We detect data. We publish data to be detected. http://datalooksdope.com/
  • 22. Hadoop is essential. Hadoop (the ecosystem) remains the most generalized collection of data detectors.