Big Data and the SP
Theory of Intelligence
Varsha Prabhakaran
S8 CSE B
Roll No: 43
Contents
Introduction
SP Theory of Intelligence
Problems of Big Data
Volume
Efficiency
Transmission
Variety
Veracity
Visualization
Introduction
The SP theory of intelligence can be applied to the management
and analysis of big data.
It helps to overcome the problem of variety in big data.
Analysis of streaming data (velocity).
Economies in the transmission of data.
Veracity in big data.
Visualization of knowledge structures and inferential
processes.
SP Theory of Intelligence
Designed to simplify and integrate concepts across artificial
intelligence, mainstream computing, and human perception
and cognition.
Product of an extensive program of development and testing
via the SP computer model.
Knowledge represented with arrays of atomic symbols in one
or two dimensions called “patterns”.
Processing is done by compressing information:
Via the matching and unification of patterns.
Via the building of multiple alignments.
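As a rough illustration of compression via the matching and unification of patterns, the sketch below finds each occurrence of a repeated chunk in a sequence of symbols, stores the chunk once, and replaces each occurrence with a short reference. The names `unify_repeats` and `expand` and the reference symbol `%1` are hypothetical, and this is far simpler than the SP computer model.

```python
def unify_repeats(symbols, chunk, ref="%1"):
    """Replace every occurrence of `chunk` in `symbols` with `ref`.

    Returns (stored, encoded): the repeated chunk is stored once
    (unification) and the data refers to it by a short code.
    """
    n = len(chunk)
    encoded = []
    i = 0
    while i < len(symbols):
        if symbols[i:i + n] == chunk:   # a match between two patterns
            encoded.append(ref)
            i += n
        else:
            encoded.append(symbols[i])
            i += 1
    return {ref: chunk}, encoded


def expand(stored, encoded):
    """Invert unify_repeats, showing that nothing was lost."""
    out = []
    for s in encoded:
        out.extend(stored.get(s, [s]))
    return out


data = list("thecatsatonthemat")
stored, encoded = unify_repeats(data, list("the"))
```

Here the 17 original symbols become 13 encoded symbols plus one stored chunk of 3, and `expand(stored, encoded)` gives back the original sequence.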
Benefits of the SP Theory
Conceptual simplicity combined with descriptive and
explanatory power across several aspects of intelligence.
Simplification of computing systems, including software.
Deeper insights and better solutions in several areas of
application.
Seamless integration of structures and functions within and
between different areas of application.
SIMPLIFICATION OF COMPUTING
SYSTEMS
MULTIPLE ALIGNMENT: A CONCEPT
BORROWED FROM BIOINFORMATICS
Multiple Alignment
The system aims to find multiple alignments that enable a
New pattern to be encoded economically in terms of one or
more Old patterns.
Multiple alignment provides the key to:
Versatility in representing different kinds of knowledge.
Versatility in different kinds of processing in AI and mainstream
computing.
Multiple Alignment
S → NP V NP
NP → D N
D → t h i s
D → t h a t
N → g i r l
N → b o y
V → l o v e s
V → h a t e s
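The rewrite rules above can be tried out with a small recursive-descent recognizer. This is an ordinary context-free parser written for illustration, not the SP multiple-alignment mechanism. The names `GRAMMAR`, `parse` and `is_sentence` are hypothetical, and the terminals are treated as whole words rather than letter sequences.

```python
GRAMMAR = {
    "S": [["NP", "V", "NP"]],
    "NP": [["D", "N"]],
    "D": [["this"], ["that"]],
    "N": [["girl"], ["boy"]],
    "V": [["loves"], ["hates"]],
}


def parse(symbol, words, pos=0):
    """Try to derive `symbol` starting at words[pos].

    Returns the position just after the derived words, or None on failure.
    """
    if symbol not in GRAMMAR:             # terminal: must equal the next word
        if pos < len(words) and words[pos] == symbol:
            return pos + 1
        return None
    for rule in GRAMMAR[symbol]:          # try each alternative in turn
        p = pos
        for part in rule:
            p = parse(part, words, p)
            if p is None:
                break
        else:                             # every part of the rule matched
            return p
    return None


def is_sentence(text):
    """True if the whole text can be derived from S."""
    words = text.split()
    return parse("S", words) == len(words)
```

For example, `is_sentence("this boy loves that girl")` succeeds, while a string with the words out of order does not.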
Multiple Alignment
(Figures: examples of multiple alignments, with the code pattern S 0 1 0 1 0 #S.)
Multiple Alignment
Compression difference:
$CD = B_N - B_E$
$B_N$: the total number of bits in those symbols in the New pattern that are
aligned with Old symbols in the alignment.
$B_E$: the total number of bits in the symbols in the code pattern.
Compression ratio:
$CR = B_N / B_E$
Multiple Alignment
$B_N$ is calculated as:
$B_N = \sum_{i=1}^{h} C_i$
where $C_i$ is the size of the code for the $i$th symbol in a sequence,
$H_1 \ldots H_h$, comprising those symbols within the New pattern that are
aligned with Old symbols.
Multiple Alignment
$B_E$ is calculated as:
$B_E = \sum_{i=1}^{s} C_i$
where $C_i$ is the size of the code for the $i$th symbol in the sequence
of $s$ symbols in the code pattern derived from the multiple
alignment.
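The two scores defined above can be computed directly from lists of code sizes. The sketch below is a minimal illustration; the function name `compression_scores` and the bit sizes in the example are made-up assumptions, not figures from the SP computer model.

```python
def compression_scores(new_code_sizes, encoding_code_sizes):
    """Return (CD, CR) for one multiple alignment.

    new_code_sizes: code sizes (bits) of the New symbols that are
        aligned with Old symbols; their sum is BN.
    encoding_code_sizes: code sizes (bits) of the symbols in the
        code pattern; their sum is BE.
    """
    b_n = sum(new_code_sizes)
    b_e = sum(encoding_code_sizes)
    if b_e == 0:
        raise ValueError("the code pattern must contain at least one bit")
    return b_n - b_e, b_n / b_e


# Hypothetical sizes: five aligned New symbols of 5 bits each,
# encoded by a code pattern of four symbols totalling 5 bits.
cd, cr = compression_scores([5, 5, 5, 5, 5], [2, 1, 1, 1])
```

With these illustrative numbers, BN is 25 bits and BE is 5 bits, so CD is 20 bits and CR is 5.0.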
Big Data
Problems of Big Data and Solutions
Volume: big data is … BIG!
Efficiency in computation and the use of energy.
Unsupervised learning: discovering ‘natural’ structures in
data.
Transmission of information and the use of energy.
Variety: in kinds of data, formats, and modes of processing.
Veracity: errors and uncertainties in data.
Interpretation of data: pattern recognition, reasoning
Velocity: analysis of streaming data.
Visualization: representing structures and processes.
Volume: Making Big Data Smaller
“Very-large-scale data sets introduce many data management
challenges.”
Information compression.
Direct benefits in storage, management and transmission.
Indirect benefits:
Efficiency in computation and the use of energy.
Unsupervised learning.
Additional economies in transmission and the use of energy.
Assistance in the management of errors and uncertainties in
data.
Processes of interpretation.
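To see the direct storage benefit of compression on redundant data, the sketch below uses Python's standard `zlib` module as a stand-in compressor. `zlib` is not the SP system, and the sample text and the name `compressed_size` are assumptions made for illustration.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of `data` after lossless zlib compression."""
    return len(zlib.compress(data, 9))


# Highly redundant sample data: the same sentence repeated many times.
redundant = b"this boy loves that girl. " * 1000
raw_size = len(redundant)
packed_size = compressed_size(redundant)

# Lossless: decompression restores the original bytes exactly.
restored = zlib.decompress(zlib.compress(redundant, 9))
```

The more redundancy a body of data contains, the larger the gap between `raw_size` and `packed_size` is likely to be.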
Energy, Speed and Bulk
In the SP theory, a process of searching for matching patterns
is central in all kinds of ‘processing’ or ‘computing’.
This means that anything that increases the efficiency of
searching will increase computational efficiency and,
probably, cut the use of energy:
Reducing the volume of big data.
Exploiting probabilities.
Cutting out some searching.
Efficiency via Reduction in Volume
Information compression is central in how the
SP system works:
Reducing the size of big data.
Reducing the size of search terms.
Both these things can increase the efficiency
of searching, meaning gains in computational
efficiency and cuts in the use of energy.
Efficiency Via Probabilities
Statistical knowledge flows directly from:
Information compression in the SP system and
The intimate connection between information compression and
concepts of prediction and probability.
There is great potential to cut out unnecessary searching, with
consequent gains in efficiency.
Potential for savings at all levels and in all parts of the system
and on many fronts in its stored knowledge.
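One simple way to picture how statistical knowledge can cut out unnecessary searching is to try the most frequent stored patterns first. The sketch below counts the comparisons a linear search needs under two orderings. The pattern names, the frequencies and the function `comparisons_needed` are hypothetical, and the SP system's own search methods are more elaborate than this.

```python
from collections import Counter


def comparisons_needed(order, queries):
    """Total comparisons when each query is found by linear search in `order`."""
    position = {pattern: i + 1 for i, pattern in enumerate(order)}
    return sum(position[q] for q in queries)


# Hypothetical workload: "the" is looked up far more often than "zebra".
queries = ["the"] * 6 + ["cat"] * 3 + ["zebra"] * 1

arbitrary_order = ["zebra", "cat", "the"]
by_frequency = [p for p, _ in Counter(queries).most_common()]

cost_arbitrary = comparisons_needed(arbitrary_order, queries)
cost_by_frequency = comparisons_needed(by_frequency, queries)
```

Ordering the candidates by frequency lowers the total from 25 comparisons to 15 for this workload.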
Efficiency via a Synergy with Data-Centric
Computing
In SP-neural, SP patterns may be realized as
neuronal pattern assemblies.
There would be close integration of data and
processing, as in data-centric computing.
Direct connections may cut out some
searching.
Unsupervised learning
Lossless compression of a body of information.
Information compression, or “minimum length encoding”,
remains the key.
Matching and unification of patterns.
The SP computer model has already demonstrated an ability
to discover generative grammars, including segmental
structures, classes of structure, and abstract patterns.
For a body of information, I, the products of learning are
a grammar (G) and an encoding (E) of I in terms of G.
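The split of I into a grammar G and an encoding E can be pictured with a toy dictionary coder. In the sketch below, G is the list of distinct words and E is the sequence of indices into G. The names `learn` and `reconstruct` are hypothetical, and the division into words is given in advance, whereas the SP model is described as discovering such structure for itself.

```python
def learn(i_text):
    """Split I into G (distinct words) and E (indices into G)."""
    g = []
    index = {}
    e = []
    for word in i_text.split():
        if word not in index:        # first sighting: add the word to G
            index[word] = len(g)
            g.append(word)
        e.append(index[word])        # encode the occurrence by its index
    return g, e


def reconstruct(g, e):
    """Rebuild I from G and E, showing that the pair is lossless."""
    return " ".join(g[k] for k in e)


I_text = "this boy loves that girl that girl hates this boy"
G, E = learn(I_text)
```

Each repeated word is stored once in G, and `reconstruct(G, E)` returns the original text.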
Product of Learning
Transmission of Data
• By making big data smaller (“Volume”).
• By separating grammar (G) from encoding (E), as in some
dictionary techniques and analysis/synthesis schemes.
• Efficiency in transmission can mean cuts in the use of energy.
Transmission of Data
Simplicity of a focus on the matching and unification of
patterns.
Aims to discover structures that are “natural”.
The brain-inspired “DONSVIC” principle (discovery of natural
structures via information compression) can mean relatively
high levels of information compression.
Potential for G to include structures not recognized by
most compression algorithms, such as:
Generic 3D models of objects and scenes.
Generic sequential redundancies across sequences of frames.
Overcoming Problems of Variety of Big
Data
Diverse kinds of data: the world’s many languages, spoken or
written; static and moving images; music as sound and music
in its written form; numbers and mathematical notations;
tables; charts; graphs; networks; trees; grammars; computer
programs; and more.
There are often several different computer formats for each
kind of data. With images, for example: JPEG, TIFF, WMF,
BMP, GIF, EPS, PDF, PNG, PBM, and more.
Adding to the complexity is that each kind of data and each
format normally requires its own special mode of processing.
THIS IS A MESS! It needs cleaning up.
Although some kinds of diversity are useful, there is a case for
developing a universal framework for the representation and
processing of diverse kinds of knowledge (UFK).
Universal Framework for the Representation
and Processing of Knowledge (UFK)
Potential benefits of a UFK in:
• Learning structure in data.
• Interpretation of data.
• Data fusion.
• Understanding and translation of natural languages.
• The semantic web and internet of things.
• Long-term preservation of data.
• Seamless integration in the representation and processing of
diverse kinds of knowledge.
Most concepts are an amalgam of diverse kinds of knowledge
(which implies some uniformity in the representation and
processing of diverse kinds of knowledge).
The SP system is a good candidate for the role of UFK
because of its versatility in the representation and processing
of diverse kinds of knowledge.
How Variety Hinders Learning
Discovering the association between lightning and thunder is
likely to be difficult when:
Lightning appears in big data as a static image in one of several formats; or
in a moving image in one of several formats; or it is described, in spoken
or written form, as any one of such things as “firebolt”, “fulmination”, “la
foudre”, “der Blitz”, “lluched”, “a big flash in the sky”, or indeed
“lightning”.
Thunder is represented in one of several different audio formats; or it is
described, in spoken or written form, as “thunder”, “gök gürültüsü”, “le
tonnerre”, “a great rumble”, and so on.
If learning and discovery processes are going to work
effectively, we need to get behind these surface forms and
focus on the underlying meanings. This can be done using a
UFK.
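The effect of getting behind surface forms can be sketched with a lookup table that maps several surface forms to one underlying concept before co-occurrences are counted. The table `SURFACE_TO_CONCEPT`, the sample observations and the function `pair_counts` are hypothetical. A real UFK would have to learn such mappings rather than be given them.

```python
from collections import Counter

# Hypothetical mapping from surface forms to underlying concepts.
SURFACE_TO_CONCEPT = {
    "lightning": "LIGHTNING",
    "la foudre": "LIGHTNING",
    "der blitz": "LIGHTNING",
    "thunder": "THUNDER",
    "le tonnerre": "THUNDER",
    "gök gürültüsü": "THUNDER",
}

# Each observation lists two things that were recorded together.
observations = [
    ["der Blitz", "le tonnerre"],
    ["lightning", "gök gürültüsü"],
    ["la foudre", "thunder"],
]


def pair_counts(observations, mapping=None):
    """Count co-occurring pairs, optionally after mapping to concepts."""
    counts = Counter()
    for obs in observations:
        items = [mapping[s.lower()] if mapping else s for s in obs]
        counts[tuple(sorted(items))] += 1
    return counts


raw = pair_counts(observations)
unified = pair_counts(observations, SURFACE_TO_CONCEPT)
```

On the raw surface forms no pair occurs more than once, so no association stands out. After mapping, the pair LIGHTNING and THUNDER occurs three times.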
Veracity
“In building a statistical model from any data source, one must
often deal with the fact that data are imperfect. Real-world
data are corrupted with noise. … Measurement processes are
inherently noisy, data can be recorded with error, and parts of
the data may be missing.”
In tasks such as parsing or pattern recognition, the SP system
is robust in the face of errors of omission, addition, or
substitution.
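Recognition that tolerates errors of omission, addition, or substitution can be illustrated with approximate string matching. The sketch below uses `difflib.get_close_matches` from the Python standard library as a stand-in. It is not how the SP system works internally, and the word list `KNOWN`, the cutoff value and the function `recognize` are assumptions.

```python
import difflib

# Hypothetical store of known (Old) patterns.
KNOWN = ["lightning", "thunder", "visualization"]


def recognize(noisy, known=KNOWN, cutoff=0.6):
    """Return the stored pattern closest to `noisy`, or None if none is close."""
    matches = difflib.get_close_matches(noisy, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


omission = recognize("lightnin")      # one symbol missing
addition = recognize("thuunder")      # one symbol added
substitution = recognize("thvnder")   # one symbol replaced
unknown = recognize("qqqq")           # nothing similar is stored
```

The three corrupted inputs are still matched to the intended stored pattern, while an input with nothing in common with the store is rejected.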
Veracity
When we learn a first language (L):
We learn from a finite sample.
We generalize (to L) without over-generalizing.
We learn ‘correct’ knowledge despite ‘dirty data’.
Veracity
For any body of data, I, principles of minimum-length
encoding provide the key:
Aim to minimize the overall size of G and E.
G is a distillation or ‘essence’ of I, that excludes most
‘errors’ and generalizes beyond I.
E + G is a lossless compression of I, including typos etc., but
without generalizations.
Systematic distortions remain a problem.
Interpretation of Data
Processing I in conjunction with a pre-established grammar (G) to
create a relatively compact encoding (E) of I
Depending on the nature of I and G, the process of interpretation
may be seen to achieve:
Pattern recognition
Information retrieval
Parsing and production of natural language
Translation from one representation to another
Planning
Problem solving
Velocity: Analysis of Streaming Data
In the context of big data, “velocity” means the analysis
of streaming data as it is received.
“This is the way humans process information.”
This style of analysis is at the heart of how the SP
system has been designed.
Unsupervised learning.
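Analysis of data as it is received can be sketched with a generator that updates its statistics one item at a time instead of waiting for the whole data set. The function `running_counts` is a hypothetical illustration of this style of processing, not part of the SP computer model.

```python
from collections import Counter


def running_counts(stream):
    """Yield (item, count so far) as each item of the stream arrives."""
    counts = Counter()
    for item in stream:
        counts[item] += 1          # update knowledge incrementally
        yield item, counts[item]


# Any iterable can play the role of the stream, for example a socket
# reader or a file read line by line. A short string is used here.
snapshots = list(running_counts(iter("abca")))
```

Each result is available as soon as its item has been seen, so the analysis keeps pace with the stream.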
Visualizations
The SP system is well suited to visualization for these reasons:
Transparency in the representation of knowledge.
Transparency in processing.
The system is designed to discover ‘natural’ structures in data.
There is clear potential to integrate visualization with the
statistical techniques that lie at the heart of how the SP system
works.
Conclusion
The SP theory of intelligence, designed to simplify and integrate
concepts across artificial intelligence, mainstream computing, and
human perception and cognition, has potential in the management
and analysis of big data.
The SP system has potential as a universal framework for the
representation and processing of diverse kinds of knowledge
(UFK), helping to reduce the problem of variety in big data:
the great diversity of formalisms and formats for knowledge,
and how they are processed.
Bibliography
www.cognitionresearch.org/sp.htm
J. G. Wolff, “Big data and the SP theory of intelligence”,
IEEE Access, 2, 301-315, 2014.
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learning
 

Big data and SP Theory of Intelligence

  • 1. Big Data and the SP Theory of Intelligence. Varsha Prabhakaran, S8 CSE B, Roll No: 43
  • 2. Contents: Introduction; SP Theory of Intelligence; Problems of Big Data; Volume; Efficiency; Transmission; Variety; Veracity; Visualization.
  • 3. Introduction: The SP theory of intelligence can be applied to the management and analysis of big data. It helps to overcome the problem of variety in big data, supports the analysis of streaming data (velocity), offers economies in the transmission of data, helps with veracity in big data, and supports the visualization of knowledge structures and inferential processes.
  • 4. SP Theory of Intelligence
  • 5. SP Theory of Intelligence: Designed to simplify and integrate concepts across artificial intelligence, mainstream computing, and human perception and cognition. Product of an extensive program of development and testing via the SP computer model. Knowledge is represented with arrays of atomic symbols in one or two dimensions, called “patterns”. Processing is done by compressing information: via the matching and unification of patterns, and via the building of multiple alignments.
  • 6. Benefits of the SP Theory: Conceptual simplicity combined with descriptive and explanatory power across several aspects of intelligence. Simplification of computing systems, including software. Deeper insights and better solutions in several areas of application. Seamless integration of structures and functions within and between different areas of application.
  • 8. MULTIPLE ALIGNMENT: A CONCEPT BORROWED FROM BIOINFORMATICS
  • 9. Multiple Alignment: The system aims to find multiple alignments that enable a New pattern to be encoded economically in terms of one or more Old patterns. Multiple alignment provides the key to: versatility in representing different kinds of knowledge; versatility in different kinds of processing in AI and mainstream computing.
  • 10. Multiple Alignment: S → NP V NP; NP → D N; D → t h i s; D → t h a t; N → g i r l; N → b o y; V → l o v e s; V → h a t e s
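To make the grammar on slide 10 concrete, the following is a minimal sketch that stores the rewrite rules as plain data and enumerates every sentence they generate. It is not the SP computer model and does not build multiple alignments; the names `GRAMMAR` and `expand` are hypothetical, and each terminal (for example `t h i s`) is treated as a whole word for brevity.

```python
from itertools import product

# Rewrite rules from slide 10; each terminal letter sequence is written as one word.
GRAMMAR = {
    "S": [["NP", "V", "NP"]],
    "NP": [["D", "N"]],
    "D": [["this"], ["that"]],
    "N": [["girl"], ["boy"]],
    "V": [["loves"], ["hates"]],
}


def expand(symbol):
    """Return every word sequence that `symbol` can be rewritten into."""
    if symbol not in GRAMMAR:
        # Anything without a rule is a terminal word.
        return [[symbol]]
    results = []
    for right_hand_side in GRAMMAR[symbol]:
        # Expand each symbol on the right-hand side, then combine the options.
        parts = [expand(s) for s in right_hand_side]
        for combo in product(*parts):
            results.append([word for part in combo for word in part])
    return results


sentences = [" ".join(words) for words in expand("S")]
```

With two determiners, two nouns and two verbs, the rules yield 4 noun phrases and therefore 4 × 2 × 4 = 32 sentences, such as "this boy loves that girl".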
  • 13. Multiple Alignment: Compression difference: $CD = B_N - B_E$, where $B_N$ is the total number of bits in those symbols in the New pattern that are aligned with Old symbols in the alignment, and $B_E$ is the total number of bits in the symbols in the code pattern. Compression ratio: $CR = B_N / B_E$.
  • 14. Multiple Alignment: $B_N$ is calculated as $B_N = \sum_{i=1}^{h} C_i$, where $C_i$ is the size of the code for the $i$th symbol in a sequence, $H_1 \ldots H_h$, comprising those symbols within the New pattern that are aligned with Old symbols.
  • 15. Multiple Alignment: $B_E$ is calculated as $B_E = \sum_{i=1}^{s} C_i$, where $C_i$ is the size of the code for the $i$th symbol in the sequence of $s$ symbols in the code pattern derived from the multiple alignment.
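The compression measures on slides 13 to 15 reduce to two sums and two simple expressions. The sketch below works through them for given lists of per-symbol code sizes; the function name `compression_scores` and the example sizes are assumptions made for illustration, not values from the SP computer model.

```python
def compression_scores(new_symbol_bits, code_symbol_bits):
    """Return (B_N, B_E, CD, CR) for one multiple alignment.

    new_symbol_bits:  code sizes C_i (in bits) of the New symbols that are
                      aligned with Old symbols.
    code_symbol_bits: code sizes C_i (in bits) of the symbols in the code
                      pattern derived from the alignment.
    """
    b_n = sum(new_symbol_bits)   # B_N: sum of C_i for i = 1..h
    b_e = sum(code_symbol_bits)  # B_E: sum of C_i for i = 1..s
    if b_e == 0:
        raise ValueError("the code pattern must contain at least one bit")
    cd = b_n - b_e               # compression difference
    cr = b_n / b_e               # compression ratio
    return b_n, b_e, cd, cr


# Hypothetical example: four aligned New symbols of 8 bits each,
# encoded by a code pattern of two 4-bit symbols.
scores = compression_scores([8, 8, 8, 8], [4, 4])
```

A larger CD or CR means the alignment encodes the New pattern more economically in terms of the Old patterns.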
  • 20. Problems of Big Data and Solutions: Volume: big data is … BIG! Efficiency in computation and the use of energy. Unsupervised learning: discovering ‘natural’ structures in data. Transmission of information and the use of energy. Variety: in kinds of data, formats, and modes of processing. Veracity: errors and uncertainties in data. Interpretation of data: pattern recognition, reasoning. Velocity: analysis of streaming data. Visualization: representing structures and processes.
  • 21. Volume: Making Big Data Smaller: “Very-large-scale data sets introduce many data management challenges.” Information compression brings direct benefits in storage, management and transmission. Indirect benefits: efficiency in computation and the use of energy; unsupervised learning; additional economies in transmission and the use of energy; assistance in the management of errors and uncertainties in data; processes of interpretation.
  • 22. Energy, Speed and Bulk: In the SP theory, a process of searching for matching patterns is central in all kinds of ‘processing’ or ‘computing’. This means that anything that increases the efficiency of searching will increase computational efficiency and, probably, cut the use of energy: reducing the volume of big data; exploiting probabilities; cutting out some searching.
  • 23. Efficiency via Reduction in Volume: Information compression is central in how the SP system works: reducing the size of big data, and reducing the size of search terms. Both these things can increase the efficiency of searching, meaning gains in computational efficiency and cuts in the use of energy.
  • 28. Efficiency via Probabilities: Statistical knowledge flows directly from information compression in the SP system and from the intimate connection between information compression and concepts of prediction and probability. There is great potential to cut out unnecessary searching, with consequent gains in efficiency. There is potential for savings at all levels and in all parts of the system, and on many fronts in its stored knowledge.
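The connection between compression and probability mentioned on slide 28 can be illustrated with the standard Shannon relation, in which a symbol with probability p has an ideal code length of -log2(p) bits. The sketch below is a generic illustration of that relation, not the coding scheme of the SP system; `code_sizes` is a hypothetical helper and assumes every count is positive.

```python
import math


def code_sizes(counts):
    """Map each symbol to its ideal code length in bits, -log2(frequency share).

    Assumes `counts` maps symbols to positive occurrence counts.
    """
    total = sum(counts.values())
    return {symbol: -math.log2(n / total) for symbol, n in counts.items()}


# Hypothetical counts: "a" is seen twice as often as "b" or "c".
sizes = code_sizes({"a": 2, "b": 1, "c": 1})
```

Frequent symbols get short codes and rare symbols get long ones, so the same statistics that drive compression also say which candidates are most probable and therefore worth searching first.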
  • 29. Efficiency via a Synergy with Data-Centric Computing
  • 30. Efficiency via a Synergy with Data-Centric Computing: In SP-neural, SP patterns may be realized as neuronal pattern assemblies. There would be close integration of data and processing, as in data-centric computing. Direct connections may cut out some searching.
  • 31. Unsupervised Learning: Lossless compression of a body of information. Information compression, or “minimum length encoding”, remains the key, via the matching and unification of patterns. The SP computer model has already demonstrated an ability to discover generative grammars, including segmental structures, classes of structure, and abstract patterns. For a body of information, I, the products of learning are a grammar (G) and an encoding (E) of I in terms of G.
  • 33. Transmission of Data: Economies can be achieved by making big data smaller (“Volume”), and by separating grammar (G) from encoding (E), as in some dictionary techniques and analysis/synthesis schemes. Efficiency in transmission can mean cuts in the use of energy.
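The idea of separating the grammar (G) from the encoding (E) can be sketched with a simple shared dictionary: sender and receiver both hold G, so only the short codes in E need to travel. This is a toy dictionary technique chosen for illustration, not the SP system's encoding; `SHARED_GRAMMAR`, `encode` and `decode` are hypothetical names.

```python
# G: a dictionary of chunks that both sender and receiver already hold.
SHARED_GRAMMAR = ["this", "that", "girl", "boy", "loves", "hates"]


def encode(words, grammar):
    """Sender side: replace each word by its index in the shared grammar (E)."""
    index = {word: i for i, word in enumerate(grammar)}
    return [index[word] for word in words]


def decode(codes, grammar):
    """Receiver side: rebuild the words from E using the same grammar."""
    return [grammar[i] for i in codes]


message = "this boy loves that girl".split()
encoding = encode(message, SHARED_GRAMMAR)   # only this list is transmitted
restored = decode(encoding, SHARED_GRAMMAR)
```

Here six dictionary entries need only 3 bits per code, so the five-word message could be sent in 15 bits instead of 24 characters, provided G has been shared once in advance.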
  • 35. Transmission of Data: Simplicity of a focus on the matching and unification of patterns. Aims to discover structures that are “natural”. The brain-inspired “DONSVIC” principle (Discovery Of Natural Structures Via Information Compression) can mean relatively high levels of information compression. There is potential for G to include structures not recognized by most compression algorithms, such as generic 3D models of objects and scenes, and generic sequential redundancies across sequences of frames.
  • 36. Overcoming Problems of Variety of Big Data: Diverse kinds of data: the world’s many languages, spoken or written; static and moving images; music as sound and music in its written form; numbers and mathematical notations; tables; charts; graphs; networks; trees; grammars; computer programs; and more. There are often several different computer formats for each kind of data. With images, for example: JPEG, TIFF, WMF, BMP, GIF, EPS, PDF, PNG, PBM, and more. Adding to the complexity is that each kind of data and each format normally requires its own special mode of processing. THIS IS A MESS! It needs cleaning up. Although some kinds of diversity are useful, there is a case for developing a universal framework for the representation and processing of diverse kinds of knowledge (UFK).
  • 37. Universal Framework for the Representation and Processing of Knowledge (UFK): Potential benefits of a UFK in: learning structure in data; interpretation of data; data fusion; understanding and translation of natural languages; the semantic web and internet of things; long-term preservation of data; seamless integration in the representation and processing of diverse kinds of knowledge. Most concepts are an amalgam of diverse kinds of knowledge (which implies some uniformity in the representation and processing of diverse kinds of knowledge). The SP system is a good candidate for the role of UFK because of its versatility in the representation and processing of diverse kinds of knowledge.
  • 38. How Variety Hinders Learning: Discovering the association between lightning and thunder is likely to be difficult when: Lightning appears in big data as a static image in one of several formats; or in a moving image in one of several formats; or it is described, in spoken or written form, as any one of such things as “firebolt”, “fulmination”, “la foudre”, “der Blitz”, “lluched”, “a big flash in the sky”, or indeed “lightning”. Thunder is represented in one of several different audio formats; or it is described, in spoken or written form, as “thunder”, “gök gürültüsü”, “le tonnerre”, “a great rumble”, and so on. If learning and discovery processes are going to work effectively, we need to get behind these surface forms and focus on the underlying meanings. This can be done using a UFK.
  • 39. Veracity: “In building a statistical model from any data source, one must often deal with the fact that data are imperfect. Real-world data are corrupted with noise. … Measurement processes are inherently noisy, data can be recorded with error, and parts of the data may be missing.” In tasks such as parsing or pattern recognition, the SP system is robust in the face of errors of omission, addition, or substitution.
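Robustness to errors of omission, addition, or substitution can be illustrated with approximate matching of a noisy New pattern against stored Old patterns. The sketch below uses Python's standard `difflib.get_close_matches` as a stand-in; the SP system uses its own multiple-alignment search, so this only shows the general idea. `OLD_PATTERNS` and `recognise` are hypothetical names.

```python
import difflib

# Stored Old patterns (assumed vocabulary for this illustration).
OLD_PATTERNS = ["thunder", "lightning"]


def recognise(new_pattern, old_patterns, cutoff=0.6):
    """Return the best-matching Old pattern, or None if nothing is close enough."""
    matches = difflib.get_close_matches(new_pattern, old_patterns, n=1, cutoff=cutoff)
    return matches[0] if matches else None


omission = recognise("thundr", OLD_PATTERNS)         # a letter is missing
addition = recognise("thunnder", OLD_PATTERNS)       # an extra letter
substitution = recognise("lightnxng", OLD_PATTERNS)  # a wrong letter
unrelated = recognise("zzzz", OLD_PATTERNS)          # no plausible match
```

The `cutoff` parameter sets how much of the pattern must match before it is accepted, which trades tolerance of noise against the risk of false recognition.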
  • 41. Veracity: When we learn a first language (L): we learn from a finite sample; we generalize (to L) without over-generalising; we learn ‘correct’ knowledge despite ‘dirty data’.
  • 42. Veracity: For any body of data, I, principles of minimum-length encoding provide the key: Aim to minimize the overall size of G and E. G is a distillation or ‘essence’ of I that excludes most ‘errors’ and generalizes beyond I. E + G is a lossless compression of I, including typos etc., but without generalizations. Systematic distortions remain a problem.
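The aim of minimizing the overall size of G and E can be shown with a toy comparison of candidate grammars. In the sketch below, G is a list of chunks, E is a greedy tokenization of I using those chunks, and sizes are counted crudely as characters in G plus tokens in E. These measures and the names `encode` and `description_length` are assumptions for illustration; the SP computer model uses its own bit-level measures and search.

```python
def encode(text, grammar):
    """Greedy longest-match tokenization; unmatched characters become literal tokens."""
    chunks = sorted((c for c in grammar if c), key=len, reverse=True)
    tokens, i = [], 0
    while i < len(text):
        for chunk in chunks:
            if text.startswith(chunk, i):
                tokens.append(chunk)
                i += len(chunk)
                break
        else:
            # No chunk matches here, so emit the single character as a literal.
            tokens.append(text[i])
            i += 1
    return tokens


def description_length(text, grammar):
    """Crude size of G (characters in its chunks) plus size of E (number of tokens)."""
    g_size = sum(len(chunk) for chunk in grammar)
    e_size = len(encode(text, grammar))
    return g_size + e_size


data = "abcabcabcabc"
candidates = [[], ["abc"], ["abcabcabcabc"]]
best = min(candidates, key=lambda g: description_length(data, g))
```

An empty grammar leaves E long (12 tokens), a grammar that memorizes the whole of I makes G long (12 characters plus 1 token), and the middle candidate wins with 3 + 4 = 7. In every case, joining the tokens of E reproduces I exactly, so the compression is lossless.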
  • 43. Interpretation of Data: Processing I in conjunction with a pre-established grammar (G) to create a relatively compact encoding (E) of I. Depending on the nature of I and G, the process of interpretation may be seen to achieve: pattern recognition; information retrieval; parsing and production of natural language; translation from one representation to another; planning; problem solving.
  • 44. Velocity: Analysis of Streaming Data: In the context of big data, “velocity” means the analysis of streaming data as it is received. “This is the way humans process information.” This style of analysis is at the heart of how the SP system has been designed. Unsupervised learning.
  • 45. Visualizations: The SP system is well suited to visualization for these reasons: transparency in the representation of knowledge; transparency in processing; the system is designed to discover ‘natural’ structures in data. There is clear potential to integrate visualization with the statistical techniques that lie at the heart of how the SP system works.
  • 46. Conclusion: The SP theory, designed to simplify and integrate concepts across artificial intelligence, mainstream computing, and human perception and cognition, has potential in the management and analysis of big data. The SP system has potential as a universal framework for the representation and processing of diverse kinds of knowledge (UFK), helping to reduce the problem of variety in big data: the great diversity of formalisms and formats for knowledge, and how they are processed.
  • 47. Bibliography: www.cognitionresearch.org/sp.htm. J G Wolff, “Big data and the SP theory of intelligence”, IEEE Access, 2, 301-315, 2014.