Perception, representation,
structure, and recognition
Zahra Sadeghi
Quillian, 1966
Knowledge structure
• A taxonomic hierarchy can provide an efficient mechanism for storing and retrieving semantic information.
• General benefits of this structure
• Inheritance property
• Economy of use
• Generalization
• Semantic deficit
• Cognitive development
E. Rosch
M. R. Quillian
E. Warrington
Need for a more flexible structure of conceptual knowledge
• The superordinate level contains a very wide range of object categories with different appearances and shapes.
FI = F(input_image)
magnitude(x, y) = √(Re(FI(x, y))² + Im(FI(x, y))²)
phase(x, y) = tan⁻¹(Im(FI(x, y)) / Re(FI(x, y)))
FreqFeat(1) = Σ_{x,y ∈ input_image} |magnitude(x, y)|
FreqFeat(2) = Σ_{x,y ∈ input_image} log(1 + magnitude(x, y))
FreqFeat(3) = Σ_{x,y ∈ input_image} |phase(x, y)|
GI(x, y, s, o) = G_{s,o}(input_image)
gaborEnergy(s, o) = Σ_{x,y ∈ input_image} |GI(x, y, s, o)|
o ∈ {0, π/6, π/3, π/2, 2π/3, 5π/6},  s ∈ {0.5, 1, 1.5, 2}
entropy = −Σ H(I)·log(H(I))
EntropyFeat(1) = entropy(Input_image_RGB)
EntropyFeat(2) = entropy(Input_image_gray)


Diversity = the number of filled bins of the histogram
Variability = the number of peaks of the histogram
colorHfeat(1) = diversity(input_image_RGB)
colorHfeat(2) = variability(input_image_RGB)
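The four feature sets might be computed as in the following sketch; the Gabor scale-to-sigma mapping, the histogram bin counts, and the peak definition are illustrative assumptions:

```python
import numpy as np
from scipy.signal import fftconvolve

def freq_feat(img):
    """Frequency features from the 2-D Fourier transform."""
    FI = np.fft.fft2(img)
    magnitude = np.abs(FI)            # sqrt(Re^2 + Im^2)
    phase = np.angle(FI)              # arctan(Im / Re)
    return np.array([
        np.abs(magnitude).sum(),      # FreqFeat(1)
        np.log1p(magnitude).sum(),    # FreqFeat(2): log(1 + magnitude)
        np.abs(phase).sum(),          # FreqFeat(3)
    ])

def gabor_energy(img, scales=(0.5, 1, 1.5, 2),
                 orients=(0, np.pi/6, np.pi/3, np.pi/2, 2*np.pi/3, 5*np.pi/6)):
    """Sum of |GI(x, y, s, o)| for each Gabor filter G_{s,o}."""
    feats = []
    for s in scales:
        for o in orients:
            sigma, lam = 2.0 * s, 4.0 * s          # assumed scale mapping
            half = int(3 * sigma)
            y, x = np.mgrid[-half:half + 1, -half:half + 1]
            xr = x * np.cos(o) + y * np.sin(o)     # carrier along orientation o
            g = np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / lam)
            feats.append(np.abs(fftconvolve(img, g, mode='same')).sum())
    return np.array(feats)

def entropy_feat(img, bins=256):
    """entropy = -sum H(I) log H(I), with H the normalised histogram."""
    h, _ = np.histogram(img, bins=bins)
    p = h / h.sum()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def color_h_feat(img, bins=64):
    """Diversity = filled histogram bins; variability = histogram peaks."""
    h, _ = np.histogram(img, bins=bins)
    diversity = int(np.count_nonzero(h))
    # a peak: a bin strictly larger than both of its neighbours
    variability = int(np.sum((h[1:-1] > h[:-2]) & (h[1:-1] > h[2:])))
    return np.array([diversity, variability])
```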


[Figure: clustering performance (RI, F, P, R) of the frequency, orientation, color, and entropy feature sets (FreqFeat, GaborFeat, EntropyFeat, colorHFeat)]
Global strategy:
for im = 1 to size(ImageDataset) do
    InputImage = ImageDataset(im)
    BF = BasicFeatures(InputImage)
    CF(im, :) = ComplexFeatures(BF)
end for
Cluster(CF)

Local strategy:
for im = 1 to size(ImageDataset) do
    image = ImageDataset(im)
    for i = 1 to exploringWindowsNum do
        InputImage = randomPatch(image)
        BF(i, :) = BasicFeatures(InputImage)
        CFi(i, :) = ComplexFeatures(BF, InputImage)
    end for
    CF(im, :) = Average(CFi)
end for
Cluster(CF)
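The two strategies translate roughly to the Python sketch below; BasicFeatures/ComplexFeatures are reduced to placeholders, and k-means stands in for the clustering step:

```python
import numpy as np
from sklearn.cluster import KMeans

def basic_features(img):
    # placeholder for the low-level feature maps (frequency, Gabor, ...)
    return img

def complex_features(bf):
    # placeholder: summarise a feature map into a fixed-length vector
    return np.array([bf.mean(), bf.std()])

def global_strategy(dataset, n_clusters=2):
    """One feature vector per whole image, then cluster."""
    CF = np.array([complex_features(basic_features(img)) for img in dataset])
    return KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(CF)

def local_strategy(dataset, windows=10, patch=16, n_clusters=2,
                   rng=np.random.default_rng(0)):
    """Average feature vectors over random patches of each image."""
    CF = []
    for img in dataset:
        rows = []
        for _ in range(windows):
            r = rng.integers(0, img.shape[0] - patch)
            c = rng.integers(0, img.shape[1] - patch)
            p = img[r:r + patch, c:c + patch]
            rows.append(complex_features(basic_features(p)))
        CF.append(np.mean(rows, axis=0))
    return KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(np.array(CF))
```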
[Figure: clustering performance (RI, F, P, R) of FreqFeat, GaborFeat, EntropyFeat, and colorHFeat on the Hemera and Caltech-coil datasets]
• We compared the discriminative power of the energy of low-level visual features, i.e., the color, orientation, and frequency feature sets, in an unsupervised manner.
Superordinate level: Artificial/Natural
Sadeghi, Z., Ahmadabadi, M. N., & Araabi, B. N. (2013). Unsupervised categorization of objects into artificial and natural
superordinate classes using features from low-level vision. International Journal of Image Processing, 7(4), 339-352.
Basic (intermediate) level: Animal/Plant
[Figure: vertical and horizontal projections and left, right, top, and bottom profiles for a sample animal image and a sample plant image]
[Figure: classification accuracy (0.7–0.88) vs. number of hidden units (raw, 25, 50, 75, 100) for the h, v, l, r, t, b descriptors]
unsupervised    h        v        l        r        t        b        [h,v,t,b]
P               68.57    69.17    60.94    59.08    67.77    65.07    70.21
R               59.98    59.70    53.70    58.87    58.02    55.68    60.39
F1-score        63.99    64.09    57.09    58.97    62.52    60.01    64.93
accuracy        60.57    60.93    52.86    52.17    59.37    56.66    61.91
Shape descriptors: moment, profile, and projection descriptors
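The projection and profile descriptors can be sketched as follows (the exact profile convention, first foreground index seen from each side, is an assumption):

```python
import numpy as np

def shape_descriptors(mask):
    """Projection and profile descriptors of a binary object mask.

    Projections count foreground pixels per row/column; profiles record
    the first foreground index seen from each side of the image.
    """
    H, W = mask.shape
    vertical = mask.sum(axis=0)          # v: column sums
    horizontal = mask.sum(axis=1)        # h: row sums
    cols_any = mask.any(axis=0)
    rows_any = mask.any(axis=1)
    # sentinel values (W, -1, H, -1) mark empty rows/columns
    left = np.where(rows_any, mask.argmax(axis=1), W)                     # l
    right = np.where(rows_any, W - 1 - mask[:, ::-1].argmax(axis=1), -1)  # r
    top = np.where(cols_any, mask.argmax(axis=0), H)                      # t
    bottom = np.where(cols_any, H - 1 - mask[::-1].argmax(axis=0), -1)    # b
    return dict(h=horizontal, v=vertical, l=left, r=right, t=top, b=bottom)
```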
[Figure: counts (0–50) of responses for boys and girls]
Children’s mental object representations: shape-concept
is the dominant strategy
Sadeghi, Z. (2019). Visual Categorization of Objects into Animal and Plant Classes Using Global Shape
Descriptors. arXiv preprint arXiv:1901.11398.
Utilizing hierarchical information
We defined a dictionary of features based on a PCA approach in two modes: flat and hierarchical subspaces.
Our aim was to show the effect of contextual prior information on improving recognition accuracy.
[Figure: per-category recognition results in flat vs. conceptual modes, using the total and half numbers of eigenvectors (#nt = 20, 18, 30, 24), for cougar-body, elephant, flamingo, gerenuk, pigeon, rooster, bonsai, joshua-tree, lotus, strawberry, sunflower, and water-lilly]
flat mode: u = Pᵀ·A
conceptual mode: u_S = P_Sᵀ·A,  subFeat = [u_Animal, u_Plant]
[Diagram: SVM classification in flat mode (single eigenspace) vs. hierarchical mode (separate Animal and Plant eigen-subspaces)]
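Assuming P holds the eigenvectors of the pooled (flat) space and P_S those of each class subspace, the two projection modes might be sketched as:

```python
import numpy as np

def eigenspace(X, k):
    """Top-k PCA eigenvectors (columns) of data matrix X (samples x dims)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k].T                        # dims x k

def flat_features(a, P):
    """Flat mode: u = P^T a, one shared eigenspace for all classes."""
    return P.T @ a

def conceptual_features(a, P_animal, P_plant):
    """Conceptual mode: concatenate projections onto each class subspace."""
    return np.concatenate([P_animal.T @ a, P_plant.T @ a])
```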
Sadeghi, Z., Araabi, B. N., & Ahmadabadi, M. N. (2015). A computational approach towards visual object recognition at taxonomic levels of concepts. Computational Intelligence and Neuroscience, 2015, 72-72.
Sadeghi, 2016
• We studied the human knowledge structure of visual form and appearance information from a behavioral dataset.
• We trained a 3-layer deep belief network on this dataset and performed an unsupervised learning scheme on the obtained deep representations.
• There is a progressive differentiation in the layers of a deep network trained on data with a hierarchical structure.
• In a distributed approach such as a neural network, connections are potentially sensitive to many kinds of structure.
Developmental learning in DNNs
• There is a progression in depth in the hidden layers of the DBN, where low-level layers represent finer distinctions and high-level layers represent coarser distinctions.
Sadeghi, Z. (2016). Deep learning and developmental learning: emergence of fine-to-coarse conceptual categories at layers of deep belief network. Perception, 45(9), 1036-1045.
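A minimal stand-in for the layer-wise analysis, using sklearn's BernoulliRBM as a greedy DBN approximation (the layer sizes and iteration counts are arbitrary assumptions):

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.cluster import KMeans

def dbn_layer_representations(X, layer_sizes=(64, 32, 16), seed=0):
    """Greedy layer-wise RBM training; returns each layer's representation.

    A simplified stand-in for a 3-layer deep belief network: each RBM is
    trained on the previous layer's hidden activations.
    """
    reps, H = [], X
    for n in layer_sizes:
        rbm = BernoulliRBM(n_components=n, n_iter=5, random_state=seed)
        H = rbm.fit_transform(H)
        reps.append(H)
    return reps

def cluster_each_layer(reps, n_clusters=2, seed=0):
    """Unsupervised clustering of the representation at every depth."""
    return [KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
            .fit_predict(H) for H in reps]
```

Comparing the cluster labels at each depth against fine- and coarse-grained category labels is one way to probe the fine-to-coarse progression described above.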
Featural vs. distributed approach in object representation
[Figures: similarity matrices for objects in the scene dataset, the feature dataset, and the verbal dataset]
• Like the feature similarity approach, representations derived from the scene dataset revealed strong relationships within category co-ordinates.
• Unlike the feature approach, analysis of the scene dataset also captured information about cross-category associations.
• Structure of semantic relationships
• Taxonomic similarity: object concepts are related to the degree to which they share basic properties or features
• Associative similarity: concepts (words) are related to the degree to which they occur in similar linguistic contexts
• Here, we investigated whether the second approach could be
applied to non-verbal patterns of object co-occurrence in
natural environments.
Sadeghi, Z., McClelland, J. L., & Hoffman, P. (2015). You shall know an object by the company it keeps: An investigation of semantic
representations derived from object co-occurrence in visual scenes. Neuropsychologia, 76, 52-61.
Correlation matrix
The hierarchical tree captures many strong similarity relations (reflected by dark red near the main diagonal) but also misses many others (dark red not near the main diagonal).
Human knowledge
structure
Agglomerative clustering
A discrete structure may often provide an imperfect guide to the
full structure present in a data set.
We offer the view that semantic structure might best be captured by a more flexible system of representation that can be sensitive to multiple types of structure present in a data set.
• The first dimension identifies aquatic vs. non-aquatic mammals;
• the second identifies predators vs. prey;
• and the third picks out the size dimension.
these are cross-cutting dimensions
McClelland, J. L., Sadeghi, Z., & Saxe, A. M. (2017). A Critique of Pure Hierarchy:
Uncovering Cross-Cutting Structure in a Natural Dataset. In Neurocomputational
Models of Cognitive Development and Processing: Proceedings of the 14th Neural
Computation and Psychology Workshop (pp. 51-68).
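The limits of a discrete hierarchy can be checked quantitatively: agglomerative clustering of a correlation matrix yields a tree whose cophenetic distances can be compared with the raw similarities. The cophenetic-correlation check below is an illustrative approach, not the chapter's exact analysis:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import squareform

def tree_vs_data_similarity(features):
    """Compare raw pairwise correlations with the tree's cophenetic distances.

    A low cophenetic correlation indicates the discrete hierarchy misses
    similarity structure (e.g. cross-cutting dimensions) present in the data.
    """
    corr = np.corrcoef(features)              # item-by-item correlation matrix
    dist = 1 - corr
    np.fill_diagonal(dist, 0.0)
    dist = (dist + dist.T) / 2                # enforce exact symmetry
    condensed = squareform(dist, checks=False)
    Z = linkage(condensed, method='average')  # agglomerative clustering
    coph_corr, _ = cophenet(Z, condensed)
    return corr, Z, coph_corr
```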
Neurons’ encoding and representation
• quantification of the degree of neurons’ responses to different classes
MF(nᵢ) = flatness(nᵢ) / sparsity(nᵢ)
• Localist representation: neurons respond selectively to one thing
• Distributed representation: each unit or neuron is involved in coding many
different things
The multi-faceted degree quantifies the number of sources of information that stimulate each neuron.
Rodrigo Quian Quiroga, Itzhak Fried and Christof Koch, 2013
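A sketch of how such a multi-faceted degree might be computed from a neuron's per-class response vector; the Treves-Rolls sparseness and entropy-based flatness used here are illustrative assumptions, not necessarily the paper's exact definitions:

```python
import numpy as np

def sparsity(r):
    """Treves-Rolls sparseness of a class-response vector (assumed measure).

    0 for a uniform response profile, 1 for a one-hot (fully selective) one.
    """
    r = np.asarray(r, float)
    n = r.size
    a = (r.sum() / n) ** 2 / (np.square(r).sum() / n)
    return (1 - a) / (1 - 1 / n)

def flatness(r):
    """Normalised entropy of the response distribution (assumed measure)."""
    p = np.asarray(r, float)
    p = p / p.sum()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum() / np.log(len(r)))

def mf_degree(r):
    """Multi-faceted degree: flatness relative to sparsity."""
    return flatness(r) / max(sparsity(r), 1e-12)
```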
SF Neuron MF Neuron
IC (independent component) visualization can reveal the multi-responsive property of neurons based on the similarity of the patterns captured by each component.
Sadeghi, Z. (2020). Conceptual Content in Deep Convolutional Neural Networks: An analysis into multi-
faceted properties of neurons. In Proceedings of International Joint Conference on Computational
Intelligence: IJCCI 2019 (pp. 19-31). Springer Singapore.
Attention in object recognition
• Comparing the performance of model and human in rapid object recognition in the task of animal/non-animal classification
Linsley et al., 2018
Eberhardt et al., 2014
Importance maps from human viewpoints in ordered time frames
While recognition accuracy increases with higher stages of
visual processing, human decisions agreed best with
predictions from intermediate stages.
• cueing deep nets to more meaningful
image regions derived from experimental
study
• Comparing performance of human and
model in different levels of occlusion
Sadeghi, Z. (2020). An Investigation on Performance of Attention Deep Neural Networks in Rapid Object
Recognition. In Intelligent Computing Systems: Third International Symposium, ISICS 2020, Sharjah,
United Arab Emirates, March 18–19, 2020, Proceedings 3 (pp. 1-10). Springer International Publishing.
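One simple way to cue a network toward meaningful image regions is a multiplicative mask derived from an importance map; the `floor` parameter and the blending scheme below are assumptions for illustration:

```python
import numpy as np

def cue_with_importance(img, importance, floor=0.2):
    """Attenuate pixels outside the important regions (assumed cueing scheme).

    `importance` is a map in [0, 1], e.g. aggregated human fixations; `floor`
    keeps some signal everywhere so the network still sees global context.
    """
    w = floor + (1 - floor) * importance
    # broadcast the weight map over the channel axis for colour images
    return img * w[..., None] if img.ndim == 3 else img * w
```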
[Figure: consistent vs. inconsistent cue cases]
Occluded object recognition
Comparison                               p-value
Hit const vs. hit inconst                0.0027
Miss const vs. miss inconst              0.0027
Sup hit const vs. sup hit inconst        0.0027
Sup miss const vs. sup miss inconst      0.0027
Hypo_pos1 vs. hypo_neg1                  0.0027
Hypo_pos2 vs. hypo_neg2                  0.0027
Resp-time const vs. inconst              4.6921e-11
Sadeghi, Z. (2020). The effect of top-down
attention in occluded object recognition. arXiv
preprint arXiv:2007.10232.
Visualization and information analysis
Sadeghi, Z. (2019, September). An Information Analysis
Approach into Feature Understanding of Convolutional
Deep Neural Networks. In Machine Learning, Optimization,
and Data Science: 5th International Conference, LOD
2019, Siena, Italy, September 10–13, 2019,
Proceedings (pp. 36-44).
