3rd edition of the International Conference on Advanced Intelligent Systems for
Sustainable Development
December 21-26, 2020, Tangier, Morocco
A Hybrid Deep Learning Network
CNN-SVM for 3D Mesh Segmentation
Laboratory of Mathematics, Computer Science and Engineering Sciences
(LMCSES)
Presented By:
ABOUQORA Younes
1
2
Plan
1 Goal
2 Related Works
3 Proposed Techniques
4 Learning Processes
5 Test and Results
6 Conclusion
3
Goal
[Figure: input 3D object, segmented 3D object, and output labeled 3D object with parts head, torso, right foot, left foot, left hand, right hand]
 Segmentation and assignment of a label to each part of a 3D object
4
Related work: 3D Mesh Segmentation
[Kraevoy et al. 2007] Shuffler
[Dai et al. 2018] Joint segmentation
[Jun Yang 2016] Consistent segmentation
[Xu et al. 2010] Style separation
[Kalogerakis et al. 2010] Supervised segmentation
[van Kaick et al. 2011] Supervised correspondence
5
Proposed Techniques: Labeling problem statement
 We propose an automatic approach for labeling 3D object parts based on the shape of each segment.
C = {Head, Torso, Right foot, Left foot, Right hand, Left hand}
[Figure: a human mesh with segment labels c1 to c6 corresponding to Head, Torso, Right foot, Left foot, Right hand, Left hand]
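As a small illustration of the labeling target, here is a minimal Python sketch (the function name and the toy scores are hypothetical) that assigns each face the label from C with the highest score; in the full pipeline these scores come from the CNN-SVM stage described later.

```python
import numpy as np

# Label set C from the problem statement.
LABELS = ["Head", "Torso", "Right foot", "Left foot", "Right hand", "Left hand"]

def assign_labels(face_probs):
    """face_probs: (num_faces, len(LABELS)) array of per-face label scores
    (assumed to come from a classifier). Returns one label name per face."""
    face_probs = np.asarray(face_probs, dtype=float)
    return [LABELS[i] for i in np.argmax(face_probs, axis=1)]

# Toy example with random scores for a 4-face mesh.
print(assign_labels(np.random.rand(4, len(LABELS))))
```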
6
Proposed Techniques: Feature extraction
Shape Analysis
Local Shape Properties:
 Curvature [Rusinkiewicz 2004]
 Shape Diameter Function [Shapira et al. 2008]
 Diffusion Distance [de Goes et al. 2008]
 Shape contexts
 Geodesic distance
 Spin images
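The slides and speaker notes state that the per-face descriptors are concatenated into a 600-component vector and re-organized into a 20×30 feature matrix. Below is a minimal sketch of that packing step, assuming the individual descriptors have already been computed by standard geometry-processing tools; the function name and toy descriptor widths are illustrative.

```python
import numpy as np

def build_feature_matrices(descriptors):
    """descriptors: list of per-face descriptor arrays, each of shape
    (num_faces, d_k), e.g. curvature, SDF, spin images, shape contexts
    (assumed to be precomputed). Returns an array of shape
    (num_faces, 20, 30, 1) ready for a CNN, assuming the concatenated
    descriptors total 600 components per face."""
    feats = np.concatenate(descriptors, axis=1)          # (num_faces, 600)
    assert feats.shape[1] == 600, "expected 600 components per face"
    return feats.reshape(-1, 20, 30, 1).astype(np.float32)

# Toy usage: three fake descriptor blocks of widths 64, 236 and 300.
toy = [np.random.rand(10, 64), np.random.rand(10, 236), np.random.rand(10, 300)]
X = build_feature_matrices(toy)
print(X.shape)  # (10, 20, 30, 1)
```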
7
Learning process: CNN-SVM Architecture
 Concatenate and re-organize the extracted features to form a 20×30 feature matrix [Guo et al. 2015].
 Train the deep model to predict the label probability distribution (a sketch follows below).
 Use an SVM classifier with a linear kernel as the last layer to assign the correct face label.
 Refine using contextual information through a graph-cut post-process.
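A possible Keras sketch of such a network is given below. The 20×30×1 input, categorical hinge loss, Adam optimizer, batch size of 512 and up to 50 epochs are taken from the speaker notes; the number and width of the layers, the layer name "deep_features" and the toy data are assumptions made for illustration, not the exact architecture used in the work.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

NUM_LABELS = 6  # e.g. the human label set C

# Illustrative CNN over the 20x30 per-face feature matrix.
model = keras.Sequential([
    layers.Input(shape=(20, 30, 1)),
    layers.Conv2D(32, 3, activation="relu", padding="same"),
    layers.MaxPooling2D(2),
    layers.Conv2D(64, 3, activation="relu", padding="same"),
    layers.MaxPooling2D(2),
    layers.Flatten(),
    layers.Dense(128, activation="relu", name="deep_features"),
    layers.Dense(NUM_LABELS, activation="linear"),  # linear outputs for hinge loss
])

# Loss and optimizer as stated in the speaker notes.
model.compile(optimizer="adam", loss="categorical_hinge", metrics=["accuracy"])

# X: (num_faces, 20, 30, 1) feature matrices, y: one-hot labels (toy data here).
X = np.random.rand(1024, 20, 30, 1).astype("float32")
y = keras.utils.to_categorical(np.random.randint(0, NUM_LABELS, 1024), NUM_LABELS)
model.fit(X, y, batch_size=512, epochs=50, validation_split=0.2, verbose=0)
```

In this sketch the "deep_features" layer would then serve as the high-level feature extractor feeding the SVM described on the next slide.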
8
Learning process: SVM Prediction
 The classifier takes as input the high-level features extracted from the pre-trained CNN.
 The SVM provides stable results and trains faster on complex training data with many labels.
 For parameter optimization, we use 10-fold cross-validation to fix the C and kernel-parameter values (see the sketch below).
[Figure: test error of the SVM over the C and kernel-parameter values]
[Figure: basic concept of the SVM (maximum-margin decision hyperplane)]
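Continuing the previous sketch (reusing model, X and y), a possible scikit-learn implementation of this stage is shown below; the C grid and the use of LinearSVC are illustrative choices, not necessarily those of the original work.

```python
import numpy as np
from tensorflow import keras
from sklearn.svm import LinearSVC
from sklearn.model_selection import GridSearchCV

# Feature extractor: outputs of the penultimate ("deep_features") layer of the
# CNN sketched above (the layer name is an assumption of this illustration).
extractor = keras.Model(model.input, model.get_layer("deep_features").output)
feats = extractor.predict(X, verbose=0)          # (num_faces, 128)
labels = np.argmax(y, axis=1)                    # integer class per face

# 10-fold cross-validation over C for a linear-kernel SVM.
grid = GridSearchCV(LinearSVC(), {"C": [0.01, 0.1, 1, 10, 100]}, cv=10)
grid.fit(feats, labels)
print("best C:", grid.best_params_["C"])

# Per-face label predictions (and decision scores usable by the graph cut).
pred = grid.best_estimator_.predict(feats)
scores = grid.best_estimator_.decision_function(feats)
```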
9
Refinement Process: graph cut
$E(\mathbf{x}) = \sum_i \psi_i(x_i) + \sum_{(i,j)} \psi_{ij}(x_i, x_j)$  (data term + smoothness term)
Data term: $\psi_i(x_i) = -\log\big(p(x_i \mid l)\big)$, estimated from the CNN-SVM prediction.
Smoothness term: $\psi_{ij}(x_i, x_j) = -\log f(\ell_{ij}, \theta_{ij})$ for $x_i \neq x_j$ (and 0 otherwise), where $\ell_{ij}$ and $\theta_{ij}$ are the distance and dihedral angle between adjacent triangles $i$ and $j$.
How to solve this optimisation problem?
 Transform it into an α-expansion problem
 Solve it using the α-expansion algorithm
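The sketch below illustrates the refinement energy on a face-adjacency graph. It uses iterated conditional modes (ICM) as a simplified stand-in for the α-expansion optimisation named on the slide, and assumes the per-face probabilities and the pairwise weights (derived from distances and dihedral angles) are precomputed; all names are illustrative.

```python
import numpy as np

def refine_labels(probs, pairwise_weight, n_iters=10, lam=1.0, eps=1e-9):
    """Greedy ICM minimisation of
        E(x) = sum_i -log p(x_i) + lam * sum_(i,j) w_ij * [x_i != x_j],
    used here as a simplified stand-in for alpha-expansion.

    probs:           (num_faces, num_labels) probabilities from the CNN-SVM.
    pairwise_weight: dict (i, j) -> nonnegative cost w_ij of giving adjacent
                     faces i and j different labels (per the slides, derived
                     from the distance and dihedral angle between them).
    """
    probs = np.asarray(probs, dtype=float)
    unary = -np.log(probs + eps)            # data term
    labels = probs.argmax(axis=1)           # initial labeling from the classifier
    neighbors = {}
    for (i, j), w in pairwise_weight.items():
        neighbors.setdefault(i, []).append((j, w))
        neighbors.setdefault(j, []).append((i, w))

    for _ in range(n_iters):
        changed = False
        for i in range(len(labels)):
            # Cost of each candidate label for face i, with neighbours held fixed.
            costs = [unary[i, l] + lam * sum(w for j, w in neighbors.get(i, [])
                                             if labels[j] != l)
                     for l in range(probs.shape[1])]
            best = int(np.argmin(costs))
            if best != labels[i]:
                labels[i], changed = best, True
        if not changed:
            break
    return labels
```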
10
Test and Results: Dataset
 19 classes, each composed of 20 meshes, from the Princeton Shape Benchmark [Shilane et al. 2007], using the ground-truth labeling created by Kalogerakis et al. [2010].
SHREC'07: Generic 3D Watertight Meshes [Giorgi et al. 2007]
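A minimal sketch of how the per-category train/test splits might be organised, assuming a hypothetical on-disk layout of one folder of 20 meshes per category; the 5-fold protocol follows the speaker notes.

```python
from pathlib import Path
from sklearn.model_selection import KFold

# Hypothetical layout (not from the slides): one folder per category,
# each holding its 20 labeled .off meshes from the benchmark.
DATASET_ROOT = Path("psb_labeled")

if DATASET_ROOT.exists():
    for category_dir in sorted(p for p in DATASET_ROOT.iterdir() if p.is_dir()):
        meshes = sorted(category_dir.glob("*.off"))
        if len(meshes) != 20:
            continue  # each benchmark category is expected to hold 20 meshes
        # 5-fold cross-validation per category (speaker notes): each fold keeps
        # 4 meshes for testing and trains on the remaining 16.
        for fold, (tr, te) in enumerate(KFold(n_splits=5).split(meshes)):
            train = [meshes[i] for i in tr]
            test = [meshes[i] for i in te]
            print(category_dir.name, "fold", fold,
                  f"{len(train)} train / {len(test)} test")
```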
11
Test and Results: Labeling results
 Qualitative Evaluation
 The following figure shows the 3D meshes used for training and the visual results obtained for labeling:
[Figure: training data set (left) and testing results (right), colored by label: Head, Torso, Right foot, Left foot, Left hand, Right hand]
12
Test and Results: Refinement visualisation
[Figure: (a) ground truth, (b) unrefined, (c) refined]
Our 3D mesh segmentation with and without the graph-cut smoothness term. The segment boundaries (red circles) are poor when graph cut is not used (b) and very good when it is used (c).
13
Test and Results: Performance
 Quantitative Evaluation: segmentation accuracy (%)

Category   | Kalogerakis et al. 2010 | Guo et al. 2015 | ShapePFCN 2017 | George et al. 2018 | Ours
Human      | 93.20 | 91.22 | 93.80 | 89.81 | 94.03
Cup        | 99.60 | 99.73 | 93.70 | 99.73 | 99.76
Glasses    | 97.20 | 97.60 | 96.30 | 97.09 | 97.47
Airplane   | 96.10 | 96.67 | 92.50 | 96.52 | 97.38
Ant        | 98.80 | 98.80 | 98.90 | 98.75 | 98.78
Chair      | 98.40 | 98.67 | 98.10 | 98.41 | 99.14
Octopus    | 98.40 | 98.79 | 98.10 | 98.41 | 98.97
Table      | 99.30 | 99.55 | 99.30 | 99.55 | 99.74
Teddy      | 98.10 | 98.24 | 96.50 | 99.55 | 98.50
Hand       | 88.70 | 88.71 | 88.70 | 89.81 | 89.88
Plier      | 96.20 | 96.22 | 95.70 | 95.61 | 96.34
Fish       | 95.60 | 95.64 | 95.90 | 96.44 | 97.23
Bird       | 87.90 | 88.35 | 86.30 | 91.67 | 91.96
Armadillo  | 90.10 | 92.27 | 93.30 | 93.74 | 93.94
Bust       | 62.10 | 69.84 | 66.40 | -     | 71.11
Mech       | 90.50 | 95.60 | 97.90 | -     | 99.60
Bearing    | 86.60 | 92.46 | 91.20 | -     | 93.41
Vase       | 85.80 | 89.11 | 85.70 | 85.75 | 91.00
Fourleg    | 86.20 | 87.02 | 89.50 | 86.74 | 93.23
Average    | 92.04 | 93.39 | 92.51 | 94.84 | 94.81
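The speaker notes define the reported accuracy as the area of correctly classified faces over the total surface area. A small sketch of that metric (function name and toy values are illustrative):

```python
import numpy as np

def labeling_accuracy(pred_labels, gt_labels, face_areas):
    """Segmentation accuracy as described in the speaker notes: the area of
    correctly labeled faces divided by the total surface area (in percent)."""
    pred_labels, gt_labels = np.asarray(pred_labels), np.asarray(gt_labels)
    face_areas = np.asarray(face_areas, dtype=float)
    correct = (pred_labels == gt_labels)
    return 100.0 * face_areas[correct].sum() / face_areas.sum()

# Toy example: 4 faces, 3 of them labeled correctly.
print(labeling_accuracy([0, 1, 1, 2], [0, 1, 2, 2], [1.0, 2.0, 0.5, 1.5]))
```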
14
Conclusion
 Complementary local features combining geometric and spatial information
 Learning high-level features from the low-level features using a CNN
 Label prediction using a linear SVM classifier
 Graph-cut refinement based on α-expansion
 Good segmentation performance compared with the state of the art
15
Thank you
Laboratory of Mathematics, Computer Science and Engineering
Sciences (LMCSES)


Editor's Notes

  • #1 Hello everyone, I am Younes ABOUQORA, from the Laboratory of Mathematics, Computer Science and Engineering Sciences at Hassan 1st University. My presentation today is about "A Hybrid Deep Learning Network CNN-SVM for 3D Mesh Segmentation".
  • #2 The outline of our presentation is as follows: first the goal, then related works, the proposed techniques, the learning processes, tests and results, and finally the conclusion.
  • #3 The goal of our work is to implement a deep architecture to segment and label 3D shape parts. To do this, our processing stages are, first, feature-vector extraction combining local geometric features and spatial information. Then, a CNN is trained in a supervised manner, connected to a Support Vector Machine (SVM) classifier as the last layer, to map each triangle's feature vector to a label vector indicating its probabilities. Finally, a graph-based mesh labeling algorithm is adopted to optimize the triangle labels by considering label consistency.
  • #4 There have been several related works in 3D mesh segmentation. Some works provide consistent segmentation for a pair of shapes but cannot guarantee consistency across the whole set. Some works do aim for consistent segmentation across the set, but they assume a global rigid alignment between all the shapes, which is quite restrictive. Supervised methods can provide better results, but they require a large number of manually segmented shapes as training data; PSB and COSEG have sufficiently many samples to train a network with reasonable generalizability.
  • #5 Here is our problem statement. Our goal is to assign a label to each mesh face, given a fairly fine set of labels. The label c of each face depends on the surrounding surface geometry and also on the labels of the neighbouring faces, so all labels need to be optimised jointly. To do this we use the set C of possible labels, and our model performs this optimisation of label assignment globally on the mesh. For this segmentation paradigm we need to find a mapping from the features of each triangle to label probabilities for that face. To do so, we need features that are informative of the types of parts. We use several descriptors to build the input feature, such as surface curvature, singular values, shape diameter, distances from the medial surface, and so on.
  • #6 Much of the existing work in shape segmentation is driven by features. These can be defined per face, per vertex, per patch (a cluster of faces), or even per shape. They are designed for different purposes, and many have been successfully applied in mesh segmentation. Per-face features include the SDF [19], which estimates the thickness of a shape at a given face; the Conformal Factor (CF) [20], which computes a pose-invariant representation of the curvature of non-rigid shapes; and Spin Images (SI), which capture the surface information around a face in a 2D histogram. Recent works have also adapted image-based features to 3D; one notable example is the Shape Context (SC) [21], a 3D shape descriptor that encodes both curvature and geodesic-distance distributions in a 2D histogram [23]. However, how useful a feature is depends heavily on the shape, so feature selection for a new technique is very important, as it can impact accuracy and speed. For these reasons, we opted to use seven features that are widely used in recent works and have proven highly representative; they have been shown to be very useful in shape segmentation [24, 23, 26].
  • #7 For the learning process, we concatenate the features into a high-dimensional feature vector (600 components) and re-organize them to form a feature matrix. In this manner, low-level geometric features can be non-linearly combined and hierarchically compressed through the convolutional operations in CNNs. Guo et al. demonstrate that the ordering of the features, as well as the size of the feature matrix, only slightly changes the accuracy of mesh labeling. The CNN architecture consists of three main types of layers (convolutional layers, pooling layers and fully connected layers) stacked on top of each other to form a full CNN model. Our network was trained on the training set with "categorical_hinge" as the loss function and Adam [47] as the optimizer. We used a batch size of 512 samples and up to 50 epochs, and the best model was selected based on the validation loss. However, this trained model is used only as a high-level feature extractor: we then use an SVM classifier with a linear kernel, which takes the flattened output of the last layers as a new feature vector to predict the probability of a face belonging to the correct segment.
  • #8 Most deep learning methods for classification using fully connected and convolutional layers use a softmax layer objective to learn the lower-level parameters. However, some studies replace the softmax layer with a support vector machine (SVM) in the neural network architecture and achieve great results. Inspired by these works, we study the performance of a CNN with a linear SVM classifier on the mesh labeling problem based on the PSB dataset. In the first layers, the input features are extracted from the fully connected layer of the pre-trained CNN model and used to train the SVM. In the last layer, we adapt the CNN with a hinge loss function using an L2 norm to create a new architecture, namely CNN-SVM. The results show that the SVM performs better than softmax in the CNN on this labeling problem. The support vector machine (SVM) is a supervised classifier proposed by Vapnik. It was introduced to solve two-class pattern recognition problems using the Structural Risk Minimization principle: given a training set in a vector space, this method finds the best decision hyperplane that separates a set of positive examples from a set of negative examples with maximum margin.
  • #9 To achieve a high-level segmentation with smooth boundaries in an interactive system, the user would typically have to spend time in a refinement stage, fine-tuning the small details of the boundaries to achieve the desired result. This task is both tedious and time consuming, so to alleviate it we introduce an automatic boundary refinement algorithm. One solution comes from the observation that segment boundaries typically lie on regions with a large change in thickness. We therefore include a boundary refinement algorithm based on graph cut: let i and j be two adjacent faces in the mesh, and x_i the label of face i. We optimise the labels of all faces by minimizing the objective function on the slide, whose data term is estimated from the CNN-SVM prediction and whose smoothness term depends on the distance and dihedral angle between the triangles i and j.
  • #10 Now, in order to evaluate the method, we use the Princeton Segmentation Benchmark provided by Chen et al., comprising 380 labeled meshes. We train and test our method separately for each of the 19 categories, with 20 meshes each.
  • #11 We use 5-fold cross-validation: the set is split into 5 equal (or nearly equal) subsets. In each run, a single subset is left out of the training set and used for testing; this is repeated for all 5 folds. We fix the number of iterations to 50.
  • #12 After prediction, the optimization algorithm is applied as a post-process. It refines the labeling by penalizing the predicted result based on the smoothness of the surface, which is intended to sharpen the boundaries between two labels. Based on the experimental results on this data set, the optimization algorithm does not improve accuracy significantly. The figure above shows the visualization of three samples from the test set.
  • #13 The accuracy is computed as the percentage of the area of correctly classified faces over the total surface area. The performance of existing methods is taken from publicly reported results in the literature. Our approach outperforms all existing methods on 17 out of 19 categories; in the remaining 2 categories, its accuracy is slightly below the prior best.