This document provides a summary of image classification using deep learning techniques. It begins with an introduction to the speaker and their background. It then discusses the main types of image AI tasks like classification, detection, and segmentation. The document reviews the history and timeline of deep learning, important datasets like ImageNet, and algorithms such as convolutional neural networks. It presents the typical process flow for image-based deep learning including feature extraction using convolutional and pooling layers, classification layers, and different network architectures. The document concludes by discussing a homework assignment on building a multi-class image classification model using a dataset of dog, cat, and bird images.
Computer Vision, abbreviated as CV, aims to teach computers to achieve human-level vision capabilities. Applications of CV in self-driving cars, robotics, healthcare, education, and the multitude of apps that let customers use smartphone cameras to convey information have made it one of the most popular fields in Artificial Intelligence. Recent advances in Deep Learning, data storage, and computing capabilities have led to the huge success of CV. There are several tasks in computer vision, such as classification, object detection, image segmentation, optical character recognition, scene reconstruction, and many others.
In this presentation I will talk about applying Transfer Learning, image classification, object detection, and the metrics required to measure them on still images. The increase in accuracy of CV tasks over the past decade is due to Convolutional Neural Networks (CNNs), which form the base of architectures such as ResNet and VGGNet. I will go through how to use these pre-trained models for image classification and feature extraction. One of the breakthroughs in object detection has come with single-shot detection, where the bounding box and the class of the object are predicted simultaneously. This leads to low latency during inference (155 frames per second) and high accuracy. This is the framework behind object detection using YOLO; I will explain how to use YOLO for specific use cases.
Improving computer vision models at scale, Dr. Mirko Kämpf
Rigorous improvement of an image recognition model often requires multiple iterations of eyeballing outliers, inspecting statistics of the output labels, then modifying and retraining the model. When testing data is present at the petabyte scale, the ability to seamlessly access all the images that have been assigned specific labels poses a technical challenge by itself.
Marton Balassi, Mirko Kämpf, and Jan Kunigk share a solution that automates the process of running the model on the testing data and populating an index of the labels so they become searchable. Images and labels are stored in HBase. The model is encapsulated in a PySpark program, while the images are indexed with Solr and can be accessed from a Hue dashboard.
Improving computer vision models at scale, Jan Kunigk
We developed a solution that automatically adds tags to images via neural networks running at scale with TensorFlow on Spark. Images are stored in HBase and tags become searchable via Solr.
Part 2 of the Deep Learning Fundamentals Series, this session discusses Tuning Training (including hyperparameters, overfitting/underfitting), Training Algorithms (including different learning rates, backpropagation), Optimization (including stochastic gradient descent, momentum, Nesterov Accelerated Gradient, RMSprop, Adaptive algorithms - Adam, Adadelta, etc.), and a primer on Convolutional Neural Networks. The demos included in these slides are running on Keras with TensorFlow backend on Databricks.
Object extraction from satellite imagery using deep learning, Aly Abdelkareem
Presentation on extracting objects from satellite imagery using deep learning techniques. It includes a comparison of state-of-the-art approaches in computer vision.
A Distributed Deep Learning Approach for the Mitosis Detection from Big Medic..., Databricks
The strongest indicator of a cancer patient's prognosis is the number of mitotic bodies that a pathologist manually counts from high-resolution whole-slide histopathology images. Obviously, it is not efficient to count the mitosis number manually. But it is still challenging to automate the process of mitosis detection due to the limited training datasets and the intensive computing involved in model training and inference. This presentation introduces a large-scale deep learning approach to train a two-stage CNN-based model with high accuracy to detect the mitosis locations directly from the high-resolution whole-slide images. In detail, we first train a nuclei detection model to remove the background information from the raw whole-slide histopathology images. Second, a customized ResNet-50 model is trained on the dataset cleaned in the first step. The first step saves training time while improving the model performance in the second step. A false-positive oversampling approach is used to further improve the model performance. With these models, the inference process is conducted to detect the mitosis locations from the large volume of histopathology images in parallel. Meanwhile, the whole pipeline, including data preprocessing, model training, hyperparameter tuning, and inference, is parallelized by utilizing distributed TensorFlow, Apache Spark, and HDFS. The experiences and techniques in this project can be applied to other large-scale deep learning problems as well.
Speaker: Fei Hu
Transfer learning (TL) is a research problem in machine learning (ML) that focuses on applying knowledge gained while solving one task to a related task
LesFurets.com: From 0 to Cassandra on AWS in 30 days - Tsunami Alerting Syste..., DataStax Academy
An earthquake occurs in the Sea of Japan. A tsunami is likely to hit the coast. The population must be warned by SMS. A datacenter has been damaged by the earthquake. Will the alerting system still work?
Building this simple alerting system is a great way to start with Cassandra, as we discovered while teaching a hands-on big data class at a French university.
What were the reasons that led a majority of students to choose Cassandra to implement a fast, resilient, highly available big data system deployed on AWS?
What were the common pitfalls, the modeling alternatives, and their performance impact?
Separating Hype from Reality in Deep Learning with Sameer Farooqui, Databricks
Deep Learning is all the rage these days, but where does the reality of what Deep Learning can do end and the media hype begin? In this talk, I will dispel common myths about Deep Learning that are not necessarily true and help you decide whether you should practically use Deep Learning in your software stack.
I’ll begin with a technical overview of common neural network architectures like CNNs, RNNs, GANs and their common use cases like computer vision, language understanding or unsupervised machine learning. Then I’ll separate the hype from reality around questions like:
• When should you prefer traditional ML systems like scikit-learn or Spark.ML instead of Deep Learning?
• Do you no longer need to do careful feature extraction and standardization if using Deep Learning?
• Do you really need terabytes of data when training neural networks or can you ‘steal’ pre-trained lower layers from public models by using transfer learning?
• How do you decide which activation function (like ReLU, leaky ReLU, ELU, etc) or optimizer (like Momentum, AdaGrad, RMSProp, Adam, etc) to use in your neural network?
• Should you randomly initialize the weights in your network or use more advanced strategies like Xavier or He initialization?
• How easy is it to overfit/overtrain a neural network and what are the common techniques to avoid overfitting (like l1/l2 regularization, dropout, and early stopping)?
As Europe's leading economic powerhouse and the fourth-largest economy globally, Germany stands at the forefront of innovation and industrial might. Renowned for its precision engineering and high-tech sectors, Germany's economic structure is heavily supported by a robust service industry, accounting for approximately 68% of its GDP. This economic clout and strategic geopolitical stance position Germany as a focal point in the global cyber threat landscape.
In the face of escalating global tensions, particularly those emanating from geopolitical disputes with nations like Russia and China, Germany has witnessed a significant uptick in targeted cyber operations. Our analysis indicates a marked increase in cyberattack sophistication aimed at critical infrastructure and key industrial sectors. These attacks range from ransomware campaigns to Advanced Persistent Threats (APTs), threatening national security and business integrity.
🔑 Key findings include:
🔍 Increased frequency and complexity of cyber threats.
🔍 Escalation of state-sponsored and criminally motivated cyber operations.
🔍 Active dark web exchanges of malicious tools and tactics.
Our comprehensive report delves into these challenges, using a blend of open-source and proprietary data collection techniques. By monitoring activity on critical networks and analyzing attack patterns, our team provides a detailed overview of the threats facing German entities.
This report aims to equip stakeholders across public and private sectors with the knowledge to enhance their defensive strategies, reduce exposure to cyber risks, and reinforce Germany's resilience against cyber threats.
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2..., pchutichetpong
M Capital Group ("MCG") expects demand to grow and supply to evolve, facilitated by institutional investment rotating out of offices and into work-from-home ("WFH") infrastructure, while the need for data storage keeps expanding as global internet usage grows, with experts predicting 5.3 billion users by 2023. These market factors will be underpinned by technological changes, such as progressing cloud services and edge sites, allowing the industry to see strong expected annual growth of 13% over the next 4 years.
Whilst competitive headwinds remain, represented through the recent second bankruptcy filing of Sungard, which blames “COVID-19 and other macroeconomic trends including delayed customer spending decisions, insourcing and reductions in IT spending, energy inflation and reduction in demand for certain services”, the industry has seen key adjustments, where MCG believes that engineering cost management and technological innovation will be paramount to success.
MCG reports that the more favorable market conditions expected over the next few years, helped by the winding down of pandemic restrictions and a hybrid working environment will be driving market momentum forward. The continuous injection of capital by alternative investment firms, as well as the growing infrastructural investment from cloud service providers and social media companies, whose revenues are expected to grow over 3.6x larger by value in 2026, will likely help propel center provision and innovation. These factors paint a promising picture for the industry players that offset rising input costs and adapt to new technologies.
According to M Capital Group: “Specifically, the long-term cost-saving opportunities available from the rise of remote managing will likely aid value growth for the industry. Through margin optimization and further availability of capital for reinvestment, strong players will maintain their competitive foothold, while weaker players exit the market to balance supply and demand.”
Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. for more details visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ..., Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation for ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. Slowdown on the GPU is likely caused by a large submission of small workloads, and expected to be non-issue when the computation is performed on massive graphs.
2. About me
• Education
• NCU (MIS), NCCU (CS)
• Work Experience
• Telecom big data Innovation
• AI projects
• Retail marketing technology
• User Group
• TW Spark User Group
• TW Hadoop User Group
• Taiwan Data Engineer Association Director
• Research
• Big Data/ ML/ AIOT/ AI Columnist
3. Tutorial Content
Image in AI process types
Overall technologies
Homework
Deep Learning history timeline
Exercise in image classification
6. Deep Learning history timeline
• From 1943-2019
Ref: https://machinelearningknowledge.ai/brief-history-of-deep-learning/
7. Deep Learning history timeline
• 2018 Turing Award
• Bengio, Hinton, and LeCun are sometimes referred to as the "Godfathers of AI" and "Godfathers of Deep Learning"
Ref: https://awards.acm.org/about/2018-turing
8. Deep Learning history timeline
• ImageNet dataset
• Over 15 million images with more than 22,000 categories
• ILSVRC
Ref: https://image-net.org/about.php
Ref: https://www.cs.princeton.edu/courses/archive/spr18/cos598B/slides/cos598b_7feb18_imagenet.pdf
9. Deep Learning history timeline
• ImageNet Large Scale Visual Recognition Challenge (ILSVRC)
Ref: https://medium.com/nanonets/how-to-automate-surveillance-with-deep-learning-c8dea1d6387f
Ref: https://image-net.org/index.php
10. Deep Learning history timeline
• Object detection
• val top-1 error: the model's single highest-probability prediction is wrong
• val top-5 error: none of the model's five highest-probability predictions matches the true label
• VGG16 has about 138 million parameters and a roughly 500 MB model size (13 Conv + 3 FC = 16 weight layers)
• GPT-3, the model behind ChatGPT, has 175 billion parameters and was trained on about 45 TB of text data
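The top-k idea above can be sketched in a few lines of plain Python; the class scores and class indices below are made up for illustration:

```python
def topk_correct(scores, true_label, k):
    """Return True if true_label is among the k highest-scoring classes."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return true_label in ranked[:k]

# Hypothetical class scores for one image; suppose class index 2 is the true label.
scores = [0.05, 0.10, 0.60, 0.15, 0.04, 0.03, 0.02, 0.01]
top1_error = not topk_correct(scores, 2, 1)  # top-1 prediction is correct here
top5_error = not topk_correct(scores, 2, 5)  # true label is within the top 5
```

Averaging these booleans over a validation set gives the top-1 and top-5 error rates reported for ILSVRC models.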
12. Deep Learning history timeline
• Object detection from video
• IoU (Intersection over Union)
• IoU ≥ 0.5: we count the detection as a true positive (TP)
• IoU < 0.5: we count it as a false positive (FP)
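A minimal IoU function, assuming axis-aligned boxes given as (x1, y1, x2, y2) corners (a common but not universal convention):

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Two partially overlapping boxes: IoU = 25 / 175 ≈ 0.143, so this
# detection would be counted as a false positive at the 0.5 threshold.
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))
```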
14. More
• AP (Average Precision): used for object detection; the area under one class's precision-recall curve
• mAP (mean Average Precision): the AP averaged over all object classes
Example: rank the predictions by "is dog" probability and plot the precision-recall curve; AP is the area under that curve (AUC). With 8 dogs in total: taking the top 4 predictions (3 correct), precision = 3/4 = 0.75 and recall = 3/8 = 0.375; taking the top 10 (5 correct), precision = 5/10 = 0.5 and recall = 5/8 = 0.625.
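The slide's numbers can be reproduced with a small pure-Python sketch; the ranked label list below is one hypothetical ordering consistent with those counts (8 dogs in total):

```python
def precision_recall_at_k(ranked_labels, total_positives, k):
    """Precision and recall after taking the top-k ranked predictions."""
    tp = sum(ranked_labels[:k])
    return tp / k, tp / total_positives

# 1 = dog, 0 = not dog, ranked by predicted "is dog" probability.
labels = [1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1]
print(precision_recall_at_k(labels, 8, 4))   # (0.75, 0.375)
print(precision_recall_at_k(labels, 8, 10))  # (0.5, 0.625)
```

Sweeping k over the whole list traces out the precision-recall curve; AP is the area under it.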
15. Deep Learning history timeline
• CS231n:
• Convolutional Neural Networks for Visual Recognition
Ref: http://cs231n.stanford.edu/index.html
16. AI-Generated Content
• The year 2022 is considered the first year of AI-generated content (AIGC) production.
• In 2023, there will be an explosion of AI applications. Experts predict that within two years, hundreds of thousands of AI application apps may be created, with a myriad of new species of AI applications emerging. While this is exciting, it also presents potential risks and challenges, such as a complete rewriting of the definition and process of personal productivity. AIGC also poses challenges to current societal norms, including issues related to copyright law, academic ethics, and the proliferation of deepfakes and fake news.
25. The input is fed to a network of stacked Conv, Pool, and Dense layers
• The output can be a Softmax layer
Ref: https://learnopencv.com/image-classification-using-convolutional-neural-networks-in-keras/
34. Improving model prediction accuracy
• Different geometric transformations
Ref: https://medium.com/analytics-vidhya/data-augmentation-is-it-really-necessary-b3cb12ab3c3f
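Two of the simplest geometric transformations can be sketched in plain Python on a toy 2D "image" (real pipelines would use a library such as Keras's augmentation layers; this is only to illustrate the idea):

```python
def hflip(img):
    """Horizontal flip: reverse each row of a 2D image (list of lists)."""
    return [row[::-1] for row in img]

def rot90(img):
    """Rotate the image 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*img)][::-1]

img = [[1, 2],
       [3, 4]]
print(hflip(img))  # [[2, 1], [4, 3]]
print(rot90(img))  # [[2, 4], [1, 3]]
```

Each transformed copy keeps its original label, so augmentation cheaply multiplies the effective training-set size.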
35. Feature extractor
• Kernel maps: filters such as edge-detection or sharpening kernels, also called image filters
• Convolutional: the convolutional and pooling layers act as the feature extractor
• Feature maps: the outputs of the kernel-map process
Feature map 1
Feature map 2
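The kernel-map idea can be illustrated with a plain-Python "valid" convolution; the edge kernel below is a standard Laplacian-style filter, and the toy images are made up:

```python
def conv2d(img, kernel):
    """'Valid' 2D convolution (really cross-correlation, as in CNNs)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(img) - kh + 1, len(img[0]) - kw + 1
    return [[sum(img[i + u][j + v] * kernel[u][v]
                 for u in range(kh) for v in range(kw))
             for j in range(out_w)] for i in range(out_h)]

# Laplacian-style edge kernel: responds only to intensity changes.
edge = [[0, -1, 0],
        [-1, 4, -1],
        [0, -1, 0]]
flat = [[5] * 4 for _ in range(4)]  # constant image: no edges anywhere
print(conv2d(flat, edge))           # all zeros: [[0, 0], [0, 0]]
```

The output grid is exactly the "feature map": large values where the filter's pattern (here, an edge) is present, zeros where it is not.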
38. More
• What if you want the feature map to be the same size as the input image? Use zero padding ("same" padding)
Ref: https://towardsdatascience.com/convolution-neural-networks-a-beginners-guide-implementing-a-mnist-hand-written-digit-8aa60330d022
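The output-size arithmetic behind that choice is simple; a small sketch of the standard formula, floor((n - k + 2p) / s) + 1:

```python
def conv_output_size(n, k, padding=0, stride=1):
    """Spatial output size of a convolution over an n x n input
    with a k x k kernel, `padding` zeros per side, and `stride`."""
    return (n - k + 2 * padding) // stride + 1

print(conv_output_size(28, 3))             # 26: 'valid', the feature map shrinks
print(conv_output_size(28, 3, padding=1))  # 28: 'same', zero padding keeps the size
```

For a 3x3 kernel at stride 1, one ring of zeros (p = 1) is exactly what keeps the output the same size as the input.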
39. Feature extractor
• Pooling
• Max
• Average
Ref: https://www.researchgate.net/figure/Toy-example-illustrating-the-drawbacks-of-max-pooling-and-average-pooling_fig2_300020038
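Max and average pooling can be sketched in plain Python on a toy feature map (non-overlapping windows, stride equal to the window size):

```python
def pool2d(img, size=2, mode="max"):
    """Non-overlapping max or average pooling over a 2D image."""
    agg = max if mode == "max" else (lambda w: sum(w) / len(w))
    return [[agg([img[i + u][j + v] for u in range(size) for v in range(size)])
             for j in range(0, len(img[0]) - size + 1, size)]
            for i in range(0, len(img) - size + 1, size)]

img = [[1, 2, 5, 6],
       [3, 4, 7, 8],
       [0, 0, 1, 1],
       [0, 4, 1, 1]]
print(pool2d(img, 2, "max"))  # [[4, 8], [4, 1]]
print(pool2d(img, 2, "avg"))  # [[2.5, 6.5], [1.0, 1.0]]
```

Max pooling keeps the strongest activation per window; average pooling smooths them, which is the trade-off the referenced figure illustrates.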
41. Classifier
• Flatten Layer
• It is used to convert the data into a 1D array, creating a single feature vector. After flattening, we forward the data to a fully connected layer for final classification
Ref: https://data-flair.training/blogs/keras-convolution-neural-network/
42. Classifier
• Dense Layer
• It is a fully connected layer: each node in this layer is connected to every node in the previous layer
• This layer is used at the final stage of a CNN to perform classification
Ref: https://data-flair.training/blogs/keras-convolution-neural-network/
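The Flatten → Dense → Softmax chain can be sketched in plain Python (the weights below are toy values, not a trained model):

```python
import math

def flatten(img):
    """Convert a 2D feature map into a 1D feature vector."""
    return [x for row in img for x in row]

def dense(x, weights, biases):
    """Fully connected layer: one output per row of weights."""
    return [sum(wi * xi for wi, xi in zip(w, x)) + b
            for w, b in zip(weights, biases)]

def softmax(z):
    """Turn raw scores into class probabilities that sum to 1."""
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

# Toy 2-class head on a flattened 2x2 feature map.
x = flatten([[1.0, 2.0], [3.0, 4.0]])
probs = softmax(dense(x, [[0.1, 0.1, 0.1, 0.1],
                          [0.2, 0.2, 0.2, 0.2]], [0.0, 0.0]))
print(probs)  # two probabilities summing to 1
```

This is exactly what `Flatten()` followed by `Dense(n, activation="softmax")` computes in Keras, minus the learned weights.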
45. Classifier
• Dropout Layer
• It is used to prevent the network from overfitting
Ref: https://data-flair.training/blogs/keras-convolution-neural-network/
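A sketch of inverted dropout in plain Python; the scaling by 1/(1 - rate) keeps the expected activation unchanged, and frameworks such as Keras handle this internally:

```python
import random

def dropout(x, rate, training=True, rng=random):
    """Inverted dropout: zero each unit with probability `rate` during
    training and scale the survivors by 1/(1 - rate); identity at inference."""
    if not training or rate == 0.0:
        return list(x)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in x]
```

Because random subsets of units are silenced on every training step, the network cannot rely on any single co-adapted feature, which is why dropout reduces overfitting.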
46. Keras framework
• Keras is a deep learning API written in Python, running on top of the
machine learning platform TensorFlow.
• It was developed with a focus on enabling fast experimentation
• Integrates with TensorFlow2
• Efficiently executing low-level tensor operations on CPU, GPU, or TPU
• Faster developing for deep-learning networks
• Provides fully connected, convolutional, pooling, RNN, LSTM… layers
• The latest version: 2.12.0 (2023-3-24)
Ref: https://keras.io/about/
47. Dog and cat image dataset
• Cat and Dog: 23,422 images
• Training: 18,738 images
• Validation: 4,684 images
• Image sizes are not fixed!
• 350×320
• 448×329
• …
48. More
• Data imbalance
• If a dataset consists of 100 cat and 900 dog images and we train the neural network on this data, it will just learn to predict "dog" every time
• In this case, we can easily balance the data using sampling techniques
• Down-sampling
• By removing some dog examples
• Up-sampling
• By creating more cat examples using image augmentation or any other method
Ref: Multi-Label Image Classification with Neural Network | Keras | by Shiva Verma | Towards Data Science
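The up-sampling idea can be sketched in plain Python; the `balance` helper is hypothetical, and the cat/dog counts mirror the slide's 100-vs-900 example (here duplicating items stands in for augmentation):

```python
import random

def balance(dataset, rng=random):
    """Up-sample minority classes to the majority class count."""
    by_label = {}
    for item, label in dataset:
        by_label.setdefault(label, []).append(item)
    target = max(len(items) for items in by_label.values())
    balanced = []
    for label, items in by_label.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        balanced += [(i, label) for i in items + extra]
    return balanced

data = [(f"cat_{i}", "cat") for i in range(100)] + \
       [(f"dog_{i}", "dog") for i in range(900)]
balanced = balance(data, random.Random(0))
# Both classes now have 900 examples.
```

Down-sampling is the mirror image: trim every class down to the minority count instead.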
49. Dog and cat image dataset
• Network Calculator
Ref: https://madebyollin.github.io/convnet-calculator/
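The kind of arithmetic such a calculator performs can be sketched in plain Python; the VGG-style layer list below is illustrative, not the full VGG-16:

```python
def trace_shapes(size, layers):
    """Trace the spatial size through conv/pool layers, like a network calculator.
    Each layer is (kind, kernel, padding, stride)."""
    sizes = [size]
    for kind, k, pad, stride in layers:
        size = (size - k + 2 * pad) // stride + 1
        sizes.append(size)
    return sizes

# VGG-style prefix: 3x3 'same' convs keep the size, 2x2 pools halve it.
vgg_prefix = [("conv", 3, 1, 1), ("conv", 3, 1, 1), ("pool", 2, 0, 2),
              ("conv", 3, 1, 1), ("conv", 3, 1, 1), ("pool", 2, 0, 2)]
print(trace_shapes(224, vgg_prefix))  # [224, 224, 224, 112, 112, 112, 56]
```

Tracing shapes like this before training catches mismatched layer sizes early, which is exactly what the linked calculator automates.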
51. Homework
• Try to build a VGG-16 network for dog and cat classification
• https://blog.51cto.com/u_15351425/3727442
• Try to add bird images as a third image label
• Download the bird images
• https://drive.google.com/file/d/1NgmjVrRug_qPqlfU_Zb2O-kyd5Hblaat/view?usp=sharing
• Modify the code to build a multi-class prediction model
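For the multi-class homework, the binary dog/cat head (one sigmoid unit) becomes a three-unit softmax layer, and the labels need one-hot encoding. A plain-Python sketch of that encoding (Keras's `to_categorical` does the numeric equivalent for integer labels):

```python
def one_hot(labels, classes):
    """Encode string labels as one-hot vectors for multi-class training."""
    index = {c: i for i, c in enumerate(classes)}
    return [[1 if index[label] == i else 0 for i in range(len(classes))]
            for label in labels]

classes = ["dog", "cat", "bird"]
print(one_hot(["dog", "bird"], classes))  # [[1, 0, 0], [0, 0, 1]]
```

With three-element targets like these, the model's loss also changes from binary to categorical cross-entropy.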