- Content-based image retrieval (CBIR) searches for images based on visual features like color, texture, and shape rather than keywords.
- CBIR systems extract features from images to create metadata and use those features to calculate visual similarity between images.
- Relevance feedback allows users to provide feedback on initial search results to help the system recalculate feature weights and improve subsequent results.
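The pipeline sketched in these points (extract features as metadata, then rank by visual similarity) can be illustrated with a deliberately simple feature: a normalized grayscale histogram compared by histogram intersection. This is a toy NumPy sketch, not a real CBIR system; the function names and the 16-bin choice are illustrative.

```python
import numpy as np

def gray_histogram(img, bins=16):
    """Extract a normalized intensity histogram as a simple feature vector."""
    hist, _ = np.histogram(img, bins=bins, range=(0, 256))
    return hist / hist.sum()

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical histograms."""
    return np.minimum(h1, h2).sum()

# Toy "database": a dark image and a bright image; the query is dark.
rng = np.random.default_rng(0)
dark = rng.integers(0, 100, size=(32, 32))
bright = rng.integers(150, 256, size=(32, 32))
query = rng.integers(0, 100, size=(32, 32))

sims = {name: histogram_intersection(gray_histogram(query), gray_histogram(img))
        for name, img in [("dark", dark), ("bright", bright)]}
best_match = max(sims, key=sims.get)
```

Relevance feedback would then reweight such features (e.g., emphasize color over texture) based on which results the user marks as relevant.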
Content-based image retrieval (CBIR) uses visual image content to search large image databases according to user needs. CBIR systems represent images by extracting features related to color, shape, texture, and spatial layout. Features are extracted from regions of the image and compared to features of images in the database to find the most similar matches. CBIR has applications in medical imaging, fingerprints, photo collections, and more. Techniques include representing images with histograms of color and texture features extracted through transforms.
This document outlines a presentation on content-based image retrieval (CBIR). It discusses the motivation for CBIR by describing limitations of text-based image retrieval, such as problems with image annotation, human perception, and queries that cannot be described with text. CBIR allows images to be retrieved based on automatically extracted visual features like color, texture, and histograms. A typical CBIR system extracts image features and then matches features to find visually similar images. Applications of CBIR include crime prevention, security, medical diagnosis, and intellectual property. The conclusion states that CBIR reduces computation time and increases user interaction compared to other methods.
The document discusses content-based image retrieval (CBIR) systems. It describes how CBIR systems use feature extraction to search large image databases based on visual content. The key components of CBIR systems are feature extraction, indexing, and system design. Feature extraction involves extracting information about images' colors, textures, shapes, and spatial locations. Effective features and indexing techniques are needed to make CBIR scalable for large image collections. Performance is evaluated based on how well systems return relevant images.
The document describes the Scale-invariant feature transform (SIFT) algorithm. It outlines the key steps: 1) constructing scale space by generating blurred images at different scales, 2) calculating difference of Gaussian images to find keypoints, 3) assigning orientations to keypoints, and 4) generating 128-element feature vectors for each keypoint to uniquely describe local image features in a way that is invariant to scale, rotation, and illumination changes. The SIFT algorithm allows for reliable object recognition and image stitching.
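Steps 1 and 2 of the summary (scale-space construction and difference of Gaussians) can be sketched without any SIFT library; keypoint localization, orientation assignment, and the 128-element descriptor are omitted. A minimal NumPy sketch under those assumptions:

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D Gaussian kernel, normalized to sum to 1."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def gaussian_blur(img, sigma):
    """Separable Gaussian blur: filter rows, then columns."""
    k = gaussian_kernel(sigma)
    blurred = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, blurred)

# Step 1: one octave of the scale space (increasingly blurred images).
img = np.zeros((64, 64)); img[28:36, 28:36] = 1.0   # toy image: a bright square
sigmas = [1.6 * (2 ** (i / 3)) for i in range(5)]
scale_space = [gaussian_blur(img, s) for s in sigmas]

# Step 2: difference-of-Gaussian images; local extrema across space and
# scale in these images become SIFT keypoint candidates.
dog = [scale_space[i + 1] - scale_space[i] for i in range(len(scale_space) - 1)]
```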
The Hough transform is a feature extraction technique used in image analysis and computer vision to detect shapes within images. It works by detecting imperfect instances of objects of a certain class of shapes via a voting procedure. Specifically, the Hough transform can be used to detect lines, circles, and other shapes in an image if their parametric equations are known, and it provides robust detection even under noise and partial occlusion. It works by quantizing the parameter space that describes the shape and counting the number of votes each parametric description receives from edge points in the image.
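For straight lines, the voting procedure can be sketched as follows, using the normal parameterization rho = x*cos(theta) + y*sin(theta). This is a didactic NumPy version of the idea behind `cv2.HoughLines`; the discretization choices are illustrative.

```python
import numpy as np

def hough_lines(edges, n_theta=180):
    """Vote in (rho, theta) space: each edge pixel votes for every line
    rho = x*cos(theta) + y*sin(theta) that could pass through it."""
    h, w = edges.shape
    diag = int(np.ceil(np.hypot(h, w)))
    thetas = np.deg2rad(np.arange(n_theta))      # 0..179 degrees
    rhos = np.arange(-diag, diag + 1)
    acc = np.zeros((len(rhos), n_theta), dtype=int)
    ys, xs = np.nonzero(edges)
    for x, y in zip(xs, ys):
        r = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[r + diag, np.arange(n_theta)] += 1
    return acc, rhos, thetas

# Synthetic edge image containing a single vertical line at x = 20.
edges = np.zeros((50, 50), dtype=bool)
edges[:, 20] = True

acc, rhos, thetas = hough_lines(edges)
peak = np.unravel_index(np.argmax(acc), acc.shape)
rho_peak, theta_peak = rhos[peak[0]], np.rad2deg(thetas[peak[1]])
```

The accumulator peak recovers the line even if some edge pixels were missing, which is why the transform tolerates gaps and occlusion.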
OpenCV is an open-source library for computer vision and machine learning. The document discusses OpenCV's features including its modular structure, common computer vision algorithms like Canny edge detection, Hough transform, and cascade classifiers. Code examples are provided to demonstrate how to implement these algorithms using OpenCV functions and data types.
This presentation explains convolutional neural networks (CNNs) through the image classification problem, from the perspective of understanding computer vision and its applications. It aims to explain CNNs as simply as possible, giving beginners a brief idea of the CNN architecture and its different layers through a worked example. Please refer to the references on the last slide for a deeper look at how CNNs work. The presentation also covers several (though not all) types of CNNs and applications of computer vision.
This presentation provides an overview of artificial intelligence (AI) and deep learning. It begins with introductions to AI and deep learning, explaining that AI allows machines to perform tasks typically requiring human intelligence through machine learning. Deep learning is a type of machine learning using artificial neural networks inspired by the human brain. The presentation then discusses why AI has grown recently, citing increased computing power, data storage, and data availability. It also covers deep learning model development and concepts like underfitting and overfitting. The presentation describes different types of learning approaches like supervised, unsupervised, and reinforcement learning. It concludes with popular applications of deep learning like precision agriculture, computer vision, and recommendations.
In this presentation we describe important topics in image processing and computer vision. If you have any queries about this presentation, feel free to visit us at:
http://www.siliconmentor.com/
This document discusses different methods of image segmentation: thresholding, edge-based segmentation, and region-based segmentation. It provides details on various thresholding techniques including basic global thresholding, Otsu's method, multiple thresholding, and variable thresholding. For edge-based segmentation, it mentions basic edge detection, the Marr-Hildreth edge detector, and watersheds. Finally, it covers region-based segmentation and provides an algorithm for region growing.
This document outlines the syllabus for a digital image processing course. It introduces key concepts like what a digital image is, areas of digital image processing like low-level, mid-level and high-level processes, a brief history of the field, applications in different domains, and fundamental steps involved. The course will cover topics in digital image fundamentals and processing techniques like enhancement, restoration, compression and segmentation. It will be taught using MATLAB and C# in the labs. Assessment will include homework, exams, labs and a final project.
The document discusses various techniques for image segmentation including discontinuity-based approaches, similarity-based approaches, thresholding methods, region-based segmentation using region growing and region splitting/merging. Key techniques covered include edge detection using gradient operators, the Hough transform for edge linking, optimal thresholding, and split-and-merge segmentation using quadtrees.
Lecture 1 for Digital Image Processing (2nd Edition), by Moe Moe Myint
-What is Digital Image Processing?
-The Origins of Digital Image Processing
-Examples of Fields that Use Digital Image Processing
-Fundamental Steps in Digital Image Processing
-Components of an Image Processing System
RANSAC is an algorithm for estimating model parameters from noisy data containing outliers. It works by:
1. Randomly selecting minimal samples needed to estimate a model
2. Calculating fit of model to all data to find inliers
3. Repeating for many iterations and selecting model with most inliers
The number of iterations needed depends on the expected outlier ratio and desired probability of finding the correct model. RANSAC is useful for problems like image alignment that involve fitting models to data containing outliers.
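For line fitting, the loop above might look like the following NumPy sketch; the iteration count, inlier tolerance, and toy data are illustrative, not tuned.

```python
import numpy as np

def ransac_line(points, n_iters=200, inlier_tol=1.0, rng=None):
    """Fit y = a*x + b by RANSAC: repeatedly fit a line to a minimal
    2-point sample, count inliers, and keep the model with the most."""
    rng = rng or np.random.default_rng(0)
    best_model, best_inliers = None, 0
    for _ in range(n_iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if x1 == x2:                       # vertical sample; skip
            continue
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        resid = np.abs(points[:, 1] - (a * points[:, 0] + b))
        inliers = (resid < inlier_tol).sum()
        if inliers > best_inliers:
            best_model, best_inliers = (a, b), inliers
    return best_model, best_inliers

# 80 points on y = 2x + 1 plus 20 gross outliers.
rng = np.random.default_rng(42)
x = rng.uniform(0, 10, 80)
inlier_pts = np.column_stack([x, 2 * x + 1 + rng.normal(0, 0.1, 80)])
outlier_pts = rng.uniform(0, 30, (20, 2))
points = np.vstack([inlier_pts, outlier_pts])

(a, b), n_inliers = ransac_line(points, rng=np.random.default_rng(7))
```

A least-squares fit over all points would be pulled toward the outliers; RANSAC simply ignores them once the consensus set is found.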
Harris corner detection is used to extract local features from images. It works by (1) computing the gradient at each point, (2) constructing a second moment matrix from the gradient, and (3) using the eigenvalues of this matrix to score how "corner-like" each point is. Points with a large, local maximum score are detected as corners. The Harris operator, which is a variant using the trace of the matrix, is commonly used due to its efficiency. Corners provide distinctive local features that can be matched between images.
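The three steps above translate almost directly into code. The sketch below is a slow, didactic NumPy version (OpenCV's `cv2.cornerHarris` computes the same response efficiently); the window size and k value are typical but arbitrary.

```python
import numpy as np

def harris_response(img, k=0.05, win=2):
    """Harris score R = det(M) - k*trace(M)^2, where M is the
    second-moment matrix of image gradients summed over a local window."""
    iy, ix = np.gradient(img.astype(float))        # step 1: gradients
    ixx, iyy, ixy = ix * ix, iy * iy, ix * iy
    h, w = img.shape
    R = np.zeros((h, w))
    for y in range(win, h - win):
        for x in range(win, w - win):
            # step 2: second-moment matrix entries over the window
            sxx = ixx[y - win:y + win + 1, x - win:x + win + 1].sum()
            syy = iyy[y - win:y + win + 1, x - win:x + win + 1].sum()
            sxy = ixy[y - win:y + win + 1, x - win:x + win + 1].sum()
            # step 3: corner score from det and trace (eigenvalue proxy)
            det = sxx * syy - sxy * sxy
            trace = sxx + syy
            R[y, x] = det - k * trace * trace
    return R

# Toy image: a bright square; its corners should score highest.
img = np.zeros((40, 40)); img[10:30, 10:30] = 1.0
R = harris_response(img)
peak_y, peak_x = np.unravel_index(np.argmax(R), R.shape)
```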
This is the first part of the presentation series on one of the powerful open sources libraries, the opencv. this presentation is about the introduction, installation, some basic functions on images and some basic image processing on the images
The document discusses content-based image retrieval (CBIR). It provides a brief history of CBIR, noting it originated in 1992. It describes challenges of CBIR, including the semantic gap between low-level features extracted and high-level human concepts. It also outlines common CBIR techniques like color, shape, and texture analysis. Applications are described as image search and browsing. Limitations include not fully capturing human visual understanding.
Anti-aliasing is a technique used to reduce jagged or stair-stepped edges in digital images by adding subtle color variations around edges. It works by averaging pixel color values across edges to make them appear smoother. There are several techniques for anti-aliasing including increasing image resolution, prefiltering by calculating pixel color based on object overlap within a pixel area, and postfiltering through supersampling at a higher resolution and then averaging down. Unweighted area sampling draws lines as rectangles and sets pixel intensity proportional to the amount of overlap with the rectangle rather than distance from the pixel center.
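The postfiltering (supersampling) idea can be sketched directly: render at a higher resolution, then average blocks down so edge pixels take fractional coverage values. A minimal NumPy sketch with an illustrative disk shape:

```python
import numpy as np

def render_disk(size, ss=1):
    """Render a filled disk on a size x size grid, optionally at
    ss-times resolution (supersampling)."""
    n = size * ss
    yy, xx = np.mgrid[0:n, 0:n]
    c, r = (n - 1) / 2.0, n * 0.35
    return ((xx - c) ** 2 + (yy - c) ** 2 <= r ** 2).astype(float)

# Aliased: render at target resolution; edges are hard 0/1 steps.
aliased = render_disk(32)

# Post-filtered: render at 4x resolution, then average each 4x4 block
# down to one pixel, giving fractional coverage along the edge.
hi = render_disk(32, ss=4)
antialiased = hi.reshape(32, 4, 32, 4).mean(axis=(1, 3))
```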
This document presents a literature review and proposed work plan for face recognition using a back propagation neural network. It summarizes the Viola-Jones face detection algorithm which uses Haar features and an integral image for real-time detection. The algorithm has high detection rates with low false positives. Future work will apply back propagation neural networks to extract features and recognize faces from a database of facial images in order to build a facial recognition system.
This document provides instructions for installing OpenCV with Python on Windows and macOS systems. It also summarizes 9 example programs that demonstrate various computer vision techniques using OpenCV, such as reading and displaying images, opening video files and webcams, edge detection, line detection, color-based object tracking, contour classification, corner feature matching, and face recognition. The examples and more information are available at a provided GitHub URL. Reference materials for learning OpenCV with Python are also listed.
Anti-aliasing is a technique used to reduce aliasing, which makes curved or slanted lines appear jagged when displayed on a lower resolution output device like a monitor. Aliasing occurs because the device lacks enough resolution to smoothly represent curved lines. Anti-aliasing works by adding subtle color changes around lines, which causes jagged edges to blur together when viewed from a distance. There are several anti-aliasing techniques, including increasing the display resolution, area sampling to shade pixels based on the area covered by thickened lines, and post-filtering by generating a higher resolution virtual image and averaging it down.
A completed modeling of local binary pattern operator, by Win Yu
This document presents the completed local binary pattern (CLBP) operator for texture classification. CLBP generalizes and completes the local binary pattern (LBP) by using a local difference sign-magnitude transform to encode the missing texture information not captured by LBP. The CLBP operator fuses three codes - CLBP_C for the center pixel, CLBP_S for the signs of differences, and CLBP_M for the magnitudes. Experiments on the Outex texture database show CLBP achieves much better classification accuracy than LBP and other state-of-the-art methods.
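The sign component (the classic LBP code, i.e. CLBP_S) can be sketched as below; the magnitude (CLBP_M) and center (CLBP_C) codes that the paper adds are omitted. The neighbor ordering is one of several equivalent conventions.

```python
import numpy as np

def lbp_codes(img):
    """Basic 3x3 LBP: compare each pixel's 8 neighbors against the
    center (bit = 1 if neighbor >= center) and pack the bits into a byte."""
    img = img.astype(int)
    h, w = img.shape
    # neighbor offsets, clockwise from top-left
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    center = img[1:-1, 1:-1]
    for bit, (dy, dx) in enumerate(offs):
        neigh = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= ((neigh >= center).astype(np.uint8) << bit)
    return codes

flat = np.full((8, 8), 7)                            # uniform patch
flat_codes = lbp_codes(flat)                         # all neighbors == center
texture = (np.indices((8, 8)).sum(axis=0) % 2) * 255 # checkerboard
checker_codes = lbp_codes(texture)
```

A histogram of these codes over an image region is the texture feature that gets classified; CLBP concatenates the analogous histograms of its three codes.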
Lec10: Medical Image Segmentation as an Energy Minimization Problem, by Ulaş Bağcı
Topics covered:
-Enhancement, noise reduction, and signal processing
-Medical image registration
-Medical image segmentation
-Medical image visualization
-Machine learning in medical imaging
-Shape modeling/analysis of medical images
-Deep learning in radiology
-Fuzzy connectivity (FC): affinity functions; absolute FC; relative FC (and iterative relative FC); successful example applications of FC in medical imaging, including segmentation of airways and airway walls using an RFC-based method
-Energy functional: data and smoothness terms
-Graph cut: min cut / max flow
-Applications in radiology images
Case-based reasoning (CBR) classifiers use a database of problem solutions to solve new problems. Unlike nearest-neighbor classifiers, which store training tuples as points in Euclidean space, CBR stores the tuples or "cases" for problem solving as complex symbolic descriptions.
Image Interpolation Techniques with Optical and Digital Zoom Concepts, by mmjalbiaty
Digital image concepts and interpolation techniques for optical and digital zoom are discussed. There are three main types of interpolation used for resizing images: nearest neighbor, bilinear, and bicubic. Nearest neighbor is the simplest but produces the lowest quality, while bicubic is the most complex but highest quality. Optical zoom uses lens magnification before sensing, whereas digital zoom interpolates after sensing, resulting in lower quality than optical zoom. Interpolation methods assign pixel values to new locations during resizing based on weighting patterns around the original pixel values.
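The difference between nearest-neighbor and bilinear interpolation can be sketched directly; this toy NumPy code is illustrative and skips edge handling and the bicubic case.

```python
import numpy as np

def resize_nearest(img, scale):
    """Nearest neighbor: each output pixel copies the closest input pixel."""
    h, w = img.shape
    ys = (np.arange(int(h * scale)) / scale).astype(int)
    xs = (np.arange(int(w * scale)) / scale).astype(int)
    return img[np.ix_(ys, xs)]

def bilinear_sample(img, y, x):
    """Bilinear: weight the four surrounding pixels by proximity."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1 = min(y0 + 1, img.shape[0] - 1)
    x1 = min(x0 + 1, img.shape[1] - 1)
    fy, fx = y - y0, x - x0
    top = img[y0, x0] * (1 - fx) + img[y0, x1] * fx
    bot = img[y1, x0] * (1 - fx) + img[y1, x1] * fx
    return top * (1 - fy) + bot * fy

img = np.array([[0.0, 100.0],
                [50.0, 150.0]])
up = resize_nearest(img, 2)            # 4x4, blocky 2x2 replication
mid = bilinear_sample(img, 0.5, 0.5)   # smooth average of all four pixels
```

Nearest neighbor produces the blocky look described above; bilinear trades that for smoothness, which is why digital zoom output looks softer than optical zoom.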
The document discusses various image enhancement techniques in digital image processing. It describes point operations like image negative, contrast stretching, thresholding, brightness enhancement, log transformation, and power law transformation. Contrast stretching expands the range of intensity levels and can be done by multiplying pixels with a constant, using a transfer function, or histogram equalization. Thresholding converts an image to binary by assigning pixel values above a threshold to one level and below to another. Log and power law transformations compress high intensity values and expand low values to enhance an image. Matlab code examples are provided for each technique.
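The document's examples are in MATLAB; as a rough NumPy equivalent, the point operations described above can be sketched as:

```python
import numpy as np

img = np.array([[10, 50], [100, 200]], dtype=np.uint8)

# Image negative: s = 255 - r
negative = 255 - img

# Contrast stretching: map [min, max] linearly onto [0, 255]
r = img.astype(float)
stretched = ((r - r.min()) / (r.max() - r.min()) * 255).astype(np.uint8)

# Log transformation: s = c * log(1 + r), compressing high intensities
c = 255 / np.log(1 + 255)
log_t = (c * np.log1p(r)).astype(np.uint8)

# Thresholding: binary image at T = 128
binary = np.where(img >= 128, 255, 0).astype(np.uint8)
```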
Fundamental concepts and basic techniques of digital image processing. Algorithms and recent research in image transformation, enhancement, restoration, encoding and description. Fundamentals and basic techniques of pattern recognition.
This seminar report discusses content-based image retrieval (CBIR) systems. It defines CBIR as retrieving images from a database based on analyzing the visual content of images rather than relying on text annotations. The report outlines the key steps in a CBIR system, including extracting features like color, texture and shape from images, matching query images to images in the database based on their features, and allowing users to provide feedback to refine search results. Examples of applying different image features in CBIR systems are also provided.
Image processing with OpenCV allows various techniques to manipulate digital images. Some key techniques include smoothing to remove noise, erosion and dilation to diminish or accentuate features, and edge detection algorithms like Sobel, Laplace, and Canny to find edges. The core OpenCV module provides functions for accessing pixel values, adjusting contrast and brightness, and drawing shapes. Feature detection identifies keypoints like edges, corners, and blobs, then describes the details around them for later matching against other images. Common algorithms include SURF, SIFT, and BRIEF for feature extraction and description and FLANN and BruteForce for feature matching.
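Erosion and dilation can be illustrated without OpenCV as plain min/max filters (`cv2.erode` and `cv2.dilate` implement the same idea efficiently); a slow, didactic NumPy sketch:

```python
import numpy as np

def erode(img, k=3):
    """Grayscale erosion: each pixel becomes the minimum over a k x k
    neighborhood, shrinking bright regions."""
    pad = k // 2
    p = np.pad(img, pad, mode="edge")
    out = np.empty_like(img)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            out[y, x] = p[y:y + k, x:x + k].min()
    return out

def dilate(img, k=3):
    """Grayscale dilation: maximum over the neighborhood, growing bright regions."""
    pad = k // 2
    p = np.pad(img, pad, mode="edge")
    out = np.empty_like(img)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            out[y, x] = p[y:y + k, x:x + k].max()
    return out

img = np.zeros((20, 20), dtype=np.uint8)
img[5:15, 5:15] = 255                  # a 10x10 bright square
eroded = erode(img)                    # square shrinks to 8x8
dilated = dilate(img)                  # square grows to 12x12
```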
This document discusses image enhancement techniques in the spatial domain. It defines spatial domain enhancement as the direct manipulation of pixel values. Some key techniques discussed include contrast stretching to improve low contrast, histogram equalization to improve contrast, sharpening filters to enhance edges using derivatives, and smoothing filters for noise reduction. The document provides examples and definitions of various spatial filters and their effects.
The document discusses binocular stereo vision and methods for estimating depth from stereo image pairs. It describes how humans can perceive depth from stereo vision and basic stereo matching algorithms that find correspondences between left and right images to compute a dense depth map. It also discusses challenges like the correspondence problem and limitations of window-based matching. More advanced methods formulate stereo matching as an energy minimization problem that can be solved using techniques like graph cuts.
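The window-based matching mentioned above can be sketched as a toy sum-of-absolute-differences (SAD) matcher; the energy-minimization and graph-cut formulations are beyond this sketch, and the window and disparity range are illustrative.

```python
import numpy as np

def disparity_sad(left, right, max_disp=8, win=2):
    """Window-based stereo matching: for each left-image pixel, slide a
    window along the same row of the right image and pick the disparity
    with the smallest sum of absolute differences (SAD)."""
    h, w = left.shape
    disp = np.zeros((h, w), dtype=int)
    for y in range(win, h - win):
        for x in range(win + max_disp, w - win):
            patch = left[y - win:y + win + 1, x - win:x + win + 1]
            costs = [np.abs(patch - right[y - win:y + win + 1,
                                          x - d - win:x - d + win + 1]).sum()
                     for d in range(max_disp + 1)]
            disp[y, x] = int(np.argmin(costs))
    return disp

# Synthetic pair: the right image is the left shifted 4 pixels left,
# so the true disparity is 4 wherever it is defined.
rng = np.random.default_rng(3)
left = rng.uniform(0, 1, (20, 40))
right = np.roll(left, -4, axis=1)

disp = disparity_sad(left, right, max_disp=8)
```

This toy matcher works because the synthetic scene is fronto-parallel and textured everywhere; the correspondence problem the summary mentions (occlusions, textureless regions, repeated patterns) is exactly where such window matching breaks down.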
This document discusses various image features that can be used for large-scale visual search and content-based image retrieval (CBIR). It describes both high-level semantic features and low-level visual features that can be extracted from images. For low-level features, it outlines several popular global features like color histograms, color moments, texture descriptors using gray-level co-occurrence matrices (GLCM), shape context, and GIST. It also discusses commonly used local feature detectors like Harris corner detector, SIFT, and descriptors like SIFT, SURF, BRIEF.
This document provides an overview of feature detection techniques in machine vision, including edge detection, the Canny edge detector, interest points, and the Harris corner detector. It describes how edge detection works by finding discontinuities in images using masks and correlation. It explains that the Canny edge detector is an optimal method that uses Gaussian smoothing and non-maximum suppression. Interest points are localized features useful for applications like image alignment, and the Harris corner detector computes gradients to find locations with dominant directions, identifying corners.
MetroScope is a digital image processing tool for complex shape metrology. It consists of an image processing engine, digital photo album, and image sharing framework. MetroScope can extract feature shapes, measure areas and line edge roughness. It is well-suited for mask manufacturing tasks like OPC characterization, defect metrology, and process evaluation. The document demonstrates MetroScope's capabilities in defect sizing, corner rounding quantification, and line edge roughness measurement.
1) The document discusses edge detection and the Hough transform for detecting lines and circles in images. It describes common edge detectors like Sobel, Canny, and LoG and explains how they work.
2) It then introduces the Hough transform as a method to detect lines and circles by having each edge point "vote" for possible lines and circles it could belong to in a parameter space.
3) The Hough transform reduces the complexity from quadratic to linear time compared to examining all pairs of edge points, and is less sensitive to gaps or noise in the edge points.
Analyzing color imaging failure on consumer-grade camerasSaiTedla1
There are many efforts to employ consumer-grade cameras for home-based health and wellness monitoring. Such applications rely on users to capture images for analysis using their personal cameras in a home environment. When color is a primary feature for diagnostic algorithms, the camera requires calibration to ensure accurate color measurements. Given the importance of these diagnostic tests for the users’ health and well-being, it is important to understand the conditions in which color calibration may fail. To this end, we analyzed a wide range of camera sensors and environmental lighting to determine (1) how often color calibration failure is likely to occur and (2) the underlying reasons for failure. Our analysis shows that it is rare to encounter a camera sensor and lighting condition combination that results in color imaging failure. Moreover, when color imaging does fail, the cause is almost always attributed to spectral poor environmental lighting and not the camera sensor. We believe this finding is useful for scientists and engineers developing color-based applications for use with consumer-grade cameras.
This document provides an outline for a seminar on computer graphics. It begins with basics of computer graphics including definitions, classifications, and principles. It then covers topics like computer-aided design, presentation graphics, computer art, entertainment, education and training, and visualization. Graphics devices, output primitives, displays, and input devices are discussed. Drawing points, lines, polygons, and transformations are explained. 3D concepts like parallel projection, perspective projection, and object representations are introduced. The document also covers color models, animations, graphics processing units, and the OpenGL graphics library. It provides examples of functions for initializing and creating windows in OpenGL.
At the end of this lesson, you should be able to;
describe spatial resolution
describe intensity resolution
identify the effect of aliasing
describe image interpolation
describe relationships among the pixels
The document discusses the fundamentals of image formation including how images are represented digitally in computers. It covers the image formation process involving the geometry and physics of light. Common image file formats and their storage are described as well as how to display, convert, and print images using software tools. Image sampling and quantization in the digitization process are also summarized.
This document discusses digital image processing and image interpolation techniques. It defines a digital image as a representation of a 2D image using pixels, and digital image processing as focusing on improving images for human interpretation and machine perception. The key stages of digital image processing are described as image acquisition, restoration, enhancement, segmentation, representation and description, object recognition, color processing, and compression. Common image interpolation techniques like nearest neighbor, linear, cubic, and B-spline interpolation are also summarized.
Urban 3D Semantic Modelling Using Stereo Vision, ICRA 2013Sunando Sengupta
1) Given a sequence of stereo images, the pipeline generates a dense 3D semantic model of the urban environment.
2) Depth maps are generated from stereo images and fused into a volumetric representation using camera poses from feature tracking.
3) Semantic segmentation of street view images is done using a CRF model, and labels are projected onto the 3D model faces to generate the semantic model.
4) The semantic model is evaluated by projecting it back to the input images and calculating metrics like recall and intersection over union. Future work includes real-time implementation and combining image and geometric context.
Enhance and quantify Microstructure using Machine LearningManthan Ambolkar
The document summarizes machine learning work done to enhance microstructure and Kikuchi pattern images obtained from electron backscatter diffraction (EBSD). Convolutional neural networks were trained using Pix2Pix to denoise and enhance input images. L2 loss was found to produce the best results by removing noise while maintaining straight band patterns. Feature identification in microstructures was also explored by labeling images and training a model to segment and identify features like twins, grains. Early results showed potential but more data is needed to identify thinner twins accurately.
This document discusses various spatial filtering methods used in image processing. Spatial filters are defined by their neighborhood, which is usually a square window, and their operation, which processes pixels in the neighborhood. Linear filters include correlation and convolution, where the output is a linear combination of input pixels. Common filters are smoothing (low-pass) filters like averaging and Gaussian, which reduce noise and detail, and sharpening (high-pass) filters like unsharp masking and derivatives, which enhance details like edges. Derivatives like the gradient and Laplacian are used to detect edges.
This chapter discusses image acquisition, including image sensors, representation of image data, and types of digital images. It describes how image sensors like vidicons and solid-state arrays are used to transform optical images into electrical signals. Digital images represent images as matrices of pixels, where each pixel is assigned an integer value representing brightness or color. Common types of digital images include binary, grayscale, color, and indexed color images.
Scale and object aware image retargeting for thumbnail browsingperillaroc
1) The document proposes a scale and object aware image retargeting method for generating thumbnail images that preserves important visual information.
2) It introduces the concepts of scale dependent saliency and objectness to identify important regions, and uses cyclic seam carving and thin-plate spline warping to retarget images in a way that minimally distorts objects and structures.
3) Experimental results show the method performs better than scaling and traditional seam carving methods in qualitative and quantitative evaluations for generating thumbnail images that match descriptions.
This document provides an overview of computer graphics systems. It discusses the basic components of a graphics system including input, computation, and output. For output, it describes raster display technologies like cathode ray tubes (CRTs) and liquid crystal displays (LCDs). It also discusses graphics memory and framebuffers for storing pixel color values, as well as color depth and dithering techniques. The goal of computer graphics is to solve the color function for each pixel on the display.
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on automated letter generation for Bonterra Impact Management using Google Workspace or Microsoft 365.
Interested in deploying letter generation automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Driving Business Innovation: Latest Generative AI Advancements & Success StorySafe Software
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
Building Production Ready Search Pipelines with Spark and MilvusZilliz
Spark is the widely used ETL tool for processing, indexing and ingesting data to serving stack for search. Milvus is the production-ready open-source vector database. In this talk we will show how to use Spark to process unstructured data to extract vector representations, and push the vectors to Milvus vector database for search serving.
Monitoring and Managing Anomaly Detection on OpenShift.pdfTosin Akinosho
Monitoring and Managing Anomaly Detection on OpenShift
Overview
Dive into the world of anomaly detection on edge devices with our comprehensive hands-on tutorial. This SlideShare presentation will guide you through the entire process, from data collection and model training to edge deployment and real-time monitoring. Perfect for those looking to implement robust anomaly detection systems on resource-constrained IoT/edge devices.
Key Topics Covered
1. Introduction to Anomaly Detection
- Understand the fundamentals of anomaly detection and its importance in identifying unusual behavior or failures in systems.
2. Understanding Edge (IoT)
- Learn about edge computing and IoT, and how they enable real-time data processing and decision-making at the source.
3. What is ArgoCD?
- Discover ArgoCD, a declarative, GitOps continuous delivery tool for Kubernetes, and its role in deploying applications on edge devices.
4. Deployment Using ArgoCD for Edge Devices
- Step-by-step guide on deploying anomaly detection models on edge devices using ArgoCD.
5. Introduction to Apache Kafka and S3
- Explore Apache Kafka for real-time data streaming and Amazon S3 for scalable storage solutions.
6. Viewing Kafka Messages in the Data Lake
- Learn how to view and analyze Kafka messages stored in a data lake for better insights.
7. What is Prometheus?
- Get to know Prometheus, an open-source monitoring and alerting toolkit, and its application in monitoring edge devices.
8. Monitoring Application Metrics with Prometheus
- Detailed instructions on setting up Prometheus to monitor the performance and health of your anomaly detection system.
9. What is Camel K?
- Introduction to Camel K, a lightweight integration framework built on Apache Camel, designed for Kubernetes.
10. Configuring Camel K Integrations for Data Pipelines
- Learn how to configure Camel K for seamless data pipeline integrations in your anomaly detection workflow.
11. What is a Jupyter Notebook?
- Overview of Jupyter Notebooks, an open-source web application for creating and sharing documents with live code, equations, visualizations, and narrative text.
12. Jupyter Notebooks with Code Examples
- Hands-on examples and code snippets in Jupyter Notebooks to help you implement and test anomaly detection models.
Fueling AI with Great Data with Airbyte WebinarZilliz
This talk will focus on how to collect data from a variety of sources, leveraging this data for RAG and other GenAI use cases, and finally charting your course to productionalization.
Introduction of Cybersecurity with OSS at Code Europe 2024Hiroshi SHIBATA
I develop the Ruby programming language, RubyGems, and Bundler, which are package managers for Ruby. Today, I will introduce how to enhance the security of your application using open-source software (OSS) examples from Ruby and RubyGems.
The first topic is CVE (Common Vulnerabilities and Exposures). I have published CVEs many times. But what exactly is a CVE? I'll provide a basic understanding of CVEs and explain how to detect and handle vulnerabilities in OSS.
Next, let's discuss package managers. Package managers play a critical role in the OSS ecosystem. I'll explain how to manage library dependencies in your application.
I'll share insights into how the Ruby and RubyGems core team works to keep our ecosystem safe. By the end of this talk, you'll have a better understanding of how to safeguard your code.
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfMalak Abu Hammad
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
5th LF Energy Power Grid Model Meet-up SlidesDanBrown980551
5th Power Grid Model Meet-up
It is with great pleasure that we extend to you an invitation to the 5th Power Grid Model Meet-up, scheduled for 6th June 2024. This event will adopt a hybrid format, allowing participants to join us either through an online Mircosoft Teams session or in person at TU/e located at Den Dolech 2, Eindhoven, Netherlands. The meet-up will be hosted by Eindhoven University of Technology (TU/e), a research university specializing in engineering science & technology.
Power Grid Model
The global energy transition is placing new and unprecedented demands on Distribution System Operators (DSOs). Alongside upgrades to grid capacity, processes such as digitization, capacity optimization, and congestion management are becoming vital for delivering reliable services.
Power Grid Model is an open source project from Linux Foundation Energy and provides a calculation engine that is increasingly essential for DSOs. It offers a standards-based foundation enabling real-time power systems analysis, simulations of electrical power grids, and sophisticated what-if analysis. In addition, it enables in-depth studies and analysis of the electrical power grid’s behavior and performance. This comprehensive model incorporates essential factors such as power generation capacity, electrical losses, voltage levels, power flows, and system stability.
Power Grid Model is currently being applied in a wide variety of use cases, including grid planning, expansion, reliability, and congestion studies. It can also help in analyzing the impact of renewable energy integration, assessing the effects of disturbances or faults, and developing strategies for grid control and optimization.
What to expect
For the upcoming meetup we are organizing, we have an exciting lineup of activities planned:
-Insightful presentations covering two practical applications of the Power Grid Model.
-An update on the latest advancements in Power Grid -Model technology during the first and second quarters of 2024.
-An interactive brainstorming session to discuss and propose new feature requests.
-An opportunity to connect with fellow Power Grid Model enthusiasts and users.
Ivanti’s Patch Tuesday breakdown goes beyond patching your applications and brings you the intelligence and guidance needed to prioritize where to focus your attention first. Catch early analysis on our Ivanti blog, then join industry expert Chris Goettl for the Patch Tuesday Webinar Event. There we’ll do a deep dive into each of the bulletins and give guidance on the risks associated with the newly-identified vulnerabilities.
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on integration of Salesforce with Bonterra Impact Management.
Interested in deploying an integration with Salesforce for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUpanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-und-domino-lizenzkostenreduzierung-in-der-welt-von-dlau/
DLAU und die Lizenzen nach dem CCB- und CCX-Modell sind für viele in der HCL-Community seit letztem Jahr ein heißes Thema. Als Notes- oder Domino-Kunde haben Sie vielleicht mit unerwartet hohen Benutzerzahlen und Lizenzgebühren zu kämpfen. Sie fragen sich vielleicht, wie diese neue Art der Lizenzierung funktioniert und welchen Nutzen sie Ihnen bringt. Vor allem wollen Sie sicherlich Ihr Budget einhalten und Kosten sparen, wo immer möglich. Das verstehen wir und wir möchten Ihnen dabei helfen!
Wir erklären Ihnen, wie Sie häufige Konfigurationsprobleme lösen können, die dazu führen können, dass mehr Benutzer gezählt werden als nötig, und wie Sie überflüssige oder ungenutzte Konten identifizieren und entfernen können, um Geld zu sparen. Es gibt auch einige Ansätze, die zu unnötigen Ausgaben führen können, z. B. wenn ein Personendokument anstelle eines Mail-Ins für geteilte Mailboxen verwendet wird. Wir zeigen Ihnen solche Fälle und deren Lösungen. Und natürlich erklären wir Ihnen das neue Lizenzmodell.
Nehmen Sie an diesem Webinar teil, bei dem HCL-Ambassador Marc Thomas und Gastredner Franz Walder Ihnen diese neue Welt näherbringen. Es vermittelt Ihnen die Tools und das Know-how, um den Überblick zu bewahren. Sie werden in der Lage sein, Ihre Kosten durch eine optimierte Domino-Konfiguration zu reduzieren und auch in Zukunft gering zu halten.
Diese Themen werden behandelt
- Reduzierung der Lizenzkosten durch Auffinden und Beheben von Fehlkonfigurationen und überflüssigen Konten
- Wie funktionieren CCB- und CCX-Lizenzen wirklich?
- Verstehen des DLAU-Tools und wie man es am besten nutzt
- Tipps für häufige Problembereiche, wie z. B. Team-Postfächer, Funktions-/Testbenutzer usw.
- Praxisbeispiele und Best Practices zum sofortigen Umsetzen
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
Dive into the realm of operating systems (OS) with Pravash Chandra Das, a seasoned Digital Forensic Analyst, as your guide. 🚀 This comprehensive presentation illuminates the core concepts, types, and evolution of OS, essential for understanding modern computing landscapes.
Beginning with the foundational definition, Das clarifies the pivotal role of OS as system software orchestrating hardware resources, software applications, and user interactions. Through succinct descriptions, he delineates the diverse types of OS, from single-user, single-task environments like early MS-DOS iterations, to multi-user, multi-tasking systems exemplified by modern Linux distributions.
Crucial components like the kernel and shell are dissected, highlighting their indispensable functions in resource management and user interface interaction. Das elucidates how the kernel acts as the central nervous system, orchestrating process scheduling, memory allocation, and device management. Meanwhile, the shell serves as the gateway for user commands, bridging the gap between human input and machine execution. 💻
The narrative then shifts to a captivating exploration of prominent desktop OSs, Windows, macOS, and Linux. Windows, with its globally ubiquitous presence and user-friendly interface, emerges as a cornerstone in personal computing history. macOS, lauded for its sleek design and seamless integration with Apple's ecosystem, stands as a beacon of stability and creativity. Linux, an open-source marvel, offers unparalleled flexibility and security, revolutionizing the computing landscape. 🖥️
Moving to the realm of mobile devices, Das unravels the dominance of Android and iOS. Android's open-source ethos fosters a vibrant ecosystem of customization and innovation, while iOS boasts a seamless user experience and robust security infrastructure. Meanwhile, discontinued platforms like Symbian and Palm OS evoke nostalgia for their pioneering roles in the smartphone revolution.
The journey concludes with a reflection on the ever-evolving landscape of OS, underscored by the emergence of real-time operating systems (RTOS) and the persistent quest for innovation and efficiency. As technology continues to shape our world, understanding the foundations and evolution of operating systems remains paramount. Join Pravash Chandra Das on this illuminating journey through the heart of computing. 🌟
2. Outline
What is CBIR?
Image Features
Feature Weighting and Relevance Feedback
User Interface and Visualization
3. What is Content-based Image Retrieval (CBIR)?
Image search systems that search for images by image content
<-> Keyword-based image/video retrieval (e.g., Google Image Search, YouTube)
4. Applications of CBIR
Consumer Digital Photo Albums
Digital Cameras
Flickr
Medical Images
Digital Museums
Trademark Search
MPEG-7 Content Descriptors
5. Basic Components of CBIR
Feature Extractor: creates the metadata
Query Engine: calculates similarity
User Interface
6. How does CBIR work?
Extract features from images
Let the user make a query
Query by Sketch
Query by Keywords
Query by Example
Refine the result by relevance feedback
Give feedback on the previous result
7. Query by Example
Pick example images, then ask the system to retrieve "similar" images.
[Figure: a query sample is sent to the CBIR system ("Get similar images"), which returns the results]
8. Relevance Feedback
The user gives feedback on the query results
The system recalculates feature weights
[Figure: an initial sample produces a 1st result; user feedback refines it into a 2nd result]
9. Basic Components of CBIR
Feature Extractor: creates the metadata
Query Engine: calculates similarity
User Interface
11. Color Features
Which Color Space?
RGB, CMY, YCrCb, CIE, YIQ, HLS, …
Our Favorite is HSV
Designed to be similar to human perception
12. HSV Color Space
H (Hue)
Dominant color (spectral)
S (Saturation)
Amount of white
V (Value)
Brightness
How to Use This?
13. A straightforward way to use HSV as color features
Build a histogram for each of H, S, and V
Then compare bin by bin
Is this a good idea?
14. Are these two that different?
Histogram comparison is very sensitive
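To make that sensitivity concrete, here is a minimal sketch (the bin count and hue values are hypothetical, not from the slides) showing how a tiny hue shift across a bin edge maximizes the per-bin histogram distance:

```python
import numpy as np

# Hypothetical example: two "images" whose hue values differ by only
# 0.0002, but the values straddle the bin edge at 0.5, so every pixel
# lands in a different bin.
bins = np.linspace(0.0, 1.0, 9)          # 8 equal-width hue bins
img_a = np.full(1000, 0.4999)            # hue just below the edge
img_b = np.full(1000, 0.5001)            # hue just above the edge
hist_a, _ = np.histogram(img_a, bins)
hist_b, _ = np.histogram(img_b, bins)
# Normalized L1 distance: 2.0 is the maximum possible value, even
# though the two images are visually indistinguishable.
l1 = np.abs(hist_a - hist_b).sum() / 1000
```

This is why the deck moves on to color moments, which vary smoothly with small color shifts.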
15. Color Moments [Stricker '95]
For each image, the color distribution in each of H, S, and V is calculated.
1st (mean), 2nd (variance), and 3rd moment for each channel:

$E_i = \frac{1}{N}\sum_{j=1}^{N} p_{ij}$

$\sigma_i = \left(\frac{1}{N}\sum_{j=1}^{N} (p_{ij} - E_i)^2\right)^{1/2}$

$s_i = \left(\frac{1}{N}\sum_{j=1}^{N} (p_{ij} - E_i)^3\right)^{1/3}$

$i$: color channel ($i = h, s, v$); $N$: number of pixels in the image
Total: 9 features
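The nine moments can be computed directly from an HSV array; a minimal numpy sketch (the function name and array layout are this example's assumptions, not from the slides):

```python
import numpy as np

def color_moments(hsv):
    """First three moments per HSV channel: E_i, sigma_i, s_i.

    hsv: float array of shape (height, width, 3).
    Returns a 9-dimensional feature vector (3 moments x 3 channels).
    """
    feats = []
    for i in range(3):                           # i = H, S, V channel
        p = hsv[:, :, i].ravel()
        e = p.mean()                             # E_i: 1st moment (mean)
        sigma = np.sqrt(((p - e) ** 2).mean())   # sigma_i: 2nd moment
        s = np.cbrt(((p - e) ** 3).mean())       # s_i: signed cube root of 3rd moment
        feats += [e, sigma, s]
    return np.array(feats)
```

The signed cube root keeps the skewness term real even when the third central moment is negative.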
16. Shape Features
Region-Based Shape
Outer Boundary
Contour-Based Shape
Features of the contour
Edge-Based Shape
Ex. histogram of edge length and orientation
17. Region-based vs. Contour-based
Region-based: suitable for complex objects with disjoint regions
Contour-based: suitable when the semantics are contained in the contour
20. Angular Radial Transformation (ART) [Kim '99]
A region-based shape descriptor
Calculate the coefficients from the image intensities in polar coordinates ($n < 3$, $m < 12$):

$F_{nm} = \int_0^{2\pi} \int_0^1 V_{nm}^{*}(\rho,\theta)\, f(\rho,\theta)\, \rho\, d\rho\, d\theta$

$f(\rho,\theta)$: image intensity in polar coordinates
$V_{nm}(\rho,\theta)$: ART basis function

$V_{nm}(\rho,\theta) = \frac{1}{2\pi} \exp(jm\theta)\, R_n(\rho)$

$R_n(\rho) = \begin{cases} 1 & n = 0 \\ 2\cos(\pi n \rho) & n \neq 0 \end{cases}$

Total: 35 coefficients in 140 bits (4 bits/coeff)
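The ART integral can be approximated by sampling the basis functions at pixel centers inside the unit disc. A rough numerical sketch, assuming a square image and a simple Cartesian-grid discretization (this scheme and the function name are illustrative, not the MPEG-7 reference implementation):

```python
import numpy as np

def art_coefficients(img, n_max=3, m_max=12):
    """Numerically approximate ART coefficients F_nm for a square image.

    img: 2-D float array; pixels outside the inscribed unit disc are ignored.
    Returns a complex (n_max, m_max) array.
    """
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    x = 2.0 * (xs + 0.5) / w - 1.0          # pixel centers mapped to [-1, 1]
    y = 2.0 * (ys + 0.5) / h - 1.0
    rho = np.hypot(x, y)
    theta = np.arctan2(y, x)
    inside = rho <= 1.0                     # restrict to the unit disc
    pixel_area = 4.0 / (h * w)              # dx dy, which equals rho drho dtheta
    F = np.zeros((n_max, m_max), dtype=complex)
    for n in range(n_max):
        # radial basis R_n: 1 for n = 0, else 2 cos(pi n rho)
        Rn = np.ones_like(rho) if n == 0 else 2.0 * np.cos(np.pi * n * rho)
        for m in range(m_max):
            V = Rn * np.exp(1j * m * theta) / (2.0 * np.pi)
            F[n, m] = np.sum(np.conj(V[inside]) * img[inside]) * pixel_area
    return F
```

For a constant image, only the $F_{00}$ term is appreciably nonzero, which matches the intuition that the higher-order coefficients respond to angular and radial variation.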
21. Curvature Scale-Space (CSS) [Mokhtarian '92]
A contour-based shape descriptor
1) Apply a lowpass filter repeatedly until the concave contours are smoothed out
2) "How the contours are filtered" becomes the feature:
the zero crossings in the curvature function after each application of the lowpass filter (the CSS image)
22. CSS Image
Tracks the zero-crossing locations of each concavity in the contour
[Figure: a contour with concavities A-F and its curvature function; the zero-crossing pairs shrink and disappear as the number of filter iterations grows from 3 to 29 to 100]
23. CSS Features
Number of peaks in the CSS image
Highest peak
Circularity (perimeter² / area)
Eccentricity
Etc.
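Two of these scalar features are easy to compute from a polygonal contour. A small sketch (using the shoelace formula for area and a covariance-based eccentricity; the function name and input layout are this example's choices):

```python
import numpy as np

def shape_stats(contour):
    """Circularity (perimeter^2 / area) and eccentricity of a closed contour.

    contour: (N, 2) array of (x, y) points, ordered around the shape;
    the contour is closed implicitly from the last point back to the first.
    """
    x, y = contour[:, 0], contour[:, 1]
    xn, yn = np.roll(x, -1), np.roll(y, -1)
    perimeter = np.sum(np.hypot(xn - x, yn - y))
    area = 0.5 * abs(np.sum(x * yn - xn * y))   # shoelace formula
    circularity = perimeter ** 2 / area          # minimized (4*pi) by a circle
    # eccentricity from the second central moments of the contour points
    lam = np.sort(np.linalg.eigvalsh(np.cov(contour.T)))  # lam[0] <= lam[1]
    eccentricity = np.sqrt(1.0 - lam[0] / lam[1])
    return circularity, eccentricity
```

A unit square gives circularity 16 (versus the circle's minimum of about 12.57) and eccentricity 0, since its point spread is isotropic.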
25. Wavelet Filter Bank
[Figure: a wavelet filter splits the original image into coarse information (low frequency) and detail (high frequency)]
26. Texture Features from Wavelet
A wavelet filter bank decomposes the image into frequency subbands; the mean $f_i$ and variance $v_i$ of each subband become the texture features.
[Figure: original image -> wavelet filter bank -> subbands labeled $(f_0, v_0)$ through $(f_6, v_6)$ -> feature extraction]
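One level of such a decomposition can be sketched with a plain (unnormalized) Haar step in numpy; the function name and the even-size assumption are mine:

```python
import numpy as np

def haar_texture_features(img):
    """Mean and variance of each subband after one Haar wavelet level.

    img: 2-D float array with even height and width (an assumption of
    this sketch). Returns [(f_i, v_i)] for the LL, LH, HL, HH subbands.
    """
    a = img[0::2, :] + img[1::2, :]      # sums of adjacent rows (lowpass)
    d = img[0::2, :] - img[1::2, :]      # differences of adjacent rows (highpass)
    ll = (a[:, 0::2] + a[:, 1::2]) / 2   # low-low: the coarse information
    lh = (a[:, 0::2] - a[:, 1::2]) / 2
    hl = (d[:, 0::2] + d[:, 1::2]) / 2
    hh = (d[:, 0::2] - d[:, 1::2]) / 2   # high-high: the finest detail
    return [(band.mean(), band.var()) for band in (ll, lh, hl, hh)]
```

Applying the same step again to the LL band yields the deeper subbands ($f_4$ onward in the slide's labeling).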
27. Other approaches: Region-Based
Global features often fail to capture local content in an image.
[Figure: a pastoral scene whose global description, via color, texture, and shape, is {green, grassy, hillside}. No sheep? No fence? No houses?]
28. Other approaches: Region-Based
Segmentation-Based
Images are segmented by color/texture similarities: Blobworld [Carson '99], Netra [Ma and Manjunath '99]
Grid-Based
Images are partitioned into blocks, and features are calculated per block: [Tian '00], [Moghaddam '99]
30. Basic Components of CBIR
Feature Extractor: creates the metadata
Query Engine: calculates similarity
User Interface
31. Now we have many features (too many?)
How do we express visual "similarity" with these features?
32. Visual Similarity?
"Similarity" is subjective and context-dependent.
"Similarity" is a high-level concept: cars, flowers, …
But our features are low-level features.
Semantic gap!
33. Which features are most important?
Not all features are always important.
The "similarity" measure is always changing.
The system has to weight the features on the fly. How?
34. Online Feature Weighting
Approach #1 - Manual
Ask the user to specify the numbers: "35% color and 50% texture…"
Very difficult to determine the numbers
Approach #2 - Automatic
Learn the feature weights from examples: relevance feedback
35. Online Feature Weighting
From the query examples, the system determines the feature weighting matrix W:

$\mathrm{distance}(\vec{x}, \vec{y}) = (\vec{x} - \vec{y})^{T} W (\vec{x} - \vec{y})$

[Figure: query examples -> CBIR system ("Calculate W") -> result]
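A direct reading of the distance formula, with hypothetical 3-dimensional feature vectors and a diagonal W (both made up for illustration) to show how the weighting changes the distance:

```python
import numpy as np

def weighted_distance(x, y, W):
    """Quadratic-form distance: (x - y)^T W (x - y)."""
    d = x - y
    return float(d @ W @ d)

# Hypothetical feature vectors differing in features 0 and 2.
x = np.array([1.0, 0.0, 0.0])
y = np.array([0.0, 0.0, 1.0])
W_uniform = np.eye(3)                 # all features weighted equally
W_texture = np.diag([0.1, 1.0, 3.0])  # downweight feature 0, emphasize feature 2
```

With `W_uniform` the distance is 2.0; with `W_texture` the same pair of vectors is 3.1 apart, because the difference in feature 2 now dominates.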
36. How to Calculate W?
No negative examples (1-class)
Positive and negative examples (2-class)
One positive and many negative classes ((1+x)-class)
Many positive and many negative classes ((x+y)-class)
37. When only relevant images are available…
We want to give more weight to the features the example images have in common.
Use the variance:
features with low variance -> common features -> give higher weight
38. One-Class Relevance Feedback in MARS [Rui '98]
Calculate the variance among the relevant examples.
The inverse of the variance becomes the weight of each feature.
This means "common features" among the positive examples get larger weights.

$W = \mathrm{diag}\left(1/\sigma_1^2,\ 1/\sigma_2^2,\ 1/\sigma_3^2,\ \ldots,\ 1/\sigma_k^2\right)$

W is a k x k diagonal matrix
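The MARS-style weighting fits in a few lines; a sketch assuming feature vectors stacked in rows (the `eps` regularizer for zero-variance features is my addition, not from the slide):

```python
import numpy as np

def mars_weights(positives, eps=1e-6):
    """Diagonal W from relevant examples: w_i = 1 / sigma_i^2.

    positives: (n_examples, k) feature vectors the user marked relevant.
    Features with low variance across the positives (what the relevant
    images have in common) get large weights; eps avoids division by zero.
    """
    var = positives.var(axis=0)
    return np.diag(1.0 / (var + eps))
```

The resulting W plugs straight into the quadratic-form distance of the previous slide.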
39. Relevance Feedback as a Two-Class Problem (positive and negative)
Fisher's Discriminant Analysis (FDA)
Find a W that:
minimizes the scatter of each class cluster (within scatter)
maximizes the scatter between the clusters (between scatter)
[Figure: positive and negative clusters]
40. Two-Class Problem
Target function (W is a full matrix):

$W = \arg\max_{W} \frac{W^{T} S_B W}{W^{T} S_W W}$

$S_B$: between-scatter matrix
$S_W$: within-scatter matrix

$S_W = \sum_{i=1}^{2} \sum_{j \in \text{group } i} (x_j - m_i)(x_j - m_i)^{T}$

$S_B = (m_1 - m_2)(m_1 - m_2)^{T}$

$m_1, m_2$: mean of each class
41. Solution
The problem reduces to a generalized eigenvalue problem:

$S_B w_i = \lambda_i S_W w_i$

$W = \Phi \Lambda^{1/2}$

$\Lambda$: diagonal matrix of eigenvalues
$\Phi$: eigenvectors
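The whole FDA weighting, including the $W = \Phi \Lambda^{1/2}$ step, fits in a short numpy sketch. The small ridge added to $S_W$ for invertibility, and solving the generalized problem via $S_W^{-1} S_B$, are this example's assumptions:

```python
import numpy as np

def fda_weights(pos, neg, ridge=1e-6):
    """Full-matrix W from Fisher's discriminant: solve S_B w = lambda S_W w.

    pos, neg: (n, k) arrays of positive / negative feature vectors.
    """
    m1, m2 = pos.mean(axis=0), neg.mean(axis=0)
    k = pos.shape[1]
    Sw = np.zeros((k, k))
    for cls, m in ((pos, m1), (neg, m2)):        # within-class scatter
        for x in cls:
            Sw += np.outer(x - m, x - m)
    Sb = np.outer(m1 - m2, m1 - m2)              # between-class scatter
    # reduce to an ordinary eigenproblem on S_W^{-1} S_B (S_W ridge-regularized)
    evals, evecs = np.linalg.eig(np.linalg.inv(Sw + ridge * np.eye(k)) @ Sb)
    lam = np.clip(evals.real, 0.0, None)         # Lambda: eigenvalues
    return evecs.real @ np.diag(np.sqrt(lam))    # W = Phi Lambda^{1/2}
```

On toy data where only the first feature separates the classes, the learned W stretches distances along that feature and collapses the other.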
42. From Two-Class to (1+x)-Class
Positive examples are usually from one class, such as "flower".
Negative examples can be from any classes, such as "car", "elephant", "orange", …
It is not desirable to treat the negative images as one class.
43. RF as (1+x)-Class Problem
• Biased Discriminant Analysis [Zhou et al. ‘01]
• Negative examples can be any images
• Each negative image forms its own group
S_W = Σ_{x ∈ positive} (x − m)(x − m)^T
S_B = Σ_{x ∈ negative} (x − m)(x − m)^T
m … mean of the positive class
The solution is similar to FDA.
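The two BDA scatter matrices can be sketched directly from the formulas above (NumPy assumed; the function name is mine):

```python
import numpy as np

def bda_scatter(pos, neg):
    """Biased Discriminant Analysis scatter matrices.
    Both are centered on the positive-class mean m, so each
    negative image is treated relative to the positives only."""
    m = pos.mean(axis=0)
    Sw = (pos - m).T @ (pos - m)   # positives around their own mean
    Sb = (neg - m).T @ (neg - m)   # negatives around the positive mean
    return Sw, Sb
```

The asymmetry is the point: only the positive class is assumed to be compact, while negatives may lie anywhere.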
45. RF as (x+y)-Class Problem
Group BDA [Nakazato, Dagli ‘03]
Multiple positive classes
Scattered negative classes
S_W = Σ_i Σ_{x ∈ positive class i} (x − m_i)(x − m_i)^T
S_B = Σ_i Σ_{y ∈ negative} (y − m_i)(y − m_i)^T
m_i … mean of positive class i
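A sketch of the Group BDA scatters (NumPy; pos_groups as a list of arrays, one per positive group, is my assumed input format):

```python
import numpy as np

def gbda_scatter(pos_groups, neg):
    """Group BDA: S_W sums each positive group's scatter around its
    own mean; S_B sums the negatives' scatter around every
    positive-group mean."""
    k = neg.shape[1]
    Sw = np.zeros((k, k))
    Sb = np.zeros((k, k))
    for group in pos_groups:
        mi = group.mean(axis=0)
        Sw += (group - mi).T @ (group - mi)
        Sb += (neg - mi).T @ (neg - mi)
    return Sw, Sb
```

With a single positive group this reduces to the BDA scatters of the previous slide.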
46. Basic Components of CBIR
Feature Extractor: creates the metadata
Query Engine: calculates similarity
User Interface
47. User Interface and Visualization
Basic GUI
Direct Manipulation GUI
El Nino [UC San Diego]
Image Grouper [Nakazato and Huang]
3D Virtual Reality Display
48. Traditional GUI for Relevance
Feedback
The user selects relevant images
with a slider or checkbox.
If good images are found, the user
adds them to the query.
When there are no more images to
add, the search has converged.
49. ImageGrouper [Nakazato and Huang]
Query by Groups
Make a query by creating groups of images.
Easier to try different combinations of
query sets (trial-and-error queries).
51. Note
Trial-and-error queries are very important
because
image similarity is subjective and context-
dependent.
In addition, we are using low-level image features
(the semantic gap).
Thus, it is VERY difficult to express the user's
concept with these features.
52. Image Retrieval in 3D
Image retrieval and browsing in 3D
virtual reality
The user can see more images without
occlusion.
Query results can be displayed according to
various criteria:
by color features, by texture, or by a
combination of color and texture.
53. 3D MARS
[Screenshots: initial display and query result, with images laid out
along color, texture, and structure axes]
54. 3D MARS in CAVE™
Shutter glasses for an immersive 3D
experience
Click and drag images with the wand
Fly through with the joystick
55. Demos
Traditional GUI
IBM QBIC
• http://wwwqbic.almaden.ibm.com/
UIUC MARS
• http://chopin.ifp.uiuc.edu:8080
ImageGrouper
• http://www.ifp.uiuc.edu/~nakazato/grouper
56. References (Image Features)
Bober, M., “MPEG-7 Visual Descriptors,” In IEEE Transactions on Circuits and
Systems for Video Technology, Vol. 11, No. 6, June 2001.
Stricker, M. and Orengo, M., “Similarity of Color Images,” In Proceedings of SPIE,
Vol. 2420 (Storage and Retrieval of Image and Video Databases III), SPIE Press,
Feb. 1995.
Zhou, X. S. and Huang, T. S., “Edge-based structural features for content-based
image retrieval,” Pattern Recognition Letters, Special issue on Image and Video
Indexing, 2000.
Smith, J. R. and Chang S-F. Transform features for texture classification and
discrimination in large image databases. In Proceedings of IEEE Intl. Conf. on
Image Processing, 1994.
Smith J. R. and Chang S-F. “Quad-Tree Segmentation for Texture-based Image
Query.” In Proceedings of ACM 2nd International Conference on Multimedia, 1994.
Dagli, C. K. and Huang, T.S., “A Framework for Color Grid-Based Image Retrieval,”
In Proceedings of International Conference on Pattern Recognition, 2004.
Tian, Q. et al., “Combine user defined region-of-interest and spatial layout in image
retrieval,” In Proceedings of IEEE Intl. Conf. on Image Processing, 2000.
Moghaddam, B. et al., “Defining image content with multiple regions-of-interest,” In
IEEE Workshop on Content-Based Access of Image and Video Libraries, 1999.
57. References (Relevance Feedback)
Rui, Y., et al., “Relevance Feedback: A Power Tool for
Interactive Content-Based Image Retrieval,” In IEEE
Trans. on Circuits and Systems for Video Technology,
Vol. 8, No. 5, Sept. 1998.
Zhou, X. S., Petrovic, N. and Huang, T. S. “Comparing
Discriminating Transformations and SVM for Learning
during Multimedia Retrieval.” In Proceedings of ACM
Multimedia ‘01, 2001.
Ishikawa, Y., Subramanya, R. and Faloutsos, C.,
“MindReader: Querying databases through multiple
examples,” In Proceedings of the 24th VLDB
Conference, 1998.
58. References (User Interfaces and
Visualizations)
Nakazato, M. and Huang, T. S. “3D MARS: Immersive Virtual
Reality for Content-Based Image Retrieval.“ In Proceedings of
2001 IEEE International Conference on Multimedia and Expo
(ICME2001), Tokyo, August 22-25, 2001
Nakazato, M., Manola, L. and Huang, T.S., “ImageGrouper: Search,
Annotate and Organize Images by Groups,” In Proc. of 5th Intl.
Conf. On Visual Information Systems (VIS’02), 2002.
Nakazato, M., Dagli C.K., and Huang T.S., “Evaluating Group-Based
Relevance Feedback for Content-Based Image Retrieval,” In
Proceedings of International Conference on Image Processing,
2003.
Santini, S. and Jain, R., “Integrated Browsing and Querying for
Image Database,” IEEE Multimedia, Vol. 7, No. 3, 2000,
pp. 26-39.