Project conducted in fulfilment of the M.Eng. degree. Describes a novel computational approach to construct a 3D rendering of cheese from a limited number of X-ray projections.
Efficient Model-based 3D Tracking by Using Direct Image Registration - Enrique Muñoz Corral
This thesis deals with the problem of efficiently tracking 3D objects in sequences of images. We tackle the efficient 3D tracking problem by using direct image registration. This problem is posed as an iterative optimization procedure that minimizes a brightness error norm. We review the most popular iterative methods for image registration in the literature, turning our attention to those algorithms that use efficient optimization techniques. Two forms of efficient registration algorithms are investigated. The first type comprises the additive registration algorithms: these algorithms incrementally compute the motion parameters by linearly approximating the brightness error function. We centre our attention on Hager and Belhumeur’s factorization-based algorithm for image registration. We propose a fundamental requirement that factorization-based algorithms must satisfy to guarantee good convergence, and introduce a systematic procedure that automatically computes the factorization. Finally, we also present two warp functions, for registering rigid and nonrigid 3D targets, that satisfy the requirement. The second type comprises the compositional registration algorithms, where the brightness error function is written by using function composition. We study the current approaches to compositional image alignment, and we emphasize the importance of the Inverse Compositional method, which is known to be the most efficient image registration algorithm. We introduce a new algorithm, the Efficient Forward Compositional image registration: this algorithm avoids the need to invert the warping function, and provides a new interpretation of the working mechanisms of the inverse compositional alignment. Using this insight, we propose two fundamental requirements that guarantee the convergence of compositional image registration methods. Finally, we support our claims with extensive experimental testing on synthetic and real-world data. We propose a distinction between image registration and tracking when using efficient algorithms. We show that, depending on whether the fundamental requirements hold, some efficient algorithms are eligible for image registration but not for tracking.
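As a concrete illustration of the additive, brightness-error-minimizing setup described above, the sketch below implements a plain Lucas-Kanade-style Gauss-Newton registration for a pure 2D translation. It is a minimal illustration only, not the thesis's factorization-based or compositional algorithms, and all names are hypothetical.

```python
# Minimal sketch: additive, Lucas-Kanade-style registration of a template to an
# image under a pure 2D translation, minimizing the brightness error with
# Gauss-Newton steps. Not the thesis's factorization or compositional methods.
import numpy as np
from scipy.ndimage import map_coordinates

def register_translation(image, template, p=np.zeros(2), iters=20):
    """Estimate translation p = (dy, dx) so that image(x + p) ~= template(x)."""
    ys, xs = np.mgrid[0:template.shape[0], 0:template.shape[1]]
    for _ in range(iters):
        # Warp the image with the current parameters (bilinear sampling).
        warped = map_coordinates(image, [ys + p[0], xs + p[1]], order=1)
        error = (template - warped).ravel()              # brightness error
        gy, gx = np.gradient(warped)                     # image gradients
        J = np.stack([gy.ravel(), gx.ravel()], axis=1)   # Jacobian wrt (dy, dx)
        dp, *_ = np.linalg.lstsq(J, error, rcond=None)   # Gauss-Newton step
        p = p + dp                                       # additive update
        if np.linalg.norm(dp) < 1e-4:
            break
    return p
```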
This document summarizes Cameron Ellum's PhD dissertation which examines new strategies for integrating GNSS and photogrammetric data. It introduces two new integration strategies: 1) Inter-processor communication between a GNSS Kalman filter and photogrammetric bundle adjustment, allowing two-way information flow. 2) A combined least-squares adjustment of both GNSS and photogrammetric observations, integrating the measurements at the lowest level within a single processor. Testing showed the first strategy can help GNSS positioning after outages but may not improve mapping accuracy. The combined adjustment demonstrated how photogrammetric control can replace a fixed GNSS base and how using partial GNSS observations can help constrain error growth, though exposure positions were
This document describes a dissertation that aims to improve 3D stereo reconstruction of human faces by combining it with a generic morphable face model. The dissertation first discusses background topics like facial landmark annotation, 3D morphable face models, texture representation, stereo reconstruction and face model deformation. It then describes the proposed scheme which involves steps like landmark annotation, pose estimation, shape fitting, texture extraction, stereo reconstruction from image pairs and deformation of the face model. The results show that fusing the stereo reconstruction with a single image reconstruction using a morphable model leads to a more accurate 3D face model compared to using either method alone. Finally, the deformed face model is visualized on a smartphone using a cardboard viewer.
This thesis presents methods for the automated localisation of organs in fetal magnetic resonance imaging (MRI) to enable automated preprocessing for motion correction. The first method localises the fetal brain independently of orientation using a Viola-Jones detector followed by classification of image regions with bundled SIFT features. This localisation of the brain is then used to steer the localisation of the heart, lungs and liver using segmentation with autocontext random forests and random forests with steerable features. Evaluation shows the brain localisation and segmentation performs as well as manual preprocessing. Preliminary results on motion correction of the fetal thorax using the heart, lung and liver localisation are also presented.
Fabric Defect Detection in Frequency Domain Using Fourier Analysis - Gokay Titrek
The document presents a method for fabric defect detection in the frequency domain using Fourier analysis. It proposes avoiding computationally expensive machine learning techniques by extracting a template from the fabric's repeating patterns. Defects are then detected by comparing test images to the template in the frequency domain using the Fourier transform and normalized cross-correlation. The method is shown to enable online and fully automated real-time defect detection for textiles. Experimental results on a collected dataset demonstrate the approach.
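As a rough illustration of the frequency-domain comparison, the sketch below scores a test patch against a defect-free template patch with FFT-based, zero-mean cross-correlation; the threshold and function names are assumptions, not values from the paper.

```python
# Minimal sketch: compare a test patch against a defect-free template of the same
# size by FFT-based, zero-mean cross-correlation; a low peak score flags a
# candidate defect. Threshold and patch handling are illustrative assumptions.
import numpy as np

def ncc_score(test_patch, template_patch):
    a = test_patch - test_patch.mean()
    b = template_patch - template_patch.mean()
    # Circular cross-correlation computed in the frequency domain.
    corr = np.fft.ifft2(np.fft.fft2(a) * np.conj(np.fft.fft2(b))).real
    return corr.max() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def is_defective(test_patch, template_patch, threshold=0.6):
    # Patches whose best correlation with the repeating-pattern template falls
    # below the threshold are reported as defective.
    return ncc_score(test_patch, template_patch) < threshold
```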
This document describes a research project aimed at extracting the cortical surface and separating the hemispheres in MRI datasets using 3D image segmentation techniques. For cortical surface extraction, a conditional dilation approach is used to "open" closed cavities in the segmented cortex to obtain a surface with hollow sphere topology. For hemisphere separation, marker volumes are defined and dilated to grow segmentation masks for each hemisphere, addressing challenges like marker volumes growing into each other. Experimental results demonstrate the feasibility of the proposed approaches.
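The masked-dilation idea can be illustrated with scipy: the sketch below grows two marker volumes alternately inside a brain mask while excluding voxels already claimed by the other hemisphere. Array names and the iteration cap are assumptions, not the project's actual implementation.

```python
# Minimal sketch of marker-based conditional (masked) dilation for hemisphere
# separation: two seed volumes grow alternately inside the brain mask, and voxels
# already claimed by the other hemisphere are excluded.
import numpy as np
from scipy.ndimage import binary_dilation

def grow_hemispheres(brain_mask, left_seed, right_seed, iterations=200):
    left, right = left_seed.copy(), right_seed.copy()
    for _ in range(iterations):
        # Each hemisphere may only dilate into brain voxels not owned by the other.
        left = binary_dilation(left, mask=brain_mask & ~right)
        right = binary_dilation(right, mask=brain_mask & ~left)
        if not (brain_mask & ~left & ~right).any():
            break   # every brain voxel has been assigned to a hemisphere
    return left, right
```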
This document is a doctoral thesis submitted by Manuela P. Feilner to the Department of Microtechnology at EPFL in 2002. The thesis proposes using statistical wavelet analysis methods for functional magnetic resonance imaging (fMRI) of the brain. Chapter 1 introduces the motivation and contributions of the thesis. Chapter 2 provides background on fMRI and image acquisition techniques. Subsequent chapters develop statistical analysis methods using wavelet transforms and apply them to analyze real fMRI data to identify brain activation patterns. The goal is to improve detection of activated regions compared to existing real-space methods.
This document is a master's thesis that aims to detect sensor failures and malfunctions in cameras. It presents methods to assess image quality and detect blurred, overexposed, underexposed, and obstructed images that could indicate sensor issues. The thesis covers implementing classifiers like support vector machines to detect blur and exploring techniques like thresholding and histograms to detect exposure range problems. It also examines detecting issues like blooming effects and power failures or obstructions that could impact the camera sensor. The goal is to continuously monitor image quality to identify sensor malfunctions in real-time.
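Two of the simpler checks mentioned (blur and exposure range) can be sketched as follows; the variance-of-Laplacian measure and histogram fractions are standard heuristics, and the thresholds here are assumptions rather than the thesis's settings.

```python
# Minimal sketch of two image-quality checks: blur via the variance of the
# Laplacian and exposure via the fraction of near-black / near-white pixels.
# Threshold values are illustrative assumptions.
import cv2
import numpy as np

def quality_flags(gray, blur_thresh=100.0, clip_frac=0.3):
    """gray: uint8 grayscale image."""
    sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()    # low variance -> blurred
    hist = np.bincount(gray.ravel(), minlength=256) / gray.size
    return {
        "blurred": sharpness < blur_thresh,
        "underexposed": hist[:10].sum() > clip_frac,     # too many dark pixels
        "overexposed": hist[245:].sum() > clip_frac,     # too many bright pixels
    }

# Usage: flags = quality_flags(cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE))
```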
Machine learning solutions for transportation networks - butest
This dissertation proposes machine learning solutions for problems in transportation networks. It contains four main contributions:
1. A probabilistic graphical model called a Gaussian Tree Model that describes multivariate traffic patterns using fewer parameters than standard models. This allows learning from less data.
2. A dynamic probabilistic model of traffic flow inspired by macroscopic flow models. It handles uncertainty and incorporates observations using a particle filter for prediction (a generic particle-filter sketch follows this list).
3. Two new optimization algorithms for vehicle routing that use the traffic flow model for routing in volatile environments.
4. A method for detecting traffic accidents using supervised learning that outperforms manual methods. It addresses data biases using dynamic Bayesian networks to improve performance with little labeled data.
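For item 2, a generic bootstrap particle filter step might look like the sketch below; the transition and likelihood functions are placeholders, not the dissertation's traffic-flow model.

```python
# Generic bootstrap particle filter step, sketched to show how observations can be
# incorporated into a dynamic traffic-flow model; transition and likelihood are
# placeholder callables, not the dissertation's model.
import numpy as np

def particle_filter_step(particles, weights, observation, transition, likelihood, rng):
    # 1. Propagate each particle through the (stochastic) flow model.
    particles = np.array([transition(p, rng) for p in particles])
    # 2. Reweight particles by how well they explain the new observation.
    weights = weights * np.array([likelihood(observation, p) for p in particles])
    weights = weights / weights.sum()
    # 3. Resample when the effective sample size collapses.
    if 1.0 / np.sum(weights ** 2) < 0.5 * len(particles):
        idx = rng.choice(len(particles), size=len(particles), p=weights)
        particles = particles[idx]
        weights = np.full(len(particles), 1.0 / len(particles))
    return particles, weights
```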
This thesis presents work on a novel Compass Star Tracker (CST) instrument that can determine its local position on Earth by imaging stars. The document covers:
1) The theoretical development of the local position determination mathematics using star sensor attitude information.
2) Design and fabrication of a proof-of-concept CST, including the imaging sensor, optics, inclinometer, cooling system, and data acquisition hardware.
3) Experimental testing of the CST to validate the position determination concept, including analysis of theoretical and practical error sources.
The CST has potential applications for autonomous position sensing on the Moon and Mars. The work serves to eliminate constraints on the previous CST design and provides a
It is very difficult to come up with a single, consistent notation to cover the wide variety of data, models and algorithms that we discuss. Furthermore, conventions differ between machine learning and statistics, and between different books and papers. Nevertheless, we have tried to be as consistent as possible. Below we summarize most of the notation used in this book, although individual sections may introduce new notation. Note also that the same symbol may have different meanings depending on the context, although we try to avoid this where possible.
This document provides a summary of the NIST gauge block calibration process. It begins with an introduction and preface describing the history of NIST's involvement with gauge block calibration. It then discusses the key aspects of the calibration process including measurement assurance programs, mechanical comparisons using gauge block comparators, and gauge block interferometry. The goal is to completely describe the current NIST gauge block calibration process which has been developed and refined over many years of research.
The document is a thesis submitted by Aaron Croasmun to the Graduate School of the Pennsylvania State University in partial fulfillment of the requirements for a Master of Science degree in Computer Science. The thesis proposes a novel and efficient method for skeletonization of blood vessel networks in medical images. Existing skeletonization methods often require human intervention, prior knowledge of vessel boundaries, or post-processing to extract morphological parameters from the skeletons. The proposed method aims to automatically detect vessel centerlines in an image and represent them as a graph structure to facilitate measurement of parameters like branch length and branching points. Promising results are shown when applying the method to complex retinal vessel networks.
This document provides an overview and outline of a thesis on single person pose recognition and tracking using a single camera. The thesis aims to improve the performance of an interactive spatial game controlled by human poses. Key areas discussed include background subtraction using mixtures of Gaussians, particle filtering for torso tracking, and classifiers for pose recognition. The experimental setup involves video recordings of people in different conditions for testing and training classifiers. The thesis contributes improvements to hand detection and adds a classifier to detect non-poses for better game control.
Eye2Eye Optometrists, A Keratoconus Clinic providing Treatment of Keratoconus with Contact Lenses.
Our Clinic is Offering professional services for the Best Treatment of Keratoconus in Lahore, Pakistan.
Visit: Eye2Eye Optometrists | 13-D Bank Square Market | Beside Servaid Pharmacy | Valencia Main Boulevard | Lahore.
Call or WhatsApp: 0300-4207747
In this thesis, I make a first attempt at a mode choice model with smartphone data when data collection is passive. My research consists in identifying and solving the issues that arise, due to the nature of the data, in order to derive a dataset suitable for mode choice analysis. The key components of the proposed methodology concern the detection of trips and activities and the identification of the trip purpose based on smartphone data; common issues in mode choice modeling, such as the determination of the chosen mode and missing attributes of the unchosen alternative, are addressed as well. The derived dataset is further enriched by complementary datasets including socio-economic and meteorological information.
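One of the preprocessing steps named above, splitting a passive GPS trace into activities (dwells) and trips, could be sketched with classic stay-point detection; the column names and thresholds below are assumptions, not the thesis's pipeline.

```python
# Minimal stay-point detection sketch: periods where the trace stays within a small
# radius for long enough are labelled activities; the gaps between them are trips.
# df is assumed to be a pandas DataFrame with columns t (unix seconds) and x, y
# (metres in a local projection); thresholds are illustrative.
import numpy as np

def detect_stays(df, radius_m=150.0, min_dwell_s=300.0):
    """Return (start_time, end_time) pairs of detected stays (activities)."""
    stays, i = [], 0
    while i < len(df):
        j = i
        # Extend the window while positions remain within radius_m of the anchor.
        while j + 1 < len(df) and np.hypot(df.x.iloc[j + 1] - df.x.iloc[i],
                                           df.y.iloc[j + 1] - df.y.iloc[i]) <= radius_m:
            j += 1
        if df.t.iloc[j] - df.t.iloc[i] >= min_dwell_s:
            stays.append((df.t.iloc[i], df.t.iloc[j]))
        i = j + 1
    return stays
```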
This thesis presents an approach for non-rigid multi-modal object tracking using Gaussian mixture models (GMM). The target is represented by a GMM with each ellipsoid corresponding to a different fragment of the target. A region growing algorithm is used to automatically adapt the fragment set and extract accurate boundaries. Tracking performance is improved by incorporating joint Lucas-Kanade feature tracking to handle large motions. Experimental results demonstrate the effectiveness of the approach on challenging sequences.
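The fragment-based target representation can be illustrated by fitting a Gaussian mixture to the target's foreground pixel coordinates, each component acting as one elliptical fragment; the sketch below omits the region-growing and joint Lucas-Kanade parts, and its parameters are assumptions.

```python
# Minimal sketch of the GMM target representation: foreground pixels are modelled
# by a Gaussian mixture, each component playing the role of one elliptical fragment.
# The mask source and component count are illustrative assumptions.
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_fragments(foreground_mask, n_fragments=5):
    ys, xs = np.nonzero(foreground_mask)            # pixel coordinates of the target
    coords = np.column_stack([xs, ys]).astype(float)
    gmm = GaussianMixture(n_components=n_fragments, covariance_type="full").fit(coords)
    # Each (mean, covariance) pair defines one elliptical fragment of the target.
    return gmm.means_, gmm.covariances_, gmm.weights_
```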
This document is a thesis submitted by Mark Kenneth Quinn to the University of Manchester for the degree of Doctor of Philosophy. It investigates shock diffraction phenomena and their measurement through a combination of experimental and numerical techniques. The thesis contains literature on shock waves, shock tubes, shock diffraction, shear layers, and numerical simulations. It then describes the experimental techniques of schlieren, particle image velocimetry (PIV), pressure measurements, and pressure-sensitive paint (PSP) used to study the phenomena. The apparatus and simulation setup are also outlined. Results are then presented and discussed for shock diffraction around sharp and round geometries based on density and particle-based measurements for a range of Mach numbers.
This document summarizes a student project on predicting malicious activity using real-time video surveillance. The project applies techniques like super-resolution, face and object recognition using HOG features, and neural networks to enhance video quality, identify objects and faces, and semantically describe scenes to detect unusual activity. Algorithms were implemented in MATLAB and results were stored in a MongoDB database. Key techniques included super-resolution, PCA-based face recognition, HOG-based object detection, and neural networks like CNNs and RNNs for image captioning. The project aims to help detect criminal activity and track convicted individuals in public spaces.
This document is a semester thesis submitted by Jesús Ignacio Maldonado Covarrubias to ETH Zurich in June 2011. The thesis investigates the dimensioning of an access panel for the fixed leading edge of a commercial aircraft. The objectives are to analyze an initial access panel design using finite element analysis, evaluate it against criteria such as strength and stability, and perform optimizations to reduce the panel's mass. The work is broken down into tasks such as creating CAD and FE models, analyzing different stiffener configurations, and conducting a design improvement study. The document outlines the problem definition, reviews relevant literature, describes the analytical and FE modeling approaches, and presents the results and conclusions of the study.
This document is a thesis on statistical approaches to solving the inverse problem in scatterometry. Scatterometry is used to characterize nanostructures by measuring diffraction patterns and reconstructing critical dimensions. The thesis covers:
1) Using maximum likelihood estimation and least squares to reconstruct critical dimensions and estimate measurement variances from simulated and measured data (a generic least-squares sketch follows this list).
2) Investigating the effects of systematic errors like line roughness and multilayer variations on diffraction patterns.
3) Including these systematic errors in the reconstruction model and showing it reduces estimated variances and improves consistency with other measurement methods.
4) Outlining a Bayesian approach that incorporates prior knowledge to further reduce uncertainties in the reconstructed critical dimensions.
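The reconstruction in point 1) can be illustrated, in generic form, with a nonlinear least-squares fit; the forward_model below is a placeholder for the thesis's rigorous diffraction solver, and the covariance estimate is a standard Jacobian-based approximation rather than the thesis's uncertainty analysis.

```python
# Minimal sketch of least-squares reconstruction of critical dimensions: fit the
# parameters of a forward diffraction model to measured efficiencies. forward_model
# is a placeholder assumption for the rigorous electromagnetic solver.
import numpy as np
from scipy.optimize import least_squares

def reconstruct(measured, forward_model, theta0):
    residuals = lambda theta: forward_model(theta) - measured
    fit = least_squares(residuals, theta0)
    # Rough parameter covariance from the Jacobian at the optimum.
    dof = max(len(measured) - len(theta0), 1)
    sigma2 = np.sum(fit.fun ** 2) / dof
    cov = sigma2 * np.linalg.inv(fit.jac.T @ fit.jac)
    return fit.x, cov
```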
Im-ception - An exploration into facial PAD through the use of fine tuning de... - Cooper Wakefield
This document is a thesis submitted by Cooper Wakefield to the University of Queensland for the degree of Bachelor of Engineering. The thesis proposes developing a presentation attack detection (PAD) system through fine tuning a deep convolutional neural network. It aims to leverage pre-trained networks and fine tune the upper layers to differentiate between real and fake facial images with a high degree of accuracy. The thesis outlines the problem of presentation attacks on facial recognition systems, reviews prior approaches to PAD, and describes the proposed solution of using transfer learning on a CNN to classify images as real or fake.
The Trimble M3 total station offers reliable mechanical technology combined with powerful Trimble Access field software. It features a compact, lightweight design and provides long battery life of up to 26 hours. The Trimble Access software includes modules for topographic surveys, staking, control, and road design capabilities like importing alignments and guiding offset staking.
This document is a project report submitted by Ramashish Baranwal and Ripinder Singh for the degree of Bachelor of Technology. It outlines their work on developing a content-based image retrieval system called Imagefinder. The system segments images into homogeneous regions, extracts visual features like color and shape, and indexes the features using a C-tree for efficient retrieval of similar images based on user queries. Experimental results demonstrate the image segmentation and retrieval capabilities of the system. The report also discusses potential improvements like incorporating relevance feedback to further refine search results.
This document appears to be the introduction and first few sections of a seminar paper on triangulation methods. It introduces basic concepts of epipolar geometry including the epipolar plane, epipoles, fundamental matrix, and essential matrix. It discusses how these mathematical concepts describe the relationships between 3D points and their 2D projections in multiple images. The document provides foundations for later sections that will describe methods for reconstructing 3D points from stereo image pairs using triangulation.
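A minimal version of the pipeline those later sections describe, estimating the fundamental matrix from correspondences and triangulating points from a calibrated stereo pair, might be sketched with OpenCV as follows; the projection matrices and variable names are assumptions.

```python
# Minimal sketch: estimate the fundamental matrix from matched points and
# triangulate 3D points from a calibrated stereo pair. P1, P2 are assumed known
# (e.g. from calibration); this is an illustration, not the paper's derivation.
import cv2
import numpy as np

def triangulate(pts1, pts2, P1, P2):
    """pts1, pts2: Nx2 float arrays of matched image points; P1, P2: 3x4 projections."""
    F, inlier_mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC)
    X_h = cv2.triangulatePoints(P1, P2, pts1.T.astype(float), pts2.T.astype(float))
    X = (X_h[:3] / X_h[3]).T                     # Nx3 Euclidean points
    return F, X
```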
This document is a textbook on statistics for economists that covers topics such as descriptive statistics, probability, random variables, probability distributions, sampling, and statistical inference. It includes chapters on data types and graphical displays, basic probability concepts, discrete and continuous random variables, the normal and other common probability distributions, expectation, moments, joint and conditional probability distributions, sampling and sampling distributions, point estimation, and estimating population proportions. Each chapter provides examples and exercises.
Mrs. Davis introduces the concept of turning a shape into a form through value and contrast by shading and rendering an object, ultimately turning the real lines into implied lines to create realism.
In terms of storage systems there are different possibilities, depending on the characteristics of the materials to be stored, the available space (which now becomes a constraint) and the required throughput (service level) of the warehouse or distribution centre (CEDI).
The document discusses the human responsibility to care for the environment and the need for environmental awareness beyond economic growth. It also notes that although some people do not mind harming the environment because they will die before the planet is destroyed, future generations must be considered.
This document summarizes the key concepts of teaching for understanding. It explains that understanding involves being able to perform various tasks that demonstrate and deepen one's grasp of a topic. It describes Howard Gardner's multiple intelligences and argues that students learn in different ways. Finally, it introduces teaching for understanding as an educational approach that presents topics in diverse ways to take advantage of each student's abilities.
The document describes different types of communication on the Internet. Communication can be passive or active. Passive communication involves publishing content and responding to comments, whereas active communication also includes monitoring and intervention. Communication can be synchronous or asynchronous. Synchronous communication takes place in real time between people via computers, whereas asynchronous communication allows non-simultaneous exchanges such as e-mail.
Elizabeth "Biz" Nelson is currently the Head of Theatre Design at Trinity Valley Community College in Athens, TX. She received her MFA in Scenography from the University of Kansas and a BA in Theatre from Northwestern College. Nelson has over 10 years of experience in scenic and costume design, stage management, and technical theatre for universities and theaters across the United States. She regularly designs and works on productions while mentoring students in design.
This document is about a blog and how to insert images and text into it. It briefly mentions the phrases "Mi blog", "Imágenes", "Textos" and "Como insertar", indicating that it deals with how to add these elements to one's own blog.
The Academic Council of the Universidad del Quindío modified the graduation ceremony calendar for 2017, establishing six graduation dates in February, March, May, August, October and December. The new calendar includes deadlines for each stage of the graduation process, such as submission of applications, issuance of clearance certificates and sending of documents.
This document summarizes Seth Nagelberg's academic record at Ryerson University, where he majored in Computer Science. It shows his coursework and grades from 2011 to 2016, maintaining a high cumulative GPA of 4.2. He consistently earned high grades, often on the Dean's List, and remained in clear academic standing.
Jomar Duyao is a site engineer with 11 years of experience in construction and engineering projects in the UAE and Philippines. He is seeking a site engineer position and has experience supervising precast installation, overseeing quality control, and managing projects from planning through handover. His background includes roles as a site engineer, QA/QC engineer, and technician for various companies such as Dubai Precast, JT Metro JV, and Saudi Telecom.
This document lists various printing, packaging and design services including corrugated boxes, product sticking, stationery, desktop calendars, magazines, leaflets, posters and print ads.
The document lists over 20 IT projects completed by a bank between 2014-2016. It includes projects to automate processes, install new networks and systems, upgrade platforms, implement online and mobile banking, deploy new ATMs, replace workstations, and upgrade servers. Many projects involved working with core vendors to ensure smooth transitions. All projects appear to have been completed on time.
60969_Orsted2003-Morten Høgholm Pedersen-New Digital Techniques in Medical Ul... - Morten Høgholm Pedersen
This document is a PhD thesis on new digital techniques in medical ultrasound scanning. It contains four main sections. The first section provides background on 3D ultrasound imaging, scanning techniques, and visualization methods. The second section describes a clinical trial using 3D ultrasound to stage cervical cancer in patients. Results showed 3D ultrasound was comparable to MRI and histology in evaluating tumor size and invasion. The third section discusses a pre-clinical trial using coded excitation to improve ultrasound image quality. Initial results found coded excitation increased penetration depth and reduced sidelobe artifacts. The fourth section concludes the thesis and discusses perspectives on using these new digital ultrasound techniques clinically.
This thesis proposes a computed tomography (CT) scanning method for logs using feature-tailored voxels. The method reduces the number of unknowns in the reconstruction problem by defining voxels based on the geometry of internal log features. Data is binned, centered and normalized to make the reconstruction tolerant to motion and eliminate unnecessary voxels. Experimental results on phantoms and real logs demonstrate the effectiveness of the approach for lumber processing applications.
This document is Sebastian Fabian's master's thesis which presents a novel method for predicting upcoming road topography using a self-learning geographical and topographical raster map. The system constructs the map from GPS trace data, predicts routes by looking ahead in the map, and updates the map each time an area is driven through. It is designed for use in embedded automotive control systems to enable features like adaptive cruise control. The thesis describes the implementation of the map building, route prediction, and other aspects of the system with considerations for memory and processing constraints. It also evaluates the performance of different design choices and parameters through testing on sample route data.
This doctoral thesis examines methods for estimating the authenticity of videos by analyzing their visual quality and structure. It aims to determine the proportion of information an edited video retains from its original parent video. The thesis first evaluates existing no-reference algorithms for visual quality assessment. It then explores techniques for shot segmentation and comparison. It also develops models for calculating a video's authenticity degree based on factors like visual quality, shot importance, and evidence of global modifications. The goal is to objectively estimate a video's authenticity when only the video itself is available for analysis, without relying on external metadata.
LATENT FINGERPRINT MATCHING USING AUTOMATED FINGERPRINT IDENTIFICATION SYSTEM - Manish Negi
This document describes a project on latent fingerprint matching using an automated fingerprint identification system. The project aims to develop an algorithm for latent fingerprint matching that uses minutiae and orientation field information. The algorithm performs fingerprint enhancement techniques like binarization and thinning. It then extracts minutiae features and calculates the orientation field. The minutiae features and orientation field are used to match latent fingerprints to those in a database. The algorithm is implemented in MATLAB with a GUI. Test results on the FVC2002 database show that using both minutiae and orientation field matching provides better performance than only using minutiae.
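The enhancement and minutiae-extraction steps can be sketched generically with scikit-image: Otsu binarization, thinning, and a simplified neighbour-count test on the skeleton. This is an illustration only, not the project's MATLAB implementation, and the thresholding assumption (ridges darker than valleys) may not match every dataset.

```python
# Minimal sketch: binarize, thin, and extract candidate minutiae from a fingerprint
# image with a simplified neighbour-count test on the skeleton
# (1 neighbour -> ridge ending, 3 or more -> candidate bifurcation).
import numpy as np
from skimage.filters import threshold_otsu
from skimage.morphology import skeletonize

def extract_minutiae(gray):
    ridges = gray < threshold_otsu(gray)          # assumes ridges darker than valleys
    skel = skeletonize(ridges).astype(np.uint8)
    endings, bifurcations = [], []
    for y in range(1, skel.shape[0] - 1):
        for x in range(1, skel.shape[1] - 1):
            if not skel[y, x]:
                continue
            n = skel[y - 1:y + 2, x - 1:x + 2].sum() - 1   # 8-neighbour count
            if n == 1:
                endings.append((x, y))
            elif n >= 3:
                bifurcations.append((x, y))
    return endings, bifurcations
```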
This document describes a project that aims to estimate full-body demographics from images using computer vision and machine learning techniques. The project proposes a novel method to automatically annotate images with categorical labels for a wide range of body features, like height, leg length, and shoulder width. The method explores using common computer vision algorithms to extract features from images and video frames and compare them to a database of subjects with labeled body features. The document outlines the requirements, approaches considered, design and implementation of the project, and evaluates the results in estimating demographics and identifying individuals.
This document is a research paper written by Craig Ferguson at the University of Cape Town that presents a high performance traffic sign detection technique for use in low power systems or high speed vehicles. The paper introduces the problem of traffic sign detection in vehicles and outlines the objectives and structure of the research. It then reviews existing literature on topics like preprocessing, detection, classification, training and testing. The paper goes on to describe the proposed method, which uses RGB thresholding for segmentation and tracks signs across frames to allow for a voting scheme. It presents results showing the method performs detection at 13ms per frame and achieves 83% detection efficiency, significantly outperforming a cascade classifier detector. The technique is constrained to midday lighting but provides a proof
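The RGB-thresholding segmentation step might be sketched as below, flagging pixels that are strongly red relative to the other channels; the ratio thresholds are assumptions, not the values tuned in the paper.

```python
# Minimal sketch of RGB-threshold segmentation for red traffic signs: a pixel is a
# candidate sign pixel when its red channel dominates the other channels.
# Threshold values are illustrative assumptions.
import numpy as np

def red_sign_mask(rgb, ratio=1.5, min_red=60):
    r = rgb[..., 0].astype(float)
    g = rgb[..., 1].astype(float)
    b = rgb[..., 2].astype(float)
    return (r > min_red) & (r > ratio * g) & (r > ratio * b)
```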
Virtual Environments as Driving Schools for Deep Learning Vision-Based Sensor... - Artur Filipowicz
At the turn of the 20th century, inventors and industrialists alike strived to enable every person to own and drive a car. Over time, automobile ownership grew to meet that vision. One hundred years later, automobile manufacturers and technology companies are working on self-driving cars which would be neither owned nor driven by individuals. The benefits of replacing cars with fully autonomous vehicles are enormous. While it is difficult to put a value on lives saved, injuries avoided, pollution reduced, and commute time repurposed, economic savings from this technology are estimated to be on the order of trillions of dollars. The main roadblock in achieving the vision for this century is developing technology which would enable autonomous vehicles to perceive and understand the environment as well as, if not better than, human drivers. Perception is a roadblock because presently no algorithm is capable of reaching human levels of cognition.
This thesis explores the interaction between virtual reality simulation and Deep Learning which may develop computer vision that rivals human vision. The specific problem considered is detection and localization of a stop object, the stop sign, based on an image. A video game, Grand Theft Auto 5, is used to collect over half a million images and corresponding ground truth labels with and without stop signs in various lighting and weather conditions. A deep convolutional neural network trained on this data and fine tuned on real world data achieves accuracy in stop sign detection of over 95% within 20 meters of the stop sign and has a false positive rate of 4% on test data from the real world. Additionally, the physical constraints on this problem are analysed, a framework for the use of simulators is developed, and domain adaptation and multi-task learning are explored.
This document summarizes Philip Engström's master's thesis project on interactive GPU-based volume rendering. The project investigated two approaches - one based on textured slices of proxy geometry and one based on ray casting. It was found that the ray casting implementation provided far superior image quality. Most of the project work focused on improving ray casting performance through an empty space skipping method using a complex bounding geometry. The report provides background on volume rendering and GPU technology to make the project accessible to readers with basic computer graphics knowledge.
This document provides information about analytical chemistry concepts and terminology. It begins with an introduction to units of measurement and expressions of concentration commonly used in analytical chemistry. It then discusses the basic equipment and techniques used to measure mass and volume, prepare standard solutions, and record experimental work in a laboratory notebook. The document emphasizes the importance of careful measurements and calculations in analytical chemistry. It aims to establish a foundation of terminology, concepts, and procedures that are fundamental to quantitative chemical analysis.
This document provides an introduction to the course GS400.02 Introduction to Photogrammetry. It discusses photogrammetry as an engineering discipline influenced by developments in computer science and electronics. The document notes that there is typically a gap between research findings, product development, and implementation in practice. It uses analytical plotters as an example, noting they were invented in the late 1950s but not widely manufactured and used until around 20 years later. The course will provide an overview of photogrammetry concepts and principles with an emphasis on understanding rather than detailed operational knowledge.
The document is a project report submitted by Ajay Vishwas Jadhav to the Centre for Modeling and Simulation at Savitribai Phule Pune University. The report describes Jadhav's work on modeling and optimization of rheological data during his M.Tech program from January to June 2015. The project involved fitting experimental rheological data to relaxation spectra models using nonlinear regression techniques like the Marquardt-Levenberg algorithm and genetic algorithms. The report includes analysis of model and experimental data as well as details of the algorithms used.
The document describes an image processing methodology to detect the nematode C. elegans in microscope images. It aims to automate the identification of individual worms, which is currently done manually but is too labor-intensive. The methodology segments worms from the background, detects endpoints, generates shape descriptors, and performs profile-driven shape fitting to identify worms. It was implemented as a plug-in for the open-source image analysis software Endrov and aims to improve upon previous automated methods by achieving a higher matching accuracy.
Efficiency Optimization of Realtime GPU Raytracing in Modeling of Car2Car Com... - Alexander Zhdanov
This master's thesis investigates efficiency optimization techniques for real-time GPU raytracing used in modeling car-to-car communication systems. Specifically, it aims to improve the simulation of the propagation channel through ray reordering and caching. The research analyzes existing caching schemes exploiting frame coherence, GPU data structures, and ray reordering techniques. It proposes algorithms for ray sorting on the CPU and caching tracing data. The thesis then implements and evaluates the proposed methods, analyzing system performance for static and dynamic scenes. Testing shows ray reordering significantly increases efficiency, though caching provides varying benefits depending on the scheme used.
This document is a master's thesis submitted by R.Q. Vlasveld to Utrecht University in partial fulfillment of the requirements for a Master of Science degree. The thesis explores using one-class support vector machines (SVMs) for temporal segmentation of human activity time series data recorded by inertial sensors in smartphones. The author first reviews related work in temporal segmentation and change detection methods. An algorithm is then presented that uses an incremental SVDD model to detect changes between activities in a continuous data stream. The algorithm is tested on both artificial and real-world human activity data sets recorded by the author. Quantitative and qualitative results demonstrate the method can find changes between activities in an unknown environment.
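The core detection idea can be sketched with scikit-learn's one-class SVM: fit a boundary to a reference window of features and flag subsequent windows whose outlier fraction jumps. Window size, nu and gamma are assumptions, and the thesis's incremental SVDD is replaced here by simply refitting per window.

```python
# Minimal sketch: one-class SVM change detection over sliding windows of activity
# features; a high outlier fraction in the next window suggests a change of activity.
# Parameters are illustrative assumptions, not the thesis's incremental SVDD.
import numpy as np
from sklearn.svm import OneClassSVM

def change_scores(features, window=50, nu=0.1, gamma="scale"):
    """features: (n_samples, n_dims) array of per-sample sensor features."""
    scores = []
    for start in range(0, len(features) - 2 * window, window):
        ref = features[start:start + window]
        nxt = features[start + window:start + 2 * window]
        model = OneClassSVM(nu=nu, gamma=gamma).fit(ref)
        outlier_frac = np.mean(model.predict(nxt) == -1)   # -1 marks outliers
        scores.append((start + window, outlier_frac))      # high fraction -> change
    return scores
```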
This document provides an overview and summary of a thesis on visualizing uncertainty in fiber tracking based on diffusion tensor imaging (DTI). The thesis addresses challenges with visualizing uncertainty throughout the DTI and fiber tracking pipeline, including image acquisition, diffusion modeling, fiber tracking, and visualization. It proposes and evaluates various techniques for visualizing different types of uncertainty, such as value uncertainty, location uncertainty, and parameter uncertainty. The visualization techniques are applied to fiber tracking results to aid in neurosurgical planning and other medical applications.
The document discusses support architecture for high-level synthesis of algorithms that use pointers. It first introduces high-level synthesis and its typical steps of compilation, allocation, scheduling, binding and generation. It then presents a case study on OpenCV image processing algorithms that heavily use pointers. The proposed architecture aims to address the memory model problem for such algorithms. It consists of different memory structures like RAM, ring buffer and virtual buffer to support locality and efficient handling of pointers. Exception handling and integration methods are also discussed to map the algorithm to the architecture within the high-level synthesis flow.
This document is a master's thesis submitted by Sascha Nawrot to Berlin University of Applied Sciences in partial fulfillment of the requirements for a Master of Science degree in Applied Computer Science. The thesis introduces novel, lightweight open source annotation tools for whole slide images that enable deep learning experts and pathology experts to cooperate in creating training samples by annotating regions of interest in whole slide images, regardless of platform or format, in a fast and easy manner. The tools consist of a conversion service to convert whole slide images to an open format, an annotation service for annotating regions of interest, and a tessellation service to extract the annotated regions from the images.
Ellum, C.M. (2001). The development of a backpack mobile mapping systemCameron Ellum
The document summarizes Cameron MacKenzie Ellum's 2001 master's thesis on developing a backpack mobile mapping system. The system integrated a GPS receiver, digital magnetic compass, inclinometer, and consumer digital camera. Testing of the prototype system achieved horizontal and vertical accuracies of 0.2 meters and 0.3 meters respectively. Ellum derived new techniques for including navigational data from sensors like the compass and inclinometer in a photogrammetric bundle adjustment. The thesis details the design, implementation, and testing of the prototype backpack mobile mapping system.
Similar to Venturini - X ray imaging for cheese (20)
A SYSTEMATIC RISK ASSESSMENT APPROACH FOR SECURING THE SMART IRRIGATION SYSTEMSIJNSA Journal
The smart irrigation system represents an innovative approach to optimize water usage in agricultural and landscaping practices. The integration of cutting-edge technologies, including sensors, actuators, and data analysis, empowers this system to provide accurate monitoring and control of irrigation processes by leveraging real-time environmental conditions. The main objective of a smart irrigation system is to optimize water efficiency, minimize expenses, and foster the adoption of sustainable water management methods. This paper conducts a systematic risk assessment by exploring the key components/assets and their functionalities in the smart irrigation system. The crucial role of sensors in gathering data on soil moisture, weather patterns, and plant well-being is emphasized in this system. These sensors enable intelligent decision-making in irrigation scheduling and water distribution, leading to enhanced water efficiency and sustainable water management practices. Actuators enable automated control of irrigation devices, ensuring precise and targeted water delivery to plants. Additionally, the paper addresses the potential threat and vulnerabilities associated with smart irrigation systems. It discusses limitations of the system, such as power constraints and computational capabilities, and calculates the potential security risks. The paper suggests possible risk treatment methods for effective secure system operation. In conclusion, the paper emphasizes the significant benefits of implementing smart irrigation systems, including improved water conservation, increased crop yield, and reduced environmental impact. Additionally, based on the security analysis conducted, the paper recommends the implementation of countermeasures and security approaches to address vulnerabilities and ensure the integrity and reliability of the system. By incorporating these measures, smart irrigation technology can revolutionize water management practices in agriculture, promoting sustainability, resource efficiency, and safeguarding against potential security threats.
6th International Conference on Machine Learning & Applications (CMLA 2024)ClaraZara1
6th International Conference on Machine Learning & Applications (CMLA 2024) will provide an excellent international forum for sharing knowledge and results in theory, methodology and applications of on Machine Learning & Applications.
We have compiled the most important slides from each speaker's presentation. This year’s compilation, available for free, captures the key insights and contributions shared during the DfMAy 2024 conference.
International Conference on NLP, Artificial Intelligence, Machine Learning an...gerogepatton
International Conference on NLP, Artificial Intelligence, Machine Learning and Applications (NLAIM 2024) offers a premier global platform for exchanging insights and findings in the theory, methodology, and applications of NLP, Artificial Intelligence, Machine Learning, and their applications. The conference seeks substantial contributions across all key domains of NLP, Artificial Intelligence, Machine Learning, and their practical applications, aiming to foster both theoretical advancements and real-world implementations. With a focus on facilitating collaboration between researchers and practitioners from academia and industry, the conference serves as a nexus for sharing the latest developments in the field.
Introduction- e - waste – definition - sources of e-waste– hazardous substances in e-waste - effects of e-waste on environment and human health- need for e-waste management– e-waste handling rules - waste minimization techniques for managing e-waste – recycling of e-waste - disposal treatment methods of e- waste – mechanism of extraction of precious metal from leaching solution-global Scenario of E-waste – E-waste in India- case studies.
X-ray imaging for measuring holes in cheese
by
Lorenzo Venturini (Queens’ College)
Fourth-year undergraduate project in
Group F, 2015/2016
Supervised by Dr. Joan Lasenby
I hereby declare that, except where specifically indicated, the work
submitted herein is my own original work.
Signed: Date:
Technical abstract
X-ray imaging for measuring holes in cheese
Lorenzo Venturini, Queens’ College, Group F, 2015/2016
Swiss-type cheeses like Emmental and Leerdammer are characterised by the presence
of large and prominent holes, which give these cheeses their characteristic look and flavour.
As consumers expect a certain size and distribution of holes in their Swiss cheese, ensur-
ing that these characteristics meet their standards has become economically important.
Quality control of these cheeses, to ensure that they have the desired characteristics, is
thus an important step. Traditionally, this has been performed by a technique known as
core sampling, an inherently destructive method which cuts through and visually exam-
ines holes along a column cut through the cheese. Not only does this damage the cheese,
it also shears the holes along the column, distorting them and providing an inaccurate
sample of the internal structure. It is also limited in scope, providing data only on those
holes which happen to lie along the column cut.
This project attempts to develop a new, nondestructive technique to obtain a better
picture of the size and distribution of the holes in this cheese. Specifically, the aim of the
project is to be able to reconstruct a 3-dimensional map of the cheese from the minimum
number of X-ray projections of the cheese, obtaining an accurate and economically com-
petitive quality-control technique. This is to be designed for commercial applications, and
therefore the cost and competitiveness of the techniques are more important than perfect
accuracy.
It was attempted to obtain this 3D reconstruction from just two 2-dimensional X-ray projections taken from different angles. This is insufficient to obtain an accurate 3D reconstruction
using traditional techniques (such as those used for medical imaging), and therefore a
custom method was designed to reconstruct the structure by incorporating the known
prior information about the characteristics of the holes in cheese. Individual holes were
found in each 2D projection, and then simple geometric techniques were used to match
holes in different projections to each other and reconstruct their 3D positions and lengths.
To obtain a ground truth to compare reconstructions to, a proxy cheese, with similar
statistics to real Emmental cheese, was designed using MATLAB and parametric CAD
and then manufactured by 3D printing. X-ray projections of this were then taken and
analysed to obtain reconstructions.
An objective function was devised and an optimisation technique to minimise it was
developed by matching ellipsoidal kernels to individual 2D projections. The kernels were
initialised at uniform spacing and they were translated and stretched until the rms error
of the reconstructed image using them was minimised. This was done in two different 2D
projections, taken at a relative angle of 90º to each other, and the set of kernel locations,
brightnesses and eccentricities was compared between the two projections. A similarity
score was calculated between each pair of converged kernels, and the most similar kernels
in the two projections were matched to each other.
This matching, plus knowledge of the geometry of the system, provided enough infor-
mation to obtain a full three-dimensional reconstruction of the structure of the cheese.
The centres of the holes were simply found via geometric principles from their positions
in each view. The length along each axis was similarly obtained, using the assumption of
orthographic projection to make recovery of the lengths easier and less prone to noise.
The locations of the centres of the holes which were correctly reconstructed in this way
were found with good accuracy, with an rms error on the position of the centre of 4.5mm.
Reconstruction of the lengths and volumes of the holes was somewhat less accurate, with
average deviations of 12% and 19%, respectively. This is nonetheless far better accuracy
than offered by core sampling methods.
The execution time of the method on high-resolution X-ray images was found to be very
slow, running on the order of an hour on a standard PC for a single projection. This can
be improved by subsampling high-resolution images, but the optimisation method devised
is inefficient and has scope to be made faster. Using a method with a variable step size or
preloading the reconstruction to compute rms error could substantially improve the speed
of the technique. Another relatively simple change that could be made to improve the
accuracy of this technique is to change the optimisation technique to correctly fit holes
at the edges of the cheese, or holes with a substantial amount of overlap. It should also
be possible to automatically detect the edges of the cheese in the image and make the
process invariant to translation.
Overall, it was deemed that the technique developed offers better performance than
traditional core sampling. It is rapid (with only two X-ray projections), cost-effective,
nondestructive, and gives a reasonably accurate image of the location and sizes of the
holes within the cheese. Additional steps were proposed to improve the accuracy and
efficiency of the algorithm outlined, and possible future applications for this technique
were also specified.
Acknowledgments
I would like to thank my supervisor, Dr. Joan Lasenby, for all the guidance and support
she has given me throughout the project. She has followed my progress throughout the
year and her advice on how to proceed has proved invaluable.
I would also like to express my gratitude to Cheyney Design for providing the idea
for this project and allowing me to use their X-ray equipment. In particular, I would
like to thank Richard Parmee for being my main point of contact and providing valuable
resources; Jonathan Cameron for teaching me to use the X-ray machine and providing
me with X-ray images; and Patrick Roux for arranging the delivery of real cheese.
Mifroma provided the cheese that was used in this project and their help has made it
possible to test the methods developed here in the real world.
I would also like to thank Prof. Jim Haseloff of the Cambridge University Department
of Plant Sciences for allowing me to use his lab’s 3D printer for my unusual and time-
consuming cheese print, as well as Mihails Delmans in his lab for printing the proxy.
Finally, I would like to thank Konstantinos Kyriakopoulos, who gave me advice with
my code and helped me greatly speed up my implementation, as well as being an excellent
friend.
1 Introduction
1.1 Motivation
Cheese with holes, such as Emmental or Leerdammer, is a very popular and tasty addition
to the dinner table. Production of Emmental cheese in Switzerland and France alone
exceeds 270000 tons a year [1]. One of the distinguishing features of these cheeses is the
presence of large and prominent holes (or “eyes”), caused by colonies of bacteria which
consume the cheese during maturation and produce carbon dioxide [2]. As the density
and size of holes changes the chemical makeup and flavour of the cheeses, manufacturers
strive to maintain a consistent size and distribution of holes in their product.
With such a popular and profitable product, it is surprising that the techniques used
to find the holes within samples of cheese and conduct quality control have remained
the same for centuries. In fact, the traditional technique to inspect holes, known as core
sampling, simply involves extracting a cylindrical sample from a wheel of cheese using a
tool known as a cheese trier [3], and manually inspecting the holes present along it.
While this can provide a rough picture of the characteristics of the holes along that
sample, it has some clear limitations. This technique is semi-destructive, lowering the
value of the wheel of cheese. Furthermore, it only gives information about the holes on
the sample - it will not, for example, be able to find any asymmetries in the distribution
of holes in the cheese.
1.2 Problem definition
The aim of this project is to design a nondestructive technique to faithfully find the 3D
positions and other characteristics of the holes within a cheese. This should be at least
as accurate as existing (destructive) techniques, as well as cost-effective. X-ray imaging
was chosen as it is a relatively cheap and established technology that can provide a good-
quality picture of the holes within the cheese.
X-ray imaging is not the only method which can be thought of to provide a three-
dimensional map of the cheese: several other common imaging techniques, such as ul-
trasound or MRI, could conceivably be used instead. X-ray was chosen due to the low
cost, high resolution and low image acquisition time of the technique, which makes it an
easy option for this project. MRI was discarded due to its high cost relative to X-ray
imaging and high acquisition time. Ultrasound, though cheap, requires a gel to be spread
over the cheese first to diminish reflection of the ultrasound signal. This was deemed
to be unpalatable to consumers, potentially lowering the value, and therefore X-ray was
preferred.
To produce a three-dimensional image, multiple two-dimensional projections must
be taken and combined; one part of this project is focused on obtaining a good three-dimensional image, with all desired information, from 2D projections (see Figure 1).
Figure 1: Two X-ray images of a cheese proxy, imaged at a relative angle of ∼25º. The relative change in position of the holes between the projections suggests that it is possible to extract 3D information from them.
This project is aimed primarily at commercial applications, trying to find a technique
that is more cost-effective and accurate than current methods. This means that any
method developed does not need to be perfectly accurate: it just needs to be able to
provide a better picture than established techniques, with only a small error. It also
means that low cost is a priority: as part of this tradeoff, it is desirable to take as few
2D projections as possible. The aim therefore is to take as few as two projections, while
maintaining an appropriate level of fidelity.
Cheyney Design, a Cambridge-based company specialised in X-ray imaging, provided
technical advice as well as X-ray equipment to help with this project; Mifroma, a Swiss
cheese manufacturer, provided cheese samples.
1.3 Structure of report
This report is split into seven sections, corresponding to the different aspects of the
problem that were examined.
Section 2 provides background to the problem, giving a review of the literature tackling
similar problems and a technical background to the methods and techniques used in the
project. Section 3 describes the work that was done to prepare images for analysis,
including design of a 3D-printed proxy, image preprocessing, and the performance of
common techniques. A custom method to extract holes in individual 2D projections is
outlined in section 4, describing the process, structure of the kernels and components of
the cost function. Two projections are then combined to obtain a single 3D reconstruction,
and the technique used to find that is described in section 5. Section 6 gives an overview
of the performance and efficiency of each of the stages implemented in this report, with a
quantitative analysis of the quality of the results and suggestions to improve the method.
Conclusions and reflections on future directions and possible applications are given in
section 7.
2 Theory and design of the experiment
2.1 Literature review
There are many examples in the literature of X-ray imaging being used for quality control
of foods, including cheese. Brosnan and Sun (2004) [4] offer a review of computer vision
applications in food inspection, including several applications for X-rays. These include
using X-ray imaging to detect bones in chicken and fish [5] and to find split pits in peaches
[6].
These techniques are well-established, and several companies exist which offer X-ray
applications for food inspection, including Cheyney Design, which was heavily involved
in this project. However, past efforts have mostly focused on detecting foreign objects
within the cheese, such as fragments of metal or stone.
The only attempt that could be found in the literature to characterise holes using
X-ray imaging techniques was made by Kraggerhud et al. [7]. They sought to find the
positions and dimensions of holes in cheese using an X-ray scan, and therefore provide a
statistical indication of the quality of the cheese. However, their attempts are limited to
a single 2-dimensional projection - they did not attempt to build a 3D reconstruction of
the cheese. Additionally, the technique they developed is only valid on spherical holes;
in reality, however, the holes are ellipsoids with substantial eccentricity. It was therefore
deemed that this project would make a novel contribution to the state of the art.
Other attempts exist in the literature to extract holes from photographs of cheese [8],
ultrasound [9] and even MRI [10]. However, due to the different imaging equipment having
different capabilities and limitations, the techniques developed with these methods
are not directly applicable to X-ray imaging.
2.2 Cheese specifications
The cheese used for the purposes of this project was a sample of Emmental cheese provided
by Mifroma, a Swiss cheesemaker.
It was aged for 3 months under commercial conditions and X-rayed at Cheyney Design.
This sample has dimensions of 345mm × 265mm × 55mm, though only a section of it
(210mm × 100mm × 55mm) was examined by X-ray, for practical purposes. The techniques
developed however should, in principle, be applicable to cheeses of any size.
Figure 2: Left, a projection of a block of cheese imaged with visible light. Right, a
projection of a larger block imaged with X-rays.
2.3 X-ray imaging principles
Most materials in everyday life (including cheese) are opaque to visible light. This means
that when they are struck by photons of visible light, they will tend to absorb all, or nearly all, of those photons near their surface. This is why holding a block of cheese in
front of a light is not in itself sufficient to identify whether there are any holes within
it - the material’s attenuation coefficient to visible light is simply too high to let any
appreciable number of photons through its thickness, leading to a uniformly dark image
(see Figure 2, left).
Clearly this is not a viable technique to locate holes inside a cheese. However, while
cheese is opaque to visible light, it is much more transparent to other frequencies of the
electromagnetic spectrum.
X-rays are the region of the light spectrum with wavelength 0.01-10 nm (or energies
∼100 eV - 100 keV per photon). This is a far higher energy range than visible light,
and crucially most biological materials (including cheese) are much more transparent to
this range of frequencies than to visible light. This leads to well-known applications in
medicine, where the relative transparencies of different types of tissue can allow for a
photographic image of the patient to be constructed by shining X-ray radiation through
them. This can also be used in cheese, which is mostly (but not entirely) transparent to
X-ray radiation, allowing an image of the holes present within it to be obtained (Figure
2, right).
The attenuation of radiation through a material is given by the Beer-Lambert law [11],
n/n0 = exp(−µx),
where n is the number of photons exiting the material, n0 is the number of photons entering the material, µ is the material's attenuation coefficient¹ and x is the thickness.
¹The coefficient µ depends on the sum of several nuclear interactions and therefore has a complex relationship with frequency; nonetheless, it can be approximated as constant across the X-ray spectrum for the purposes of this report.
The X-ray attenuation coefficient for cheese (approximated by that of water [12]) per unit volume is larger than that of air by a factor of approximately 10³, which confirms that it should be easy to draw contrasts between solid cheese and holes in an X-ray image.
X-rays are typically produced by X-ray generators. These consist of vacuum tubes
where a hot cathode releases electrons, which are accelerated by high voltages (50 kV in
the model used in this report[14]) onto an anode. The energy released by the electrons’
collision with the anode is in the form of X-ray radiation, whose energy is limited by
the voltage (ie. the radiation produced by this setup cannot exceed 50 keV). The output
distribution of X-ray frequencies is a combination of the generator materials’ characteristic
spectra and a smooth curve, known as bremsstrahlung [13]. To prevent the emission of
radiation at low energy, this output is then filtered by a thin layer of material (beryllium
in the generator used). This is known as beam hardening.
The cheese has a relatively small attenuation coefficient with respect to X-rays (µ ≈ 5 × 10⁻² cm⁻¹ at the X-ray energy used), which means that cheeses with thickness of up to several centimetres can be usefully imaged with a reasonable amount of light passing
through. A quick back-of-the-envelope calculation shows that the contrast of a hole of
diameter d cm within a cheese of thickness t is (assuming air within the holes to have no
attenuation)
Contrast = (brightness through hole centre) / (brightness through region with no holes) = exp(−µ(t − d)) / exp(−µt) = exp(µd).
This means that even a relatively small hole with diameter d = 1cm presents a contrast
ratio of roughly 1.05:1, which should be noticeable when the hole is placed in a piece of
otherwise uniform cheese. This contrast can also be artificially enhanced to make the
holes appear more obvious.
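As a sanity check, the back-of-the-envelope figures above can be reproduced in a few lines of MATLAB; the attenuation coefficient and sample thickness are the values quoted in this report, while the hole diameters are illustrative:

mu = 5e-2;              % attenuation coefficient of cheese, cm^-1 (value quoted above)
t  = 5.5;               % thickness of the cheese sample, cm (section 2.2)
d  = [0.5 1 2];         % illustrative hole diameters, cm

transmission = exp(-mu * t);    % fraction of photons passing through solid cheese
contrast     = exp(mu * d);     % brightness ratio, hole centre vs. hole-free region

fprintf('transmission through %.1f cm of cheese: %.3f\n', t, transmission);
fprintf('contrast for a hole of diameter %.1f cm: %.3f\n', [d; contrast]);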
2.4 Reconstruction techniques
Several 3D reconstruction techniques exist to find the internal structure of a 3D object
from 2D projections. Many are widespread, being used very commonly in medicine to
perform CT scans.
It is possible to take a different approach, focusing instead on finding individual holes in a 2D projection, and then combining them with different projections to reconstruct a 3D image. These are also investigated.
2.4.1 Traditional methods
Computed tomography (CT) is a very widespread and well-established technique to re-
cover a 3D reconstruction from single 2D X-ray projections. Since the technique’s devel-
opment in the 1970s, it has been used in hospitals around the world and is now part of
many fairly routine medical procedures.
Filtered backprojection [15] is the principal mathematical technique by which CT
works. It uses a mathematical technique known as the Radon transform to attempt to
invert the projection process, using many (on the order of 100) individual 2D projections
of the area of interest, from many angles around the object (approximately a 180º range).
Each of these is then projected back along the line where the points with high attenuation
could lie in a 3D image, and filters are applied to reduce any of the resulting “streaking”
artifacts. With large numbers of projections and angles, this can lead to a very accurate
reconstruction which is appropriate for use in sensitive medical applications like surgery
or radiotherapy.
A different technique which is increasingly being used to recover 3D images in medi-
cal CT is iterative reconstruction [15]. There are many different approaches to iterative
reconstruction, but fundamentally they attempt to build a 3D reconstruction by finding
the most likely attenuation coefficient for each voxel by iterative optimisation techniques.
This is increasingly popular in medical devices but it typically requires much more com-
putational power than backprojection to obtain a reconstruction.
2.4.2 Hough transform-based methods
A different approach could be to find the locations of individual holes within 2D projec-
tions, and combine the information of two projections to find a three-dimensional position.
This would involve finding ellipses in a two-dimensional image.
There are several techniques to find ellipses within 2D images [16]. Of these, a popular
method is the Hough transform.
The Hough transform can find imperfect instances of a desired feature within an image
[17, 18]. Typically, this is used in computer vision to find straight lines or circles within
an image. More recently, techniques extending the Hough transform to find other features
such as ellipses in 2D images have been investigated [18].
The memory usage involved with the Hough transform is heavily dependent on the di-
mensionality of the features being matched. The complexity of the transform is O(n^(d−2)), where n is the number of pixels in the image and d is the dimensionality of the feature searched for.
Figure 3: Left, a traditional computer vision setup with two cameras (or a rotating
camera). This is equivalent to the setup on the right, where the cheese is rotated instead.
2.5 Geometric principles and three-dimensional reconstruction
Standard computer vision techniques can be used to find the 3D position of points from
2D images at different angles.
Any point in an image at one angle must lie along a line in the other image known as
the epipolar line. This is the image, in the other camera, of the line joining the point and the optical centre of the first camera. With a known point in one image and knowing the
positions and internal parameters of the different cameras, it becomes possible to narrow
down the position to a single straight line in the other image², making it easier to find
point matches. From the calibration of the two cameras it becomes relatively easy to then
find the 3D positions of point matches.
Figure 3 shows two imaging setups. Observing a cheese using a stereo rig of two
cameras at opposite angles is in fact equivalent to observing the cheese with a single
camera and rotating it instead. This is a cheaper and easier setup and therefore preferable,
while maintaining all the stereo vision techniques described above.
Real sensors and cameras operate under perspective projection, where the 3D image
of the cheese is somewhat distorted along its thickness - the part of the cheese which is
nearer to the X-ray source appears larger than that further away. If the cheese is placed
far enough away from the X-ray source (and so its thickness is negligible compared to the
distance from the source) it becomes possible to approximate the perspective projection to
an orthographic projection [19]. This is mathematically easier and requires less calibration
than perspective projection, so it would be useful if this is a good approximation.
²In fact, due to nonlinear distortion from the sensor and the X-ray generator not being a point source, it is narrowed down to a narrow region around a line instead of a single line.
Volume (mm³)                150 × 100 × 35
Hole proportion (by vol.)   20%
Hole radius (mm, mean)      8
Hole radius (mm, std. dev.) 2
Stretch (mean)              1
Stretch (std. dev.)         0.25
Table 1: Parameters used in generating holes for the 3D-printed proxy.
3 Techniques implemented
3.1 3D printed proxy design
In any cheese, the distribution of holes is initially unknown and cutting the cheese to
attempt to determine that will tend to deform the holes themselves. Therefore, it is
impossible to obtain an accurate ground truth to which any attempted reconstruction
can be compared.
To address this, a “fake cheese” proxy was designed. It was designed to have similar
hole dimensions and statistics to real cheese, and to thus be a good model for real cheese
to an X-ray scanner. A real piece of Emmental cheese was inspected to find the size and
distribution of the holes. The distributions obtained were then randomly sampled with a
MATLAB program to create a unique cheese model, which was then 3D printed.
Any reconstruction could then be compared to the CAD “ground truth” to find and
rectify any errors.
3.1.1 Proxy design considerations
A pack of store-bought Emmental cheese was inspected to find the distribution of positions
of the holes in it, as well as the distribution of their sizes. This was then used to inform
the design of the proxy cheese.
The parameters for the proxy cheese designed are shown in Table 1. The centres of
holes were assumed to be uniformly distributed throughout the volume of the cheese,
taking up approximately 20% of the volume. Intersecting holes were allowed³.
The holes themselves are well approximated by ellipsoids. The ellipsoids were coded
in MATLAB to be spheres with different levels of stretch along each axis. The radii of the
holes were set to be normally distributed, with a mean of 8mm and a standard deviation
of 2mm. The stretch along each axis was also normally distributed around 1, with a
standard deviation of 0.25. If anything, this somewhat overstates the eccentricity of the
ellipsoidal holes in real cheese - this was not thought to be a significant problem.
³This is a slight deviation from what was observed in real cheese samples. In fact, real holes in cheese tend to have a thin layer of cheese separating them, even if they would overlap. This was thought to have no impact on imaging.
Figure 4: A wireframe view of the CAD model of the proxy.
Another design consideration for the proxy was its size and the size that the manufac-
turing process could allow. A typical industrial block of cheese measures approximately
300mm x 200mm x 70mm. However, the maximum dimensions of any object that could
be printed on the 3D printers available for use was 223mm × 223mm × 205mm [20].
Besides, it was found that printing a proxy with the same volume as an industrial block
of cheese would take more than one spool of 3D printer filament, raising other practical
problems. Therefore, it was finally decided to scale down the dimensions of the proxy by
a factor of 2 along each dimension.
A MATLAB script was written to randomly generate hole positions and parameters for
the proxy based on this specification. These were then imported into a CAD programme
for design of the physical object.
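The script itself is not reproduced in this report; the sketch below shows the kind of generation loop implied by Table 1. Variable names, the fixed random seed and the flooring of negative radii are illustrative choices rather than the original code:

% Randomly generate ellipsoidal hole parameters for the proxy (Table 1 values).
rng(1);                                 % fix the seed so the same proxy can be regenerated
dims       = [150 100 35];              % proxy dimensions, mm
meanR      = 8;    stdR = 2;            % hole radius distribution, mm
meanS      = 1;    stdS = 0.25;         % per-axis stretch distribution
targetFrac = 0.20;                      % target hole fraction by volume

holes   = [];                           % one row per hole: [x y z r sx sy sz]
holeVol = 0;
while holeVol < targetFrac * prod(dims)
    c = rand(1, 3) .* dims;             % centre, uniform over the block
    r = max(meanR + stdR * randn, 1);   % radius, normally distributed (floored at 1 mm)
    s = meanS + stdS * randn(1, 3);     % stretch along each axis
    holes(end+1, :) = [c r s];          %#ok<AGROW>
    holeVol = holeVol + 4/3 * pi * r^3 * prod(s);   % volume of this ellipsoid
end
% The rows of 'holes' can then be exported to the CAD package (OpenSCAD)
% to build the solid model of the proxy.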
3.1.2 3D printing
OpenSCAD [21] was then used to develop a CAD model of the proxy, which was then 3D
printed. A top wireframe view of the CAD model can be seen in Figure 4.
Most 3D printers work by a process known as additive manufacturing: a thin filament
of an appropriate type of plastic is heated to its melting point (typically above 200 ºC)
and deposited onto a surface, where it quickly hardens again. When filament is added
beside already-manufactured filament, it fuses, creating a single solid mesh structure.
Thus meshes of filament can form solid layers, and new layers can be printed on top of
existing ones to produce 3D structures.
One benefit of this process is that it makes it possible to create fully-enclosed voids.
It is desirable for the proxy, much like real cheese, to contain holes which are completely
surrounded by material and not visible to the naked eye. These could now be manufac-
tured.
Additive 3D printing also introduces a few restrictions and artifacts in the manufac-
turing process.
Figure 5: Left, the CAD model of the 3D printed proxy. Right, a photograph of the final product.
The regular mesh created by the 3D printer is visible and can create noise in the X-ray measurements. Additionally, the printing process draws the contours
of shapes (such as holes or the cheese’s edge) before filling it in. This creates noticeable
artifacts around the edges of holes and in their X-ray images, which must be taken into
account.
The 3D printer used was an Ultimaker 2 [20] present at the Department of Plant
Sciences, University of Cambridge. The filament used was heated to 215 ºC and had a
diameter of 100 µm, the minimum allowed by the printer, to reduce mesh artifacts while
imaging as much as possible.
The material used for printing was polylactic acid (PLA), a standard 3D printer ma-
terial. This is a biologically-derived plastic normally synthesised from sugarcane, and it
is non-toxic and biodegradable [22]. As it is an organic compound, it was assumed that
its X-ray attenuation coefficient is similar to that of real cheese.
3.1.3 Image preprocessing
The resulting 3D-printed proxy (shown in Figure 5) subjectively looks very similar to
cheese, but does present a few differences in its X-ray images. Due to the additive man-
ufacturing process, the regular plastic mesh which the solid parts of the proxy are built
from is visible in the X-ray images: this presents a significant source of high-frequency
noise, which has the potential to add confusion to the results. Additionally, the entire
X-ray image appears very dark and it is difficult to make out any higher-order features
within the cheese (such as holes). To counteract this, all pixel values were normalised to
have a maximum at 255, which made the image much brighter with no loss of information.
To eliminate the high-frequency mesh from the image, filtering was used. Both a
Gaussian filter and a median filter were considered for use. As shown in Figure 6, they
can both effectively blur out the mesh, with the median filter providing marginally better
subjective sharpness than the Gaussian filter, and slightly slower performance. As a result,
the median filter was chosen.
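The preprocessing described above amounts to a brightness rescale followed by a median filter; a minimal sketch (Image Processing Toolbox), with a hypothetical filename and an illustrative 5 × 5 neighbourhood, since the report does not state the filter size:

I = double(imread('proxy_xray.png'));   % hypothetical filename; assumed to be a greyscale projection
I = I * (255 / max(I(:)));              % normalise so the brightest pixel is 255
Ifilt = medfilt2(uint8(I), [5 5]);      % median filter to suppress the printed mesh
imshow(Ifilt);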
Figure 6: Left, the original image. Centre, filtered using a Gaussian blur, and right, using
a median filter.
Figure 7: Left, the original histogram of an X-ray image of the proxy cheese, taken with Cheyney equipment. Right, the new histogram to which the image's intensity was adapted.
3.2 Histogram adaptation
Figure 7 shows the histogram of an X-ray image taken of the proxy. This shows a promi-
nent peak around the pixel values of 50-70, and relatively few pixels outside those values.
Subjectively, the effect is to cause the output image to look washed-out and uniform, and
difficult to find holes in (especially smaller and shallower ones).
Therefore, histogram adaptation was used to make any holes present in the image
more prominent. Histogram equalisation was originally applied to the image, increasing
the contrast between holes and regions without holes, with results as can be seen in Figure
8.
However, it was noticed that this technique had its own problems. A simple equali-
sation emphasised small changes in brightness in uniform regions (due to the texture of
the proxy or even noise) and flattened brighter regions with one or several holes present.
Figure 8: A detail of an X-ray image of the proxy. Left, the original image output from
the X-ray. Centre: the same detail after histogram equalisation. Right, the same detail
with custom histogram adaptation.
This made it difficult to distinguish between overlapping holes, and occasionally led to
spurious holes being found in areas with no holes at all.
Therefore, a different form of histogram adaptation was designed. A section of a
Gaussian curve was selected as giving empirically the best results, with good contrast
between holes and background and between overlapping holes.
This is shown in Figure 8. The form of the histogram chosen was found, by trial and
error, to be best at
H(i) ∝ exp(−i²/20000),
where H is the height of the histogram at a given pixel intensity i. This was divided into
64 bins which each pixel was assigned to. This was found to give the best (lowest-cost)
matches to the objective function described in section 4.4 below.
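A sketch of how this adaptation can be performed: histeq accepts a target histogram shape directly, so the Gaussian section above translates into a few lines. The greyscale input, filename and bin centres are assumptions of this sketch:

nbins = 64;
i     = linspace(0, 255, nbins);                      % representative intensity of each bin
hgram = exp(-i.^2 / 20000);                           % desired (unnormalised) histogram shape
Ifilt = medfilt2(imread('proxy_xray.png'), [5 5]);    % preprocessed greyscale projection (section 3.1.3)
Iadapt = histeq(Ifilt, hgram);                        % remap intensities towards the target histogram
imshowpair(Ifilt, Iadapt, 'montage');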
3.3 Traditional techniques
Common techniques to extract 3D reconstructions from 2D projections, such as filtered
backprojection and iterative reconstruction, are widely used in medical imaging to obtain
highly accurate 3D images. These techniques, however, routinely require a large number
of individual projections from a large range of angles.
Unfortunately, filtered backprojection is not an effective technique to obtain a re-
construction of the structure of the cheese under the constraints required in this project.
Figure 9 shows simulated reconstructions (from 1D projections) of an X-ray slice of cheese,
operating under the constraints of the problem. With either a limited number of projec-
tions or a limited angular range of projections, this method fails to provide an image of
any quality and must therefore be discarded.
Figure 9: Left to right: An X-ray projection of cheese; attempted reconstructions from filtered backprojection using (a) few (10) 1-dimensional projections of the image and (b) limited angles (within ±50°).
Iterative reconstruction was judged to proceed too slowly to be effective (convergence was estimated to take several hours on a standard PC) and the attempt was terminated. Nevertheless, it is
expected to suffer from the same problems.
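The failure mode can be reproduced with MATLAB's radon/iradon pair; the sketch below uses the standard phantom image as a stand-in for a cheese slice and mirrors the kind of simulation behind Figure 9, rather than reproducing the original experiment:

P = phantom(256);                        % stand-in slice; the report used a real cheese projection
thetaFew  = linspace(0, 179, 10);        % only 10 projection angles
thetaFull = 0:179;                       % dense angular sampling, for comparison

recFew  = iradon(radon(P, thetaFew),  thetaFew);    % streaky, unusable reconstruction
recFull = iradon(radon(P, thetaFull), thetaFull);   % accurate reconstruction
imshowpair(recFew, recFull, 'montage');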
A reason for the poor performance of backprojection is that it does not assume any-
thing about the internal structure of the cheese. This leaves it with insufficient information
to reconstruct an accurate picture from just two projections. However, in reality much
is known about the internal structure of the cheese: it is an approximately uniform sub-
stance with constant attenuation, with ellipsoidal holes of varying sizes within it. Different
techniques that use this information can therefore be expected to perform better.
3.4 Hough transform-based methods
A different proposed approach is to find individual holes in 2D projections, and then
combine different projections to find a full 3D reconstruction.
Section 2.4.2 introduced the Hough transform, which attempts to find imperfect in-
stances of a given feature within a 2D image. This was applied to search for holes in an
X-ray projection.
At first, a circular Hough transform was used. The complexity of a feature Hough transform is O(n^(d−2)), where n is the number of pixels in the image and d is the dimensionality of the feature; since circles have d = 3 dimensions (given by the x- and y-position of the centre, plus the radius) this reduces to O(n), which computes very quickly on a normal workstation. An implementation of the circular Hough transform is available as a MATLAB library function, and this was used. The left-hand side of Figure 10 shows the result of
this attempt. While some of the holes in the cheese are correctly identified, the algorithm
fails to find many, and finds other spurious matches and errors. This is believed to be due
to the fact that the holes in the X-ray projection are in fact ellipsoidal, and only poorly
approximated by spheres.
Figure 10: Left, the output of a circular Hough detector; right, the output of an elliptical Hough detector, decimated so it could run in less time.
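The circular Hough step can be reproduced with the toolbox function imfindcircles (presumably the library routine referred to above); the filename, radius range and sensitivity here are illustrative, not the values used in the project:

I = imread('proxy_xray.png');            % hypothetical filename; ideally the preprocessed projection
[centres, radii] = imfindcircles(I, [10 60], ...
    'ObjectPolarity', 'bright', 'Sensitivity', 0.9);   % holes appear bright in X-ray images
imshow(I); hold on;
viscircles(centres, radii, 'Color', 'r');              % overlay the detected circles
hold off;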
A different approach that can be taken is computation of an elliptical Hough transform,
finding ellipses in the projection directly. This should provide a better match to the holes.
However, ellipses have d = 5 dimensions (the same three as circles, plus an eccentricity
term and an angle term) and therefore the complexity of an elliptical Hough transform is
O(n³). This makes it impossible to evaluate this transform directly on a normal computer,
as it requires about 1TB of memory for a test image. Data-reduction techniques were
employed to make this possible, including subsampling and edge detection to speed up
the runtime.
The results of this are shown on the right-hand side of Figure 10. It appears that
the scale of data reduction necessary made it impossible to obtain a reconstruction of
any accuracy from the algorithm, and in fact it seems to perform worse than the circular
Hough transform. Therefore it was decided to abandon any further investigation of Hough
transforms, and to seek different methods to find ellipses in 2D images.
4 First-principles ellipse fitting
4.1 Motivation
All the techniques described in the literature and attempted above proved inadequate for
the problem of finding holes in cheese. Therefore, a novel technique was developed, using
an iterative approach to optimise an objective function from first principles. This method
attempts to fit a number of 2D ellipsoidal kernels to an image, each of which is supposed
to correspond to a hole. The final positions and characteristics of the optimised kernels
should match those of the real holes in the X-ray projection.
4.2 Description of technique
A set of ellipsoidal kernels is specified within an image. As an initial “guess”, a relatively
large number of kernels is initialised in the image, equally spaced and with identical size
and no eccentricity.
Figure 11: A contour plot of the kernels used for optimisation. This one is spherical with r = 50 and no eccentricity.
The kernel positions and sizes are then optimised asynchronously to
minimise an objective function. Kernels are moved individually until each reaches a local
minimum in the objective function. This is repeated for all kernels in the image.
To ensure that the correct number of holes appears in the final converged image, at each iteration of the convergence sequence a set of tests is applied to ensure that all kernels are fitted to a hole and are not caused by overfitting the data. Any kernels that fail
this test are eliminated from the optimisation procedure.
This process is then repeated until all the kernels converge, or the change in the
objective function is negligible.
4.3 Ellipsoidal kernels
The fitting technique relies on fitting ellipsoidal kernels onto the data given. Since they
need to be fitted onto a single projection, the kernels themselves must be 2D projections
of ellipsoids - brighter in the (thicker) centre and tapering off towards the sides, as shown
in Figure 11.
Figure 11 shows a contour plot of a spherical kernel C with radius r = 50 used in the
optimisation process. The intensity of such a kernel (with radius r) is given by
C(x, y) = max(0, r² − ‖(x, y) − (r, r)‖²).
This comes from a simple application of Pythagoras’ theorem. The kernel can be thought
of as a projection of a sphere, centred around the point (r, r). The intensity of the kernel
at any position is proportional to the vertical thickness of the sphere from that position.
Figure 12: The C(x, y) function for a spherical kernel is simply given by the height from (x, y) to the surface of the sphere.
Any point outside the sphere is outside the kernel, so its intensity should be set to 0.
Figure 12 provides a visual representation of this argument. It is clear that the inten-
sity of the kernel can thus be obtained from Pythagoras’ theorem.
When the kernel is not spherical, it is still relatively easy to compute. Any ellipsoid
can be described as a sphere that has been stretched along the x-axis by a factor of sx
and along the y-axis by a factor of sy. To find the intensity of the ellipsoid at any point,
it is therefore relatively easy to project it back onto a sphere and proceed from there,
C(x, y) = max(0, r² − ‖(x/sx, y/sy) − (r, r)‖²).
This is the form of kernel that was used in this investigation.
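Written out as code, the kernel above is only a few lines; the patch size and the example call are illustrative choices of this sketch:

% Ellipsoidal kernel C(x, y) = max(0, r^2 - ||(x/sx, y/sy) - (r, r)||^2),
% evaluated over a pixel grid large enough to contain it.
function C = ellipsoidKernel(r, sx, sy)
    w = ceil(2 * r * max(sx, sy));              % patch side length, in pixels
    [x, y] = meshgrid(1:w, 1:w);
    d2 = (x ./ sx - r).^2 + (y ./ sy - r).^2;   % squared distance in "unstretched" coordinates
    C  = max(0, r^2 - d2);
end
% Example: the spherical kernel of Figure 11 is ellipsoidKernel(50, 1, 1).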
4.4 Specification of an objective function
Optimisation is carried out by an iterative method, changing the positions and parameters
of the kernels to minimise an objective function. Therefore the choice of objective function,
and the terms within it, is very important to achieve a good fit.
The initial objective function used was simply the rms error between the projection
and the 2D reconstruction, with a fixed number of holes. An example of a stable fit
obtained by minimising this objective can be seen in Figure 13.
The first hole to be fitted seems to attempt to fit all the holes present in the image,
and any subsequent holes which are added result only in the fitting of the intersections
and additional noise. This is clearly unphysical and does not provide any information on
the holes present in the image. This is relatively stable, and small deviations from this
equilibrium only result in the cost function growing.
Figure 13: Left, a detail of an X-ray scan of the 3D-printed proxy (with dc component
removed). Right, the reconstruction obtained by a simple least-squares objective function.
Coefficient   Component         Value   Threshold   Threshold value
a             major axis        2       a0          3
b             number of holes   0.6     None        N/A
Table 2: The coefficients chosen for each of the terms in the objective function. These were chosen by trial and error and led to seemingly the best fits.
4.4.1 Description of terms in objective function
Additional terms were added to the cost function to prevent such incorrect fits.
The terms added penalised several common misfit artifacts observed while trying to
find an appropriate form for the objective function. It was found that “misfit” holes were
typically very large (covering multiple holes) and had large eccentricity, as they tended
to fit holes which were next to each other. To address this, it was decided to add a term
to the cost function, adding a penalty to holes with a large major axis to prevent misfits.
Another term was also added, representing the quantity of kernels in the image, to avoid
overfit:
CMA = Σi max(li,1 − a0, li,2 − a0, 0)²
Cn = n,
where li,1 and li,2 are the lengths of the axes of hole i. CMA penalises the square of the major axis (the longer axis of the elliptical kernel) beyond a0, applying a quadratically increasing cost (see Figure 14). The value of a0 can be found in Table 2⁴. Cn avoids overfit by placing a penalty on the overall number of holes fitted in the projection, ensuring that any "overfit" kernels which do not fit any real holes are eliminated.
⁴Lengths are expressed in multiples of 100 pixels, for ease of manipulation.
Figure 14: The change in cost added to the function by the constraint on major-axis length, as the length of the major axis itself changes.
Figure 15: The components of the cost function over the course of the optimisation of a set of kernels.
Therefore the added terms apply no extra cost up to a certain threshold (which almost
all true holes within the cheese satisfy) and then add a cost which increases with distance from the threshold. The overall objective function is therefore
C = RMSE + aCMA + bCn,
where a and b are scaling coefficients that place the correct weight on each component to
reach a good fit. The coefficients chosen for the cost function can be seen in Table 2.
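A sketch of how the overall objective might be evaluated for a candidate fit; 'recon' is the synthetic image obtained by summing the current kernels, and 'axes' holds the two axis lengths of each kernel in units of 100 pixels. Both inputs, and the exact parameterisation, are assumptions of this sketch rather than the original implementation:

% Objective C = RMSE + a*CMA + b*Cn, with the Table 2 coefficients.
function C = objectiveCost(recon, target, axes)
    a = 2; b = 0.6; a0 = 3;                          % Table 2 values
    rmse = sqrt(mean((recon(:) - double(target(:))).^2));
    CMA  = sum(max(max(axes, [], 2) - a0, 0).^2);    % quadratic penalty on over-long major axes
    Cn   = size(axes, 1);                            % one unit of cost per kernel fitted
    C    = rmse + a * CMA + b * Cn;
end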
Figure 15 shows the change in the cost function. Typically this choice of cost function
results in no added cost in the converged fit from the added parameters to the function,
as the characteristics of real holes place them below the thresholds chosen for added cost.
However, these can add fairly significant costs in intermediate steps of the optimisation,
and prevent the establishment of local minima which do not correspond to the ground truth.
Figure 16: The reconstruction of the same area of the scan obtained using the final version of the objective function.
4.5 Optimisation method
After the initialisation of the kernels, a very simple iterative technique is used to minimise
the objective function. This is specified as follows:
1. Select a kernel
2. Translate the kernel slightly to the left and measure the change in the objective.
(a) If the error decreases, (2) is repeated until the objective no longer decreases.
(b) Otherwise, it is moved to the right until the objective increases.
3. Repeat same procedure as (2) for y-axis translation, intensity, and stretches along
the x- and y-axis.
4. Return to step (2) and repeat until the rms error remains constant between itera-
tions.
5. Select another kernel and return to (2) until convergence.
Each optimisation is done on the basis of the results of the previous one - that is, optimised
positions, stretches etc. are stored and used to compute the optimisation for the next
parameter. This method was implemented using MATLAB [23] as a platform.
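A condensed sketch of this asynchronous scheme for a single kernel; 'cost' is assumed to be a handle that evaluates the section 4.4 objective with the trial parameters substituted in, and the step sizes are illustrative:

% Asynchronous per-parameter descent: each of the parameters [x y I sx sy]
% is nudged in turn, and an improvement is kept immediately so that later
% parameters see the updated values.
function k = optimiseKernel(k, cost)
    steps = [1 1 0.05 0.02 0.02];            % per-parameter step sizes (illustrative)
    improved = true;
    while improved
        improved = false;
        for p = 1:numel(k)
            for direction = [-1 1]           % try one direction, then the other
                while true
                    trial = k;
                    trial(p) = trial(p) + direction * steps(p);
                    if cost(trial) < cost(k)
                        k = trial;           % keep the improvement straight away
                        improved = true;
                    else
                        break;               % this direction no longer helps
                    end
                end
            end
        end
    end
end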
Rotation of the kernels was not included in this method, after it was found that it
slowed down convergence substantially (by adding a degree of freedom) without increasing
the accuracy of the final fit.
Figure 17: Comparison of the time-evolution of the cost function for asynchronous and
synchronous update rules.
4.5.1 Choice of asynchronous update
Using a synchronous update rule was also considered. Under this alternative scheme, each
parameter is updated based on the others’ values at the previous iteration, and only after
all parameters have been optimised are the new values stored.
Figure 17 shows the convergence of the objective function under the two different
schemes with a test image of part of the proxy. The synchronous update rule clearly seems
to converge somewhat slower, even though both techniques eventually reach essentially
the same solution.
Additionally, a synchronous update rule no longer guarantees that the cost does not
increase at any iteration, while the asynchronous rule does. This is seen quite clearly
in Figure 17. This can cause problems in the optimisation, including infinite loops be-
tween two states which never converge. These were in fact observed in some test images.
Therefore, it was decided that an asynchronous update was the better choice, both for
convergence speed and practical reasons.
4.6 Determining the appropriate number of kernels
The number of holes in a projection is not known in advance. This can cause difficulties in
choosing the appropriate number of kernels to fit to each projection - too few, and several
holes will not be detected; too many, and overfitting can occur, leading to phantom “holes”
which do not match anything, or several kernels fitting the same hole.
To prevent this, the algorithm is initialised with a number of kernels deemed signifi-
cantly larger than the number of holes likely to be present in it. A number of methods are
then implemented to ensure that any additional holes can be eliminated. Any kernel which
is deemed to be “extra” in this way is discarded and removed from the reconstruction,
leaving all the others unchanged.
A term is added to the cost function representing the number of kernels to be fitted.
This scales linearly with the number of kernels, adding 0.6 (a small but not insignificant
cost) to the cost of a given fit per kernel present in it. At each iteration, kernels are
removed one by one and the fit is attempted without each kernel present. If this results
in a better (lower-cost) fit, then the kernel is discarded.
If any of the holes are too small or too faint, they are also deemed to be overfit. A
test for this is performed and any kernels with brightness or size below defined thresholds
are similarly removed.
Additional kernels may be removed at the 3D fitting stage if no corresponding kernels
are found. The procedure for this is explained in detail below.
5 Three-dimensional fitting
Once holes have been found in 2D projections, each hole in one projection must be matched
to the corresponding hole in another projection. Three-dimensional information must then
be extracted from these matches to obtain a full 3D reconstruction.
5.1 Matching projections to each other
The first step in this process is finding matches between holes in 2D projections.
Figure 18 shows X-ray projections at two different angles of a detail of the 3D printed
proxy, and the corresponding detected 2D positions of the holes (found as described
above). The holes on the right of the images are known to be near the top surface of
the proxy, while those on the left are nearer to the bottom. This can be seen in the
projections, as there is a great deal of relative movement of the holes between them. This
3D information now needs to be used to extract the positions of the holes themselves.
Note also that in both images, a hole is detected in the top-left corner. This is due to
a different hole being cropped out to find a sample image.
5.1.1 Determination of similarity scores
The first step taken is to match each detected hole in one projection with its corresponding
hole in the other. This is done by calculating a dissimilarity score Sij for each pair of
detected holes i and j in different projections. A lower score indicates similar holes
according to a number of measures, and such holes are therefore more likely to match to each other.
Therefore, if N holes are detected in one projection and M holes in another, an N × M
matrix S stores the dissimilarity scores of each.
Figure 18: Top, two X-ray images of a detail of the proxy at two different angles. Bottom, the reconstructions of the hole positions from the projections.
The dissimilarity score Sij is obtained from a number of different metrics, including the position of the centre (x, y), the area of the holes, their eccentricity and their brightness (or thickness along the direction of the projection). These component scores are given by
Sx(i, j) = max{|xi − xj| − ∆x, 0}
Sy(i, j) = max{|yi − yj| − ∆y, 0}
SA(i, j) = |li,1 li,2 − lj,1 lj,2|
Secc(i, j) = max{li,1/li,2 − lj,1/lj,2, li,2/li,1 − lj,2/lj,1}
Sint(i, j) = |Ii − Ij|.
The scores Sx and Sy correspond to the absolute distance between the centres of the holes, with some tolerance ∆x and ∆y⁵.
⁵This means that any pair of holes within distances ∆x and ∆y of each other have no distance costs, and the cost increases linearly with distance from those thresholds.
Coefficient   Component           Value
α             Hole area           100
β             Eccentricity        100
γ             Intensity (depth)   50
Table 3: The coefficients corresponding to different components of the dissimilarity score S. Intensity has a relatively low coefficient because there is only a weak correlation between intensities from different angles.
The tolerance ∆x is deliberately chosen to be large, as it represents the epipolar constraint. According to this constraint, a point in one
image must lie along a line (known as the epipolar line) in the other, representing a ray
between the camera’s optical centre and the point itself. In the geometry of this setup,
the epipolar line is purely horizontal (due to the camera’s rotation being purely in the
x-axis) and therefore the range of ∆x must be large. The epipolar line can, however, be
narrowed down to a relatively short line segment representing the thickness of the cheese.
In practice, nonlinear distortion means that the correspondence is found along a thin
band around the epipolar line rather than the line itself. This is given by ∆y.
The area and eccentricity components are expected (and seen) to be fairly constant
between different projections, and therefore no tolerances are added. The final score S is
given by
S = Sx + Sy + αSA + βSecc + γSint,
where α, β and γ are coefficients added on to the area, eccentricity and intensity terms to
reach a similar magnitude to the x- and y-components.
The intensity coefficient γ is given a relatively small value, as the observed thickness
of a hole can vary substantially between projections. The coefficients used can be found
in Table 3.
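Assembled into code, the dissimilarity matrix is a pair of nested loops; each hole is assumed to be stored as a row [x y l1 l2 I], and the tolerances ∆x and ∆y (whose values the report does not state) are placeholders in this sketch:

% N-by-M dissimilarity matrix between holes detected in two projections,
% using the component scores above and the Table 3 coefficients.
function S = dissimilarity(H1, H2)
    alpha = 100; beta = 100; gamma = 50;     % Table 3
    dx = 200; dy = 10;                       % tolerances, in pixels (placeholders)
    N = size(H1, 1); M = size(H2, 1);
    S = zeros(N, M);
    for i = 1:N
        for j = 1:M
            Sx   = max(abs(H1(i,1) - H2(j,1)) - dx, 0);
            Sy   = max(abs(H1(i,2) - H2(j,2)) - dy, 0);
            SA   = abs(H1(i,3) * H1(i,4) - H2(j,3) * H2(j,4));
            Secc = max(H1(i,3)/H1(i,4) - H2(j,3)/H2(j,4), ...
                       H1(i,4)/H1(i,3) - H2(j,4)/H2(j,3));
            Sint = abs(H1(i,5) - H2(j,5));
            S(i,j) = Sx + Sy + alpha*SA + beta*Secc + gamma*Sint;
        end
    end
end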
Figure 19 shows a visual representation of the S matrix in this case, with darker pixels
representing a lower dissimilarity score.
The correct matches here are quite distinct from incorrect matches, with each correct
match for each hole scoring several times lower than the lowest incorrect match. The
general diagonal trend along the matrix is to be expected, as the holes are in generally
similar positions in both projections and the kernels are always generated in the same
positions (eg kernel 1 is always initialised near the top-left corner, etc).
5.1.2 Determination of correct matches
In this instance, the correct matches are all quite clear from the matrix. However, there
may be other cases where matches may be ambiguous, where more holes are detected in
one projection than in another, or where two holes in one projection seem to match to the same hole in the other.
Figure 19: Visual representation of the matching costs between the kernels in the two test images. Brighter cells represent higher costs.
Therefore, an algorithm to provide the best guess for a correct fit was devised.
Figure 20 shows a flowchart describing the algorithm used. The global minimum Sij
of the matrix S is found, and the holes (i, j) are deemed to be a match to each other. To
avoid any other matches to the same holes, all elements in row i and column j of S are
replaced with very large values. This is repeated as many times as necessary, with new
minima (and matches) being found at each iteration.
A threshold score is set, and any pair of holes scoring above this threshold is rejected.
This threshold was calibrated, but is set relatively high to allow for the possibility of errors
in the 2D matching stage. If the minimum found is above this threshold, the algorithm
deems that all matches have been found and exits. This acts as another mechanism of
noise rejection, removing any “overfitting” holes which do not correspond to any holes
in the other projection, and allowing there to be different numbers of kernels in each
projection, M ≠ N.
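The greedy matching loop described above can be sketched as follows (again an illustration rather than the project's code; the threshold is assumed to be supplied by the caller).

    import numpy as np

    def greedy_match(S, threshold):
        """Repeatedly take the global minimum of S as a match, then block its row and column.

        Stops once the smallest remaining score exceeds `threshold`, so kernels with
        no plausible counterpart in the other projection are simply discarded.
        """
        S = S.astype(float).copy()
        matches = []
        while True:
            i, j = np.unravel_index(np.argmin(S), S.shape)
            if S[i, j] > threshold:          # no acceptable matches left
                break
            matches.append((i, j))
            S[i, :] = np.inf                 # hole i in the first projection is now taken
            S[:, j] = np.inf                 # hole j in the second projection is now taken
        return matches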
5.2 Recovery of 3D position and parameters
Once the holes have been matched to each other, it is possible to combine the information
from the two projections to achieve a three-dimensional reconstruction.
Figure 21 shows a diagram of a hole being projected onto the detector surface from
two beams, each at 45º to the surface. The coordinates of the first projection (x1, y1) are
Figure 20: A flowchart showing the process by which kernels are matched in two projec-
tions, from the dissimilarity scores in the matrix S.
Figure 21: The positions of the projections of the centre of a 3D hole will diverge by an
amount proportional to the height of the hole.
given by
x1 = X − Z tan(θ)
y1 = Y
and those of the second projection (x2, y2) are
x2 = X + Z tan(θ)
y2 = Y.
Therefore the 3D position of the centre of the hole can be simply obtained with
X = (x1 + x2) / 2
Y = (y1 + y2) / 2
Z = (x2 − x1) / (2 tan(θ)) = (x2 − x1) / 2,
where (y1 + y2)/2 is the arithmetic mean of y1 and y2 (the two should be equal in theory;
averaging reduces noise), and the final simplification of Z holds because θ = 45°. As long
as the X-ray source is aligned correctly relative to the centre of the hole, this is true
regardless of the distance to the source.
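As a small worked sketch of this triangulation (illustrative Python; the function and argument names are assumptions), with θ = 45° so that tan θ = 1:

    import numpy as np

    def hole_centre(p1, p2, theta=np.pi / 4):
        """Recover the 3D centre (X, Y, Z) of a hole from its two matched projections.

        p1 = (x1, y1) and p2 = (x2, y2) are the projected centres of the same hole
        in the two X-ray images, taken at -theta and +theta about the vertical.
        """
        x1, y1 = p1
        x2, y2 = p2
        X = (x1 + x2) / 2
        Y = (y1 + y2) / 2                      # y1 and y2 should agree; averaging reduces noise
        Z = (x2 - x1) / (2 * np.tan(theta))    # equals (x2 - x1) / 2 for theta = 45 degrees
        return X, Y, Z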
Finding the dimensions of each hole is somewhat more susceptible to the geometry
of the system. If the X-ray source is assumed to be very far away (ie, all incoming rays
are parallel and the projection reduces to an orthographic projection) the length in each
direction can be given by
lX = mean(lx1, lx2) cos(θ) + cL mean(I1, I2) sin(θ)
lY = mean(ly1, ly2)
lZ = mean(lx1, lx2) sin(θ) + cL mean(I1, I2) cos(θ),
where mean(·, ·) denotes the arithmetic mean of the two measurements and cL is a scaling
factor (measured from known thicknesses) relating the brightness of a hole in the X-ray
projection to its thickness. (Strictly, by the Beer-Lambert law the relationship between
brightness and thickness is exponential; since all the holes are relatively small, however,
linearity is a good approximation.)
However, this is only a good approximation if the distance between the X-ray generator
and the surface of the cheese is much greater than that between the surface of the cheese
and the detector. If it is smaller, the divergent beams from the generator cause holes at
the surface to appear larger than those near the bottom.
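The axis lengths follow directly from the expressions above; a similarly hedged sketch, where cL defaults to the experimentally chosen value of 1.8 and the argument names are illustrative:

    import numpy as np

    def hole_dimensions(lx, ly, I, c_L=1.8, theta=np.pi / 4):
        """Recover the extent of a hole along X, Y and Z under an orthographic projection.

        lx = (lx1, lx2) and ly = (ly1, ly2) are the fitted in-plane lengths of the hole
        in the two projections, I = (I1, I2) its brightnesses, and c_L the
        brightness-to-thickness scaling factor.
        """
        lx_m, ly_m, I_m = np.mean(lx), np.mean(ly), np.mean(I)
        l_X = lx_m * np.cos(theta) + c_L * I_m * np.sin(theta)
        l_Y = ly_m
        l_Z = lx_m * np.sin(theta) + c_L * I_m * np.cos(theta)
        return l_X, l_Y, l_Z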
Figure 22: Left, a photographic detail of the 3D-printed proxy. Right, an X-ray image of
the same detail.
5.2.1 X-ray scaling factors
Pixel positions on the X-ray detector (u, v) are linearly related to the true world position
of the projection (x, y) by a scaling and translation operation. This can be expressed as
u = ku x + u0
v = kv y + v0
where ku and kv are the scaling factors along the x- and y- directions, respectively, and
(u0, v0) is the pixel position of the point (x = 0, y = 0) (which can be set arbitrarily).
The factors ku and kv were estimated from known distances and lengths in the proxy.
Figure 22 shows the same detail of the proxy (whose dimensions are all known by design)
in a photograph and in an X-ray image; comparing the two gave ku = kv ≈ 13.9 px/mm
for the detector used here, although this value varies from one X-ray detector to another.
Additionally, a scaling factor cL was used above to relate the brightness of a detected
hole in a projection to the hole’s depth in the direction of the X-rays. This is not in
principle a linear operation: the Beer-Lambert law suggests an exponential variation
in contrast with hole thickness, and the added image preprocessing further distorts the
relationship. However, because the holes are relatively thin, a first-order linear approximation
of the exponential characteristic was found to perform well in this situation.
A reasonable value cL = 1.8 was selected experimentally, to minimise the reconstruction
error with respect to the proxy.
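A small sketch of the pixel-to-world conversion implied by the relation above (the function name and default origin are illustrative; ku = kv ≈ 13.9 px/mm is the value measured for the detector used here):

    def pixel_to_world(u, v, k_u=13.9, k_v=13.9, u0=0.0, v0=0.0):
        """Invert the affine detector model u = k_u*x + u0, v = k_v*y + v0.

        k_u and k_v are in px/mm; (u0, v0) is the (arbitrary) pixel position of the
        world origin.  Returns (x, y) in mm.
        """
        x = (u - u0) / k_u
        y = (v - v0) / k_v
        return x, y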
Figure 23: A diagram showing the difference between the true length of a hole and its
measured length (under perspective projection).
5.3 Assumption of orthographic projection
The reconstruction is obtained using orthographic projection (ie, assuming that all in-
coming rays are parallel). This makes the geometry used for reconstruction considerably
easier, but it is a simplification and a potential source of error.
In reality, the X-ray source is approximately a point source, and its rays diverge from
it and cross the cheese at different angles before reaching the detector. This
divergence leads to holes, especially those near the top surface, appearing magnified with
respect to their real size. A visual representation of this discrepancy is given in Figure
23. The magnification is given by
M = lmeas / lreal = lSD / lSH ,
where lSD is the distance between the source and the detector, and lSH is the distance
between the source and the hole being measured.
In the setup used, the distance between source and detector is lSD = 400 mm, and the
cheese lies on top of the detector. A hole lying at a distance d = 30 mm from the bottom
of the cheese is therefore magnified by a factor of
M = lSD / lSH = lSD / (lSD − d) = 1.081,
which leads to a hole appearing roughly 8% longer in both the x- and y-directions (the
length in the z-direction is not affected by the orthographic assumption), and thus
appearing to have a volume (M² − 1) ≈ 17% larger than the real volume of the hole. This is a
significant error, but it is still deemed to be smaller than any of the errors introduced by
other sources and therefore acceptable.
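As a quick numerical check of the figures quoted above (a throwaway snippet using the values from the text):

    l_SD, d = 400.0, 30.0           # source-detector distance and hole height above the detector, in mm
    M = l_SD / (l_SD - d)           # ~1.081, i.e. ~8% magnification in x and y
    volume_inflation = M**2 - 1     # ~0.17, i.e. ~17% apparent increase in volume
    print(f"M = {M:.3f}, apparent volume increase = {volume_inflation:.1%}")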
Higher-order errors are also introduced by this assumption and are more difficult to
deal with. The 2D kernels used for fitting are all orthographic projections of ellipsoids,
and differ slightly from perspective projections of ellipsoids. The effect was deemed to
be small and difficult to correct for.
5.4 Surface and intersecting holes
The technique described here assumes that all holes are fully spherical and contained
within the cheese. Therefore, it has difficulty reconstructing holes which do not meet
those requirements, such as holes which are “cut off” and visible from the outside (like
those shown in Figure 22) or even internal holes which intersect others.
While this does lead to potential problems reconstructing the holes present in the
proxy and in slices of cheese, it should not present any issues when dealing with wheels of
real cheese. A wheel itself does not have any exposed holes at the surface, and all of its
holes are entirely internal. This ensures that this technique remains useful for that case.
6 Results and discussion
6.1 Accuracy and efficiency of 2D optimisation
To find the 3D positions of the holes, their positions in each 2D projection were first found.
The optimisation process described above always converged to a stable fit of the hole
positions in each 2D projection.
Figure 24 shows the final converged fit for a top-view X-ray projection of the 3D-
printed proxy. This fit had an rms error of 32.03 grey levels (pixel values on a 0-255
scale). All fits made by the algorithm performed similarly, with rms errors in the range
25-35.
Many holes are matched perfectly, or nearly perfectly, by the algorithm. It performs
particularly well on fully segmented holes with no overlap with any others in the image,
but it can also find accurate reconstructions of holes with notable overlap. The holes
near the top of the image, for example, are matched very well. However, the algorithm
does perform relatively poorly on holes with substantial overlap and little variation in
brightness between them, such as the holes near the centre of the proxy. This can lead to
some holes being missed during the 3D fitting stage.
Figure 24: Left, an X-ray image of the proxy. Right, the 2D reconstruction obtained using
the algorithm designed.
Holes which are partially outside of the image also present a problem for 2D fitting.
Due to the design of the routine, all matched holes must lie entirely within the image.
This leads to relatively poor fits near cut-off holes at the sides of the X-ray image. This
should not present a problem with real cheese, as a wheel entirely contains all of its own
holes.
Figure 25 shows an X-ray image of the cheese described in Section 2.2, and its final
converged fit. The holes which are clear to the human eye can be seen to be matched quite
well. There is also a fairly large number of unclear holes, which are matched generously
by the algorithm, finding more holes than are probably present. Any holes which do not
correspond to anything are removed later, in the 3D reconstruction stage. The rms error
of this fit is very low, with RMSE = 17.48.
6.1.1 Efficiency of 2D optimisation
The 2D optimisation stage ran very slowly due to the iterative method used. At each step
of the optimisation process, the cost function was evaluated. This involves finding the
rms error of the reconstruction, which requires rendering of the proposed reconstruction.
Reconstructing a proposed fit adds all holes individually and then finds the rms error of
each pixel, so the reconstruction process of a u × v image has complexity O(uv × N),
where N is the number of holes to be fitted to the cheese. This is then looped through
all N holes, and therefore the overall computational complexity of the approach taken for
2D reconstruction is O(uv × N²).
The very large number of loop iterations and the rendering of the reconstruction at
every step made the process run very slowly. The 2D optimisation for a projection of the
entire proxy shown in Figure 24 took approximately one hour to compute on a normal PC.
Figure 25: Left, an X-ray image of a real piece of cheese. Right, the 2D reconstruction
obtained.
The need to re-render the reconstruction at every step, and the presence of this step at
the bottom of four nested loops, means that the algorithm spends a large proportion of
its time evaluating proposed fits. It would be possible to speed up execution substantially
(though without changing the overall complexity) by saving a render for the N − 1 holes not being
optimised at the current iteration, and then simply adding the proposed fit for the kernel
being fitted before recomputing the cost.
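A sketch of this proposed speed-up, assuming a hypothetical render(holes, shape) routine that rasterises a list of hole kernels into an image: the background formed by the N − 1 fixed holes is rendered once, and only the kernel being optimised is re-drawn for each candidate.

    import numpy as np

    def cost_with_cached_background(image, holes, k, render):
        """Return a cost function for candidate parameters of hole k only.

        `render(holes, shape)` is a hypothetical routine that draws a list of hole
        kernels into an image of the given shape.  The render of all holes except
        hole k is computed once and cached in the closure, so each candidate for
        hole k requires only a single kernel to be drawn rather than all N.
        """
        background = render([h for i, h in enumerate(holes) if i != k], image.shape)

        def cost(candidate):
            recon = background + render([candidate], image.shape)
            return np.sqrt(np.mean((image - recon) ** 2))   # rms error of the proposed fit

        return cost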
An even simpler way to reduce running time is by subsampling high-resolution images,
which results in large computational savings with a relatively small decrease in perfor-
mance. This was done, and as predicted, subsampling by a factor of 2 along each axis
reduced the running time of the optimisation by a factor of approximately 4.
Due to the O(uv × N²) complexity of the optimisation process, there are also significant
computational savings to be had if the image can be segmented into smaller sub-images, and
each sub-image only uses a fraction of the N kernels. This is not straightforward to do as
many holes overlap with each other and a naive segmentation would cut holes across the
sub-images and result in incorrect fits. It is, however, possible to identify lines along which
the image can be segmented (such as the divide between the upper and lower sections of
the fit in Figure 24) and optimise each segment separately.
Another area where changes could be made to lower the running time of the process
(and bring marginal improvements to the fit) is the optimisation technique itself. The
algorithm as implemented uses a fixed step size to translate and stretch kernels around the
projection to minimise the error. Varying the step size based on the previous change could
result in many unnecessary iterations being skipped and a much faster optimisation per
kernel [24, 25].
Figure 26: Left, the sample image used by Kraggerhud et al. Right, the reconstruction
obtained by them (reproduced from [7]).
As the derivative of the objective function with respect to any of the parameters is not
available analytically, derivatives must be computed numerically [26] or a “derivative-free”
optimisation method should be used instead [27].
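As one possible drop-in replacement for the fixed-step search, a derivative-free simplex method could be used to refine each kernel. A hedged sketch using SciPy's Nelder-Mead implementation, where cost is the per-kernel objective (the rms reconstruction error as a function of that kernel's parameters), x0 is its current parameter vector, and the tolerances shown are illustrative:

    from scipy.optimize import minimize

    def refine_kernel(cost, x0):
        """Refine one kernel's parameters with a derivative-free simplex search.

        `cost` maps a parameter vector (for example centre position and axis
        lengths) to the rms reconstruction error; no analytic derivatives are needed.
        """
        result = minimize(cost, x0, method="Nelder-Mead",
                          options={"xatol": 0.5, "fatol": 1e-3, "maxiter": 500})
        return result.x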
It was also found that the final converged fit showed some dependence on the starting
conditions for the optimisation. Varying the spacing of the initial kernel setup, or offset-
ting the kernels, led to slightly different converged fits (for example, fitting holes that a
different starting arrangement skipped, or vice versa). All converged fits nonetheless presented
a similar rms error.
6.1.2 Comparison with past results
It is difficult to directly compare the results obtained in this report to others in the
literature, as no previous work has attempted to find the three-dimensional locations of
holes in cheese through X-ray imaging.
However, Kraggerhud et al. [7] attempted to use X-ray imaging to monitor the growth
of holes in cheese over time. They proposed a method to find holes in individual 2-
dimensional X-ray projections of cheese. This started from the assumption that holes in
a single X-ray projection are well-approximated by uniform circles. They then scanned
the image with a set of uniform circular templates of different radii to obtain a cross-
correlation image for each [28]. They then took the maximum cross-correlation value
for each pixel, thresholded the image, and took the centroids and radii of the resulting
patches as the positions and radii of the holes within the cheese. The results they obtained from a
sample image (reproduced from the paper) are shown in Figure 26.
This is clearly a different approach from that taken in this project. Apart from the
different optimisation technique used, the 2D matching process used in this project did
not assume uniform circular holes, but rather used kernels which were projections of an
ellipsoid and allowed them to deform.
The converged 2D fit obtained for the same data used by Kraggerhud et al. using
the algorithm developed in this project is shown in Figure 27. This subjectively seems
to perform somewhat better than Kraggerhud et al.’s system, finding several holes which
their technique did not correctly identify. It is still not perfect, and tends to overfit
somewhat, finding more holes than are truly present in the image. This is to an extent
intentional in the fitting process, however, as these phantom holes can later be removed
in the 3D matching stage with reasonable reliability. (Comparing the rms errors of the
two reconstructions would not be meaningful here, as the matching proposed by
Kraggerhud et al. does not explicitly produce a reconstruction.) Thus, as far as the two
methods are comparable, the method employed for two-dimensional matching here
performs as well as, or better than, the state of the art.
Figure 27: Left, the sample image used by Kraggerhud et al. [7]. Right, the reconstruction
obtained by the algorithm developed.
Figure 28: Left, a detail (in wireframe view) of the ground truth for the proxy showing
surface holes; right, its reconstruction.
6.2 Three-dimensional fitting
The two-dimensional fits obtained from each projection were then matched to obtain a
three-dimensional reconstruction.
Figure 28 shows a three-dimensional reconstruction of a detail of the 3D-printed proxy,
using the algorithms detailed above. Although there are some errors in the reconstruction
of each hole, it is clearly a fairly faithful reconstruction of the original.
Error type         rms error
Centre location    4.5 mm
Side length        12%
Hole volume        19%
Table 4: The rms magnitudes of the reconstruction errors observed in the reconstruction
of the proxy.
The spurious hole which has been identified in the top-left corner of the reconstruction
is in fact an artifact due to cropping the 2D images: part of a larger hole remained
present and was matched (as is visible in Figure 18, which shows the projections of the
same detail). Two real holes which are present on the left, on the other hand, are not
reconstructed on the right. This is due to the nature of those holes: one is almost entirely
outside of the proxy and is invisible in X-ray images, and the other has a very high
eccentricity and is only visible in one of the projections. The other holes, with good
visibility in both projections, are reasonably reconstructed.
The reconstructed position of the centre of the holes presented an rms error of 4.5mm
(and the error seems to be approximately equal along all axes). This leads to very good
positioning of all the holes, and agreement with the ground truth.
The quality of reconstruction of the lengths along each axis and the volumes of the holes
was somewhat worse. The rms difference between the actual and reconstructed length of
each axis averaged 12%, while the rms difference between actual and reconstructed volume
was 19%. These errors are summarised in Table 4. However, these figures overstate the true
error in the reconstruction: the inaccuracy is largely due to surface holes, which are
reconstructed somewhat smaller than they are in the original because of the low contrast
measured by the X-ray. Since a real wheel has no exposed surface holes, this error matters
little in practice.
It must also be noted that the traditional technique of core sampling does a generally
poor job of estimating the lengths and volumes of the holes cut across. The mechanical
strains induced by core sampling lead to the distortion of the holes cut across [29], and
it therefore becomes difficult to obtain an accurate impression of the lengths and the
volumes of the holes.
There was no significant change in volume discrepancy between holes at the bottom of
the cheese (nearer the detector) and holes nearer the top surface. Therefore it was deemed
that the assumption of orthographic projection used to perform 3D reconstruction was
not a significant source of error.
Figure 29 shows the 3D reconstruction of the cheese analysed above. Even though
each 2D projection fitted the holes quite generously, finding many matches in a single
projection, only a few of those were recognised to be real holes in the 3D matching stage.
Figure 29: The 3D reconstruction of the real cheese shown in Figure 25.
Figure 30: Left, the original proxy design (in wireframe view); right, the reconstructed
proxy.
As there is no ground truth available for the cheese, it is not known how accurate this
reconstruction is, but it appears to be reasonable.
6.3 Agreement with ground truth
Figure 30 shows a comparison of the original 3D-printed proxy and the reconstruction
obtained after using the 3D fitting process.
The algorithm developed only matches 21 of the original 83 holes. Thus it appears
(and it seems clear from images of the reconstruction) that many of the holes in the
proxy are missing. However, the “missing” holes are largely holes cut off near the edges
of the cheese (the method’s weaknesses in matching those are described above) and the
heavily overlapping holes near the centre of the proxy. Non-overlapping (or marginally
overlapping) holes far from the edges were generally matched and reconstructed quite
well. This is a substantial weakness of the reconstruction, but one that is less significant
in real cheeses. In real cheeses, the holes do not extend to the edge of the wheel and they
generally do not overlap [30].
7 Conclusion
This project was established with the aim of using as few X-ray images as possible to
obtain a faithful three-dimensional reconstruction of the internal structure of cheese with
holes in it. The result was intended for commercial application, so it needed to be
low-cost, rapid and an improvement over the destructive methods currently used in the
quality-control industry [3].
This was split into several parts. To obtain a known ground truth for calibration and
quantification of errors, a proxy was designed and manufactured (with a 3D printer) and
made to resemble the statistics of real Emmental cheese. This proxy, as well as a few
samples of real cheese, was imaged using an X-ray generator from several angles.
An algorithm was developed to find the locations of individual holes in a single two-
dimensional X-ray projection. This was achieved by specifying a cost function to be
minimised to obtain the most accurate reconstruction, and optimising it by translating
and scaling a set of ellipsoidal kernels corresponding to individual holes.
The characteristics of the kernels from two X-ray projections at different angles were
then matched and combined to produce a single three-dimensional image. A simple dissimilarity
score was computed to match the holes found in one 2D projection to those in the other,
and geometry was used to obtain 3D positions of the centres of the holes. Assuming
orthographic projection, the length of each hole along each axis was also reconstructed,
and a full 3D image was obtained from the positions.
Both the 2D and 3D matching steps proved somewhat prone to errors. The 2D match-
ing stage, while correctly matching many of the holes, had difficulty matching holes on
the sides of the image or holes which overlapped substantially with others. This is a particular
problem with the 3D-printed proxy used: real cheese shows less overlap between holes and so
generally produces higher-quality matches.
The 3D reconstruction stage matched holes across the two projections with good accuracy,
with no matches observed that a human would have judged incorrect. It found the 3D
positions of the holes very accurately, with an rms error of just 4.5 mm. On the other hand, the length of
each side of the hole was matched somewhat worse, with a 12% rms error on the length
in each direction and a 19% rms error on the volume. Nevertheless, this is a better result
than offered by core sampling, which itself distorts the length and volume of holes by
shearing them.
One problem that remains in the implementation used is the speed of the 2D optimi-
sation. It runs very slowly on high-resolution X-ray images of cheese, taking roughly an
hour on a standard PC. Subsampling improves this, but reduces accuracy. There remain
many computational savings to be had by optimising the code.
Obtaining a 3D reconstruction of the holes in cheese from X-ray images is a problem
which had never been specifically tackled in the literature. The method proposed in this
project, while imperfect and in need of improvements, represents the first viable solution
to this problem.
7.1 Future directions
There remain many improvements that can be made to the process described in this
project to make it more accurate and efficient.
The main problem encountered is the low speed of the 2D optimisation step. As
designed, it takes a very long time to converge to a stable solution, on the order of an
hour for a high-resolution X-ray image. While it is possible to speed it up by subsampling
the image to a lower resolution, this lowers the accuracy of optimisation.
A better optimisation technique with a variable step size would substantially reduce
the running time of the method, and could even lead to more accurate solutions. Even a
relatively simple algorithm, such as steepest-descent (with numerical derivatives) [24, 26]
could improve performance by a great deal. Since holes are fitted one at a time, it would
also be possible to preload a reconstruction with all the other holes, so that the objective
function does not need to obtain a new reconstruction and compute its rms error at each
iteration.
One major source of inaccuracy in optimisation which would be relatively easy to fix
is the 2D fitting of cut-off holes at the sides of the image. These are not fitted correctly,
as fitted holes must remain entirely within the image. Allowing kernels to be centred
beyond the edges of the X-ray image would remove this restriction and improve the
quality of the fit.
One other feature which would improve the technique outlined here is a system to
recognise the edges of the cheese in the image and crop the image to contain only the
cheese. This could be done with simple edge detection [31], followed by cropping and
rotation, to make reconstruction invariant to cheese positioning.
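A minimal sketch of this suggested pre-processing step, using Canny edge detection [31] to find the extent of the cheese and crop the X-ray image to it; the library choice (scikit-image), the sigma and the margin are assumptions:

    import numpy as np
    from skimage.feature import canny

    def crop_to_cheese(image, sigma=3.0, margin=10):
        """Crop an X-ray image to the bounding box of its detected edges.

        The cheese boundary dominates the edge map, so the bounding box of the
        Canny edges (plus a small margin) is a reasonable crop region.
        """
        edges = canny(image.astype(float), sigma=sigma)
        rows, cols = np.nonzero(edges)
        if rows.size == 0:
            return image                      # no edges found; leave the image unchanged
        r0 = max(rows.min() - margin, 0)
        r1 = min(rows.max() + margin, image.shape[0])
        c0 = max(cols.min() - margin, 0)
        c1 = min(cols.max() + margin, image.shape[1])
        return image[r0:r1, c0:c1]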
7.1.1 Possible applications
The techniques developed in this report can potentially be used in more areas than the
quality control of cheese. The computational methods used are in no sense limited to
X-ray images, or cheese itself, and are able to find ellipsoidal objects in any image.
One area where this can potentially be of use is biological microscopy, where accurate
automated cell counting is an active area of research [32]. Apart from providing a count of
the cells, the techniques developed here could be used to find the specific location and size
of cells in a sample and even produce a three-dimensional reconstruction of the sample at
a microscopic scale.
References
[1] Bachmann H.-P., U. Bütikofer, D. Isolini. Encyclopedia of Dairy Sciences. FAM,
Swiss Federal Dairy Research Station; August 2001.
[2] Sherman J.M. “The Cause of Eyes and Characteristic Flavor in Emmental or Swiss
Cheese”. Journal of Bacteriology, 6, no. 4 (1921): pp. 379-393.
[3] “Cheese trier.” U.S. Patent 2,362,090, issued November 7, 1944.
[4] Brosnan T., D-W. Sun, “Improving quality inspection of food products by computer
vision – a review”. Journal of Food Engineering, 61, no. 1 (January 2004): pp. 3–16.
[5] Jamieson V., “Physics raises food standards”. Physics World, 15, no.1 (2002): pp.
21–22.
[6] Han Y.J., S.V. Bowers, R.B. Dodd, “Nondestructive detection of split–pit peaches”.
Transactions of the ASAE, 35, no.6 (1992): pp. 2063–2067.
[7] Kraggerhud H., Wold, J. P., Høy, M. and Abrahamsen, R. K., “X-ray images for
the control of eye formation in cheese”. International Journal of Dairy Technology,
62 (2009): pp. 147–153.
[8] Caccamo M., Melilli, C.M. et al. “Measurement of gas holes and mechanical openness
in cheese by image analysis”. Journal of Dairy Science, 87, no. 3 (April 2004): pp.
739-48.
[9] Eskelinen J.J., A.P. Alavuotunki et al., “Preliminary Study of Ultrasonic Structural
Quality Control of Swiss-Type Cheese”. Journal of Dairy Science, 90, no. 9 (Septem-
ber 2007): pp. 4071–4077.
[10] Mussea M., S. Challois et al., “MRI method for investigation of eye growth in semi-
hard cheese”. Journal of Food Engineering, 121 (January 2014): pp. 152–158.
[11] Swinehart D. F. “The Beer-Lambert Law”. Journal of Chemical Education, 39, no.7
(January 1972): p. 333.
[12] ICRU, “Tissue Substitutes in Radiation Dosimetry and Measurement”, Report 44 of
the International Commission on Radiation Units and Measurements (1989).
[13] Koch H. W., and J. W. Motz. “Bremsstrahlung cross-section formulas and related
data.” Reviews of Modern Physics, 31, no. 4 (1959): p. 920.
[14] Cheyney Design, “X-ray Generator High Performance Air-Cooled Monoblock” CDD
XG datasheet, November 2015.
[15] Hsieh J. Computed tomography: principles, design, artifacts, and recent advances.
Bellingham, WA: SPIE—The International Society for Optical Engineering; 2003.
[16] Wong C.Y., Lin, Ren and Kwok, “A Survey on Ellipse Detection Methods”. IEEE
International Symposium on Industrial Electronics, May 2012
[17] Duda, R. O., P. E. Hart, “Use of the Hough Transformation to Detect Lines and
Curves in Pictures,” Comm. ACM ,15 (January 1972): pp. 11–15.
[18] Nair P., A. Saunders et al., “Hough transform based ellipse detection algorithm,”
Pattern Recognition Letters, 17, no. 7 (1996): pp. 777–784.
[19] Hartley R. I., P. Sturm, “Triangulation”. Computer Vision and Image Understanding,
68 (November 1997): pp. 146–157.
[20] Ultimaker B.V., “Ultimaker 2 User Manual” Ultimaker 2 datasheet, November 2014.
[21] OpenSCAD Release 2015.03, The OpenSCAD Developers.
[22] Drumright, R.E., P.R. Gruber, and D.E. Henton. “Polylactic acid technology.” Ad-
vanced materials, 12, no. 23 (2000): pp. 1841-1846.
[23] MATLAB Release 2015a, The MathWorks, Inc., Natick, Massachusetts, United
States.
[24] Battiti, R. “First-and second-order methods for learning: between steepest descent
and Newton’s method.” Neural computation, 4, no. 2 (1992): pp. 141-166.
[25] Jacobs, R.A. “Increased rates of convergence through learning rate adaptation.” Neu-
ral networks, 1, no. 4 (1988): pp. 295-307.
[26] Nocedal J., and S. Wright. Numerical optimization. Springer Science & Business
Media, 2006.
[27] Rios L.M. and Sahinidis, N.V. “Derivative-free optimization: a review of algorithms
and comparison of software implementations”. Journal of Global Optimization, 56,
no.3 (2013): pp. 1247-1293.
[28] Pratt W. K. Digital Image Processing, New York, NY: John Wiley & Sons, Inc. 2nd
edn (1991), pp 651–668.
[29] Casiraghi E.M., Bagley E.B., Christianson D.D. “Behavior of mozzarella, cheddar and
processed cheese spread in lubricated and bonded uniaxial compression”. Journal of
Texture Studies, 16, no. 3 (September 1985): pp. 281-301.
[30] Clark, W.M. “On the formation of “eyes” in Emmental cheese.” Journal of Dairy
Science, 1, no.2 (1917): pp. 91-113.
[31] Canny, J. “A computational approach to edge detection.” IEEE Transactions on
Pattern Analysis and Machine Intelligence, 6 (1986): pp. 679-698.
[32] Benes, F.M. et al. “Two-dimensional versus three-dimensional cell counting: a prac-
tical perspective”. Trends in Neurosciences, 24, no. 1 (2001): pp. 11-17.
A Risk assessment retrospective
The main risk present in this project, and pointed out in its risk assessment, is handling
X-ray imaging equipment. This was an essential part of the project, as its aim is to create
3D reconstructions of cheese from X-ray images, and could not be avoided.
X-ray radiation, like any ionising radiation, is potentially harmful, and carries a small
risk of genetic damage, including increased risk of cancer and inheritable defects, as well
as (in very high doses) the possibility of causing skin burns, hair loss and cataracts. For-
tunately, the dose required for imaging is very small and cannot cause noticeable damage
to skin or other tissue. I received proper instruction on how to handle the equipment and
all of my use of it was monitored by Cheyney Design. Additionally, the equipment itself
was designed so that it could not function unless the (radiation-shielded) door was closed
and all X-rays were contained within the small chamber. This all makes me confident
that all use of X-ray equipment was safe and any risks were effectively eliminated.
As this project is essentially a computational project, most of my work for it took
place on a computer. This comes with a few risks, including possible long-term health
problems due to improper posture.
Appropriate measures were taken to mitigate this, such as taking regular breaks and
adjusting the chair and screen height to obtain an ergonomic position. Therefore, no
issues were encountered with computer use for the duration of this project.
No other risks were encountered which were not explicitly stated in the risk assessment.
Overall, it was deemed that this project was handled safely and that any potential risks
were sufficiently addressed.