Photogrammetry has emerged as a leading approach for photorealistic digital replication and 3D scanning of real-world objects, particularly in cinematic visual effects and interactive entertainment. While the technique generally relies on simple photography methods, the foundational practices for the field of human photogrammetry remain relatively undocumented. Human subjects are significantly more complex than still life, both in terms of photogrammetric capture and digital reproduction. Without documented foundational practices for human subjects, there is a significant knowledge barrier for new creative practitioners entering the field, stifling innovation and adoption of the technique. Researchers and commercial practitioners currently working in this field continually distribute learnings and research outcomes. These learnings tend to centre on advanced practices such as capturing micro-geometry (skin pores), reflectance and skin distortion. However, the standard principles for building capture systems, considerations for human subjects, processing considerations and technology requirements remain elusive. The purpose of this research is to establish foundational practices for human photogrammetry systems. These practices encapsulate the underlying architectures of capture systems through to the data processing necessary for the 3D reconstruction of human subjects. Design-led research was used to construct a scalable 21-camera system designed for high-quality data capture of the human head. Due to its high level of surface complexity, the face was used to experiment with a variety of capture techniques and system arrangements, using several human subjects. The methods used were the result of analysing existing practitioners and research, refined through numerous iterations of system design. A distinct set of findings was synthesised to form a foundational architecture and blueprint for a scalable, human photogrammetry multi-camera system. It covers the knowledge and principles required to construct a production-ready photogrammetry system capable of consistent, high-quality capture that meets the needs of visual effects and interactive entertainment production.
Modern computer-aided design (CAD) systems and software tools have played a significant role in improving the efficiency of the overall product design process, ensuring geometric accuracy and the exchange of product model data. However, the impact of these technologies is largely restricted to the detailed modeling and engineering analysis that occur during the embodiment design phase. Conceptual design has not benefited from these sophisticated and highly precise software tools to the same degree, because the creative activities associated with developing and communicating potential solutions with minimal detail are far less formulaic in their implementation. At the early stages of product design, the specifications and constraints have not been fully established, and industrial designers and engineers need the freedom to change and modify the product configuration and mechanical behavior to investigate a wide range of alternative solutions. Any CAD system that seeks to support and enhance conceptual design must, therefore, enable natural and haptic modes of human-computer interaction. Recent advances in high-speed, multi-core computer hardware and virtual reality (VR) technology provide opportunities to link the more fluid processes of creative conceptual design with the rigidly defined tasks of product detailing and engineering analysis. This paper discusses the role that virtual reality can play in conceptual design.
Panorama Technique for 3D Animation movie, Design and Evaluating (IOSRjournaljce)
This paper presents an applied approach for panorama 3D movies enhanced with sound effects. The case study considered is IIPS@UOIT. A number of selected software packages were used to produce the 3D movie. 3D animation is a modern technology in filmmaking and is considered the core of multimedia; the vast majority of movies we see today, such as Hollywood productions, use 3D technology. The technique is applied across many domains, such as medical experiments, engineering, astronomy (planets and stars), proving scientific theories, history and geography, where models, scenes or characters that simulate reality are built to move the viewer to the heart of the event. A three-dimensional film was made for IIPS@UOITC to give it a future vision and was published on global sites such as YouTube, Facebook and Google Earth, using several specialised 3D software packages and cinematic tricks, with a focus on movement, characters, lighting, cameras and the final render.
This document discusses using 3D printing to create custom optical elements, called "Printed Optics", that can be embedded in interactive devices. It presents four categories of Printed Optics fabrication techniques: 1) Light Pipes that can guide light through devices for display purposes, 2) Internal Illumination techniques using hollow spaces, 3) Sensing user input by tracking mechanical movements of printed elements, and 4) Embedding optoelectronic components. The document outlines examples like using light pipes in a mobile projector accessory or sensing touch inputs on a device's surface. Overall, it explores how 3D printing optical elements opens up new possibilities for interaction design.
This document discusses various topics related to visualization and advertising management. It begins by defining visualization and discussing its historical uses. It then covers types of visualization like scientific, educational, information, knowledge and product visualization. It also discusses visualization strategies, elements of advertising execution like creative and media execution, and persuasion techniques used in advertising like pathos, logos and ethos. Finally, it briefly describes sales promotion tools and techniques.
Development Prototype Design of Virtual Assembly Application-Based Leap Motion (IJAEMSJORNAL)
Innovation in design engineering practice is very important to manufacturing in an increasingly competitive global market. Prototyping and evaluation are inseparable from the design process in the manufacture of a product, and making even one of many physical prototypes is expensive and time consuming, so Virtual Reality (VR) technology is needed to help industry make decisions quickly and precisely. VR technology couples a human with a computer environment visually and through touch and hearing, so that the user feels immersed in the virtual world. The goal is for users to interact through hand movements with what is displayed on the computer screen, or to interact with an unreal environment added to the real world. VR is required for simulations that involve a lot of interaction, such as prototype assembly methods, better known as Virtual Assembly. The Virtual Assembly concept was developed as the ability to assemble a real representation of the physical model, i.e. 3D models in CAD software, by simulating the natural movement of the human hand. The Leap Motion (accuracy of 0.01 mm) was used to replace Microsoft's Kinect (accuracy of 1.5 cm) and a motion glove with flex sensors (accuracy of 1°) used in several previous studies. The Leap Motion controller is a device that captures every movement of the hand, which is then processed and integrated with 3D models in CAD software. In the virtual assembly simulation in CAD software, with hand gestures detected by the Leap Motion, assembly parts can be driven in translation or rotation, zoomed, and assembly constraints added. It can also perform mouse functions (such as left-click, middle-click, right-click and moving the cursor) for the virtual assembly simulation in CAD software.
This document discusses a technique for detecting copy-paste or cut-paste digital image forgeries. The technique uses wavelet decomposition to decompose the input image into sub-bands, including low-low, low-high, high-low, and high-high. Edge detection is then performed to extract edges. The cut-paste or copy-paste region is identified by examining edge pixels in the wavelet domain, as this region is normally rectangular or square. Parameters like entropy, power energy, and standard deviation are compared between the input image and suspected forged images to detect forgeries. The technique was tested on images with copy-pasted areas and could accurately identify the forged regions.
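As a rough illustration of the kind of sub-band analysis described above, the sketch below decomposes grayscale images with a 2-D wavelet transform and compares entropy, energy and standard deviation per sub-band. This is a minimal sketch, not the authors' pipeline: it assumes NumPy arrays as input, uses PyWavelets for the decomposition, and the tolerance value is purely illustrative.

```python
# Minimal sketch of per-sub-band statistics comparison (assumptions: grayscale
# NumPy arrays, Haar wavelet, illustrative relative tolerance).
import numpy as np
import pywt

def subband_stats(img):
    """Decompose into LL/LH/HL/HH and return simple statistics per sub-band."""
    LL, (LH, HL, HH) = pywt.dwt2(img.astype(float), 'haar')
    stats = {}
    for name, band in [('LL', LL), ('LH', LH), ('HL', HL), ('HH', HH)]:
        hist, _ = np.histogram(band, bins=256, density=True)
        hist = hist[hist > 0]
        entropy = -np.sum(hist * np.log2(hist))   # Shannon entropy of the band
        energy = np.sum(band ** 2)                # "power energy" of the band
        stats[name] = {'entropy': entropy, 'energy': energy, 'std': band.std()}
    return stats

def compare(original, suspect, rel_tol=0.15):
    """Flag sub-bands whose statistics deviate noticeably between two images."""
    a, b = subband_stats(original), subband_stats(suspect)
    return {band: any(abs(a[band][k] - b[band][k]) > rel_tol * (abs(a[band][k]) + 1e-9)
                      for k in ('entropy', 'energy', 'std'))
            for band in a}
```

In practice the flagged sub-bands would be combined with the edge-based localisation step to outline the rectangular pasted region.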
Elucidating Digital deception: Spot counterfeit fragment (VIT-AP University)
An ordinary person generally has confidence in the integrity of visual imagery and believes it without doubt, but today's digital technology has eroded this trust. A relatively new practice, image forgery, is being used extensively. This paper proposes a method to reveal forged regions in a digital image. The results for the proposed work were obtained using MATLAB version 7.10.0.499 (R2010a). The proposed design extracts the regions that are forged, and the scheme is designed for uncompressed still images. Experimental outcomes demonstrate the validity of the proposed approach.
A FRAMEWORK FOR AUTOMATED PROGRESS MONITORING BASED ON HOG FEATURE RECOGNITIO... (ijcsit)
Construction target monitoring has become one of the key concerns in the construction field. Computer-vision-based monitoring methods such as 2D image, video and 3D laser scan methods have difficulty meeting real-time and data-objectivity requirements. This paper points out the applicability of high resolution remote sensing imagery (HRRSI) to this task and proposes a way of enhancing the Histogram of Oriented Gradients (HOG) feature for HRRSI. HOG is applied as the key feature for machine learning, which is carried out by boosting. To overcome the complexity of HOGs in HRRSIs, the paper builds a feature framework according to the top, side and height characteristics of engineering ground objects. Based on this feature framework, the paper presents an HRRSI-based framework for construction progress monitoring. Through an application case, the performance of the framework is presented; judging from the results, the method is robust under different seasonal and illumination conditions.
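A minimal sketch of the HOG-plus-boosting step described above, assuming labelled grayscale patches of ground objects are already available; the patch size, HOG parameters and classifier settings are assumptions, not the paper's.

```python
# Sketch: HOG descriptors fed to a boosted classifier (hypothetical labels/patches).
import numpy as np
from skimage.feature import hog
from sklearn.ensemble import GradientBoostingClassifier

def hog_descriptor(patch):
    # Histogram of Oriented Gradients on a grayscale patch (e.g. 64x64 pixels).
    return hog(patch, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), feature_vector=True)

def train_progress_classifier(patches, labels):
    # patches: list of 2-D arrays; labels: integer class per patch
    X = np.stack([hog_descriptor(p) for p in patches])
    clf = GradientBoostingClassifier(n_estimators=200, learning_rate=0.1)
    clf.fit(X, labels)
    return clf
```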
06 history of cv computer vision - towards a simple brute-force utility (zukun)
The document provides a history of computer vision from early developments in the 1980s to recent advances. It discusses early challenges like limited hardware and software. While remarkable progress has been made in areas like 3D reconstruction and object detection, challenges remain like developing high-level semantic understanding and overcoming segmentation difficulties. The document also examines research motives and whether computer vision problems are truly engineering problems or require more fundamental advances.
Generating 3D model in virtual reality and analyzing its performance (ijcsit)
This paper presents a virtual environment of a real model. All the analyses involved in making and visualizing the virtual environment in Quest3D are given, and the real-time performance of the system is presented. The advantages and disadvantages of interactions in the virtual environment are described, along with a critical analysis of rendering speed and quality on different machines.
3D modeling is used across many industries like video games, film, and education. It involves creating 3D objects and environments using software like 3ds Max. There are three main ways to represent 3D models: polygonal modeling using vertices and faces, curve modeling using control points, and digital sculpting. Popular 3D modeling software includes 3ds Max, SketchUp, LightWave, and Blender, which offer tools for modeling, animation, rendering, and more. While 3D modeling is useful, large or complex projects can experience lag and have huge file sizes and render times.
Iirdem screen less displays – the imminent vanguard (Iaetsd Iaetsd)
This document discusses screenless display technology, which involves displaying information without using a physical screen. It describes three main types of screenless displays: visual images like holograms, retinal displays that project images directly onto the retina, and synaptic interfaces that transmit visual information directly to the brain. The document outlines the working principles and potential applications of screenless displays, such as in virtual reality systems. It also discusses the structure and implementation of retinal displays specifically.
This document discusses using deep learning techniques for semantic segmentation of indoor point clouds. It provides an overview of initial ideas for using deep learning models trained on 3D CAD models to classify and label points in an indoor point cloud. It also discusses preprocessing steps like denoising and simplifying the point cloud as well as potential techniques for primitive-based reconstruction from the segmented point cloud.
An Enhanced Method to Detect Copy Move Forgery in Digital Images processing u... (IRJET Journal)
This document proposes an enhanced method to detect copy-move forgery in digital images using a 2D discrete wavelet transform (DWT) approach. It begins with an abstract that introduces copy-move forgery as a technique where parts of an image are copied and pasted within the same image. The document then reviews related work on copy-move forgery detection and existing methods. It proposes a method that uses 2D-DWT to extract features from blocks of the image for matching copied and pasted regions, aiming to improve upon conventional approaches. Experimental results on a database of images indicate the method can effectively detect copy-move forgeries under different transformations.
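The sketch below illustrates the general block-matching idea on the low-frequency DWT band; it is not the paper's exact method, and the block size, stride and thresholds are assumptions.

```python
# Sketch: copy-move detection via block matching on the DWT LL sub-band
# (assumed parameters: 8x8 blocks, stride 2, illustrative thresholds).
import numpy as np
import pywt

def detect_copy_move(img, block=8, sim_thresh=1.0, min_offset=16):
    LL, _ = pywt.dwt2(img.astype(float), 'haar')       # work on the LL sub-band
    h, w = LL.shape
    feats, coords = [], []
    for y in range(0, h - block + 1, 2):                # stride 2 to limit cost
        for x in range(0, w - block + 1, 2):
            feats.append(LL[y:y + block, x:x + block].ravel())
            coords.append((y, x))
    feats = np.array(feats)
    order = np.lexsort(feats.T[::-1])                   # lexicographic sort of blocks
    matches = []
    for i in range(len(order) - 1):
        a, b = order[i], order[i + 1]
        if np.linalg.norm(feats[a] - feats[b]) < sim_thresh:
            dy = abs(coords[a][0] - coords[b][0])
            dx = abs(coords[a][1] - coords[b][1])
            if dy + dx >= min_offset:                    # ignore trivially adjacent blocks
                matches.append((coords[a], coords[b]))
    return matches
```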
Motaz Ahmad El-Saban is a senior applied researcher at Microsoft Research Advanced Technology Labs in Cairo (ATL Cairo). He has over 18 years of experience in machine learning and data science, especially in computer vision, image processing, and speech processing. He has worked on projects involving web ranking, social network analysis, object recognition, video processing, and deep learning. He has authored several patents and publications in these areas.
This document discusses portable retinal imaging and medical diagnostics using deep learning. It focuses on hardware-centric deep learning and end-to-end deep learning pipelines for diagnosis, including optimizing imaging. The document provides an overview of deep learning concepts, case examples in ophthalmology and optometry, and discusses training models versus deploying them for inference. It also covers computing options like GPUs, CPUs, cloud and edge devices, and frameworks for medical image analysis using deep learning.
Augustin Marty, CEO @deepomatic, discussed computer vision's progress thanks to deep learning at the 2016 Hello Tomorrow Summit. He put forward a solution to tackle the challenges in computer vision, making AI accessible to every company. Learn more at www.deepomatic.com
This document discusses Motaz El Saban's research experience and interests which focus on analyzing, modeling, learning from, and predicting digital media content such as text, images, and speech. Some key areas of research include real-time video stitching, annotating mobile videos, object and activity recognition from videos, and facial expression recognition using deep learning techniques. The document also outlines El Saban's educational background and provides an agenda for his upcoming presentation.
IRJET- A Survey on Reversible Watermarking Techniques for Image Security (IRJET Journal)
This document summarizes and compares various reversible watermarking techniques for image security. It discusses how reversible watermarking allows a watermark to be fully extracted from an image while also restoring the original cover image. The document reviews early techniques that used modulo addition and later techniques that improved imperceptibility by compressing image bits. It categorizes reversible techniques into data compression, difference expansion, and histogram shifting methods. Comparisons of techniques are presented to improve understanding of advances in reversible watermarking for image security.
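As a concrete illustration of the histogram-shifting category mentioned above, the sketch below embeds a short bit string into an 8-bit grayscale image and restores the original exactly. It is a simplified sketch under stated assumptions: a real scheme would handle the peak/zero bookkeeping, overflow cases and payload length more carefully.

```python
# Sketch: histogram-shifting reversible watermarking (assumes peak < zero,
# 8-bit grayscale input, and that the (peak, zero) pair is shared as side info).
import numpy as np

def embed(img, bits):
    hist = np.bincount(img.ravel(), minlength=256)
    peak, zero = int(hist.argmax()), int(hist.argmin())
    if peak >= zero:
        raise ValueError("sketch assumes peak < zero for simplicity")
    out = img.copy().astype(np.int32)
    out[(out > peak) & (out < zero)] += 1          # shift the (peak, zero) range right by 1
    it, flat = iter(bits), out.ravel()
    for i in range(flat.size):
        if flat[i] == peak:
            try:
                flat[i] += next(it)                # embed one bit at each peak pixel
            except StopIteration:
                break
    return flat.reshape(img.shape).astype(np.uint8), (peak, zero)

def extract(marked, key):
    peak, zero = key
    m = marked.astype(np.int32)
    bits = [1 if v == peak + 1 else 0 for v in m.ravel() if v in (peak, peak + 1)]
    restored = m.copy()
    restored[(restored > peak) & (restored <= zero)] -= 1   # undo the shift
    return np.array(bits), restored.astype(np.uint8)
```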
In the last decade, new classes of devices for accessing information have emerged along with increased connectivity. In parallel to the proliferation of these devices, new interaction styles have been explored. One major problem is the restrictive human machine interface like gloves, magnetic trackers, and head mounted displays. Such interfaces with excessive connections are hampering the interactivity. This presentation introduces a method to develop a robust computer vision module to capture and recognise the human gestures using a regular video camera and use the full potential of these gestures in a user friendly graphical environment.
Augmented Reality and Virtual Reality: Research Advances in Creative Industry (Zi Siang See)
This research seminar presents and discusses recent advances in augmented reality (AR) and virtual reality (VR). Recent developments imply that AR and VR innovation is becoming more broadly accessible to different expert areas, including the information technology sector, education, the built environment and the creative industries. During the session, participants will be able to experience AR and VR using mobile and head-mounted devices (HMD). This research talk provides an overview of AR and VR interface development, industrial use cases and research directions.
A very quick walkthrough on Computer Vision (Matias Iacono)
Computer vision is a field that teaches computers to understand images and videos in the same way humans do. It aims to automate visual tasks like object recognition, facial recognition, and hand tracking. Some examples include Google's Mediapipe which can track up to two hands with 20 joint points each in 3D space. The presentation provided a quick overview of computer vision and demonstrated a hand tracking demo.
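A short sketch of what such a hand-tracking demo typically looks like with MediaPipe's Python API; the webcam index, confidence threshold and exit key are assumptions.

```python
# Sketch: webcam hand-landmark tracking with MediaPipe Hands.
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(max_num_hands=2, min_detection_confidence=0.5)
draw = mp.solutions.drawing_utils

cap = cv2.VideoCapture(0)                        # default webcam (assumed index)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:             # one entry per detected hand
        for lm in results.multi_hand_landmarks:
            draw.draw_landmarks(frame, lm, mp.solutions.hands.HAND_CONNECTIONS)
    cv2.imshow("hands", frame)
    if cv2.waitKey(1) & 0xFF == 27:              # Esc to quit
        break
cap.release()
cv2.destroyAllWindows()
```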
This document summarizes an article from the International Journal of Electronics and Communication Engineering & Technology. The article discusses content-based image retrieval (CBIR) techniques and face recognition methods for face-based image retrieval. CBIR uses features like color, shape and texture to automatically match images without manual annotation. Face-based retrieval combines CBIR and face detection/recognition to find images of a person based on a query image. The document reviews different CBIR and face recognition approaches, discussing their advantages and limitations. It aims to provide background for future work in face-based image retrieval.
This document provides an overview of how virtual reality technology is revolutionizing visual communication and photographic education. It describes how darkrooms have been replaced by digital photography labs equipped with virtual reality workstations. Students can now digitally manipulate photos and even take "photos" within simulated virtual reality environments. While this technology offers new educational opportunities, it also raises concerns about overuse and addiction if not properly monitored. The document outlines various safety mechanisms that were implemented to address these concerns.
Edge AI: Deep Learning techniques for Computer Vision applied to Embedded Sys... (Giacomo Bartoli)
This document discusses edge AI and applying deep learning techniques for computer vision to embedded systems. It covers convolutional neural networks (CNNs) as the base model for vision tasks, comparing architectures like SSD and YOLO for object detection. It also discusses running models for tasks like object detection on embedded devices like the Google AIY Vision kit and in applications like detecting Pikachu images and in the Audi Autonomous Driving Cup competition. Models are trained and tested on datasets like COCO, Pascal VOC and Core50 to detect objects in real-time on edge devices.
IRJET - 3D Virtual Dressing Room Application (IRJET Journal)
This document describes a 3D virtual dressing room application that was developed using the Microsoft Kinect sensor. The application aims to address issues with traditional dressing rooms like wasting time trying on clothes, limited variety, and privacy concerns. The proposed approach uses Kinect to extract the user from the video stream, align 3D cloth models to the user's body, and apply skin color detection to handle occlusions. The body joints are used for positioning, scaling, and rotating the cloth models. The models are then overlaid on the user in real-time. The document discusses related work on virtual dressing rooms and 3D alignment of clothes to user models. It also outlines the methodology, including using Kinect's image and depth streams to develop
RECOGNITION OF RECAPTURED IMAGES USING PHYSICAL BASED FEATURES (csandit)
With the development of multimedia technology and digital devices, it has become very simple and easy to recapture high-quality images from LCD screens. In authentication, the use of such recaptured images can be very dangerous, so it is important to recognize recaptured images in order to increase authenticity. Image recapture detection (IRD) aims to distinguish real-scene images from recaptured ones. An image recapture detection method based on a set of physical-based features is proposed in this paper, which uses a combination of low-level features including texture, HSV colour and blurriness. Twenty-six dimensions of features are extracted to train a support vector machine classifier with a linear kernel. The experimental results show that the proposed method is efficient, with a good recognition rate in distinguishing real-scene images from recaptured ones. The proposed method also uses lower-dimensional features compared to state-of-the-art recapture detection methods.
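The sketch below illustrates the general shape of such a pipeline: a texture histogram, an HSV colour histogram and a blurriness score are concatenated and used to train a linear SVM. Standard LBP stands in here for the paper's CS-LBP operator, and all parameter choices are assumptions rather than the authors' settings.

```python
# Sketch: low-level feature vector (texture + HSV colour + blurriness) and linear SVM.
import cv2
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import LinearSVC

def features(bgr):
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    lbp = local_binary_pattern(gray, P=8, R=1, method="uniform")
    tex, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)   # texture
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    col = cv2.calcHist([hsv], [0, 1], None, [8, 8], [0, 180, 0, 256]).flatten()
    col /= col.sum() + 1e-9                                            # colour
    blur = cv2.Laplacian(gray, cv2.CV_64F).var()                       # blurriness
    return np.concatenate([tex, col, [blur]])

def train(real_images, recaptured_images):
    X = np.stack([features(im) for im in real_images + recaptured_images])
    y = np.array([0] * len(real_images) + [1] * len(recaptured_images))
    return LinearSVC(C=1.0).fit(X, y)
```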
An Approach for Copy-Move Attack Detection and Transformation Recovery (IRJET Journal)
This document presents an approach for detecting copy-move forgery and recovering transformations in digital images. It uses the SIFT algorithm to extract features from images and detect similar features that indicate a copy-move forgery. The RANSAC algorithm is then used to detect any geometric transformations, such as rotation or scaling, that were applied to the copied region. The proposed methodology extracts SIFT features, performs keypoint matching to detect potential forgeries, clusters matched keypoints, and then uses RANSAC to estimate any transformations between the original and copied areas. This process aims to both detect copy-move forgeries and recover images to their original state before transformations were applied.
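A condensed sketch of the SIFT-plus-RANSAC idea with OpenCV; the ratio-test threshold, minimum match count and the use of a partial affine model are assumptions, not necessarily the paper's choices.

```python
# Sketch: copy-move detection by matching SIFT keypoints within the same image,
# then estimating the geometric transform between cloned regions with RANSAC.
import cv2
import numpy as np

def detect_copy_move(gray):
    sift = cv2.SIFT_create()
    kps, desc = sift.detectAndCompute(gray, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    # k=3: each keypoint's best match is itself, so inspect the 2nd/3rd neighbours.
    matches = matcher.knnMatch(desc, desc, k=3)
    src, dst = [], []
    for m in matches:
        if len(m) >= 3 and m[1].distance < 0.6 * m[2].distance:
            src.append(kps[m[1].queryIdx].pt)
            dst.append(kps[m[1].trainIdx].pt)
    if len(src) < 4:
        return None                                   # not enough evidence of cloning
    src, dst = np.float32(src), np.float32(dst)
    # RANSAC estimates the rotation/scale/translation between the cloned regions.
    M, inliers = cv2.estimateAffinePartial2D(src, dst, method=cv2.RANSAC,
                                             ransacReprojThreshold=3.0)
    return M, (int(inliers.sum()) if inliers is not None else 0)
```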
THE EFFECT OF PHYSICAL BASED FEATURES FOR RECOGNITION OF RECAPTURED IMAGES (ijcsit)
It is very simple and easy to recapture high-quality images from LCD screens with the development of multimedia technology and digital devices. In authentication, the use of such recaptured images can be very dangerous, so it is important to recognize recaptured images in order to increase authenticity. Even though a number of features have been proposed for various state-of-the-art visual recognition tasks, it is still difficult to decide which feature, or combination of features, has the most significant impact on this task. In this paper, an image recapture detection method based on a set of physical-based features, including texture, HSV colour and blurriness, is proposed. The paper also evaluates the performance of different distinctive features in the context of recognizing recaptured images. Several experimental setups were conducted to demonstrate the performance of the proposed method, and in all of them the proposed method is efficient with a good recognition rate. Among the combination of low-level features, the CS-LBP operator used to extract the texture feature is the most robust feature.
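Since CS-LBP is singled out as the most robust feature, the sketch below shows a minimal implementation of the operator on a 3x3 neighbourhood; the threshold value and the 16-bin histogram are assumptions.

```python
# Sketch: center-symmetric LBP (CS-LBP) code map and 16-bin texture histogram.
import numpy as np

def cs_lbp(gray, T=0.01):
    g = gray.astype(float)
    # Four center-symmetric pairs in the 8-neighbourhood of each interior pixel.
    pairs = [((-1, -1), (1, 1)), ((-1, 0), (1, 0)), ((-1, 1), (1, -1)), ((0, -1), (0, 1))]
    h, w = g.shape
    code = np.zeros((h - 2, w - 2), dtype=np.uint8)
    for bit, ((dy1, dx1), (dy2, dx2)) in enumerate(pairs):
        a = g[1 + dy1:h - 1 + dy1, 1 + dx1:w - 1 + dx1]
        b = g[1 + dy2:h - 1 + dy2, 1 + dx2:w - 1 + dx2]
        code |= ((a - b) > T).astype(np.uint8) << bit
    hist, _ = np.histogram(code, bins=16, range=(0, 16), density=True)
    return hist                                    # 16-bin CS-LBP texture descriptor
```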
Seminar report with proper formatting, including front page, certificate and acknowledgement pages. This is the full report for the seminar topic Augmented Reality.
Markerless motion capture for 3D human model animation using depth camera (TELKOMNIKA JOURNAL)
3D animation is created using keyframe-based systems in 3D animation software such as Blender and Maya. Due to the long time required and the need for high expertise in 3D animation, motion capture devices are used as an alternative, and the Microsoft Kinect v2 sensor is one of them. This research analyses the capabilities of the Kinect sensor in producing 3D human model animations using motion capture and a keyframe-based animation system in reference to a live motion performance. The quality, time interval and cost of both animation results were compared. The experimental results show that the motion capture system with the Kinect sensor consumed less time (only 2.6%) and cost (30%) in the long run (10 minutes of animation) compared to the keyframe-based system, but it produced lower-quality animation. This was due to the lack of body-detection accuracy when there is obstruction. Moreover, the sensor's constant assumption that the performer's body faces forward made it unreliable for a wide variety of movements. Furthermore, the standard test defined in this research covers most body parts' movements and can be used to evaluate other motion capture systems.
Rise of augmented reality: current and future applications (U Reshmi)
This seminar presentation discusses the rise of augmented reality, including its current and future applications. It provides an introduction to augmented reality and discusses its implementation through components like head-mounted displays, tracking systems, and mobile computing power. Examples of current augmented reality applications are given in the medical, entertainment, military, engineering, robotics, education and other fields. Challenges like accurate tracking and limited computing power are also outlined. The conclusion discusses the future scope of augmented reality becoming indistinguishable from the real world as technology advances.
Face Mask Detection System Using Artificial Intelligence (IRJET Journal)
This document describes a face mask detection system using artificial intelligence. The system is designed as a two-phase model, with the first phase involving training a convolutional neural network (CNN) model on a dataset of images containing faces with and without masks. In the second phase, the trained model can detect masks in real-time videos and classify faces as with or without a mask. The goal is to implement the system in public places to help enforce mask policies and reduce COVID-19 transmission. The model achieves accurate detection on both static images and videos by using data augmentation techniques to increase variability in the training dataset.
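A compact sketch of what such a two-phase model can look like in Keras: augmented training data in phase one, and a frozen backbone with a small binary head. The directory layout, input size, backbone (MobileNetV2) and hyper-parameters are assumptions, not the authors' configuration.

```python
# Sketch: phase 1 of a mask/no-mask classifier with data augmentation
# (hypothetical "dataset/" folder with one sub-folder per class).
import tensorflow as tf
from tensorflow.keras import layers, models

aug = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1. / 255, rotation_range=20, zoom_range=0.15,
    horizontal_flip=True, validation_split=0.2)
train = aug.flow_from_directory("dataset/", target_size=(224, 224),
                                class_mode="binary", subset="training")
val = aug.flow_from_directory("dataset/", target_size=(224, 224),
                              class_mode="binary", subset="validation")

base = tf.keras.applications.MobileNetV2(include_top=False, weights="imagenet",
                                          input_shape=(224, 224, 3))
base.trainable = False                        # phase 1: train only the new head
model = models.Sequential([base, layers.GlobalAveragePooling2D(),
                           layers.Dense(64, activation="relu"), layers.Dropout(0.3),
                           layers.Dense(1, activation="sigmoid")])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(train, validation_data=val, epochs=10)
```

Phase two would run this trained model on faces detected in each video frame and overlay the with-mask/without-mask label.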
iMinds insights - 3D Visualization Technologies (iMindsinsights)
Transforming the way we deal with information - from consumption to interaction.
iMinds insights is a quarterly publication providing you with relevant tech updates based on interviews with academic and industry experts. iMinds is a digital research center and incubator based in Belgium.
A novel visual tracking scheme for unstructured indoor environments (IJECEIAES)
In the ever-expanding sphere of assistive robotics, the pressing need for advanced methods capable of accurately tracking individuals within unstructured indoor settings has been magnified. This research endeavours to devise a realtime visual tracking mechanism that encapsulates high performance attributes while maintaining minimal computational requirements. Inspired by the neural processes of the human brain’s visual information handling, our innovative algorithm employs a pattern image, serving as an ephemeral memory, which facilitates the identification of motion within images. This tracking paradigm was subjected to rigorous testing on a Nao humanoid robot, demonstrating noteworthy outcomes in controlled laboratory conditions. The algorithm exhibited a remarkably low false detection rate, less than 4%, and target losses were recorded in merely 12% of instances, thus attesting to its successful operation. Moreover, the algorithm’s capacity to accurately estimate the direct distance to the target further substantiated its high efficacy. These compelling findings serve as a substantial contribution to assistive robotics. The proficient visual tracking methodology proposed herein holds the potential to markedly amplify the competencies of robots operating in dynamic, unstructured indoor settings, and set the foundation for a higher degree of complex interactive tasks.
This document provides an overview of computer imaging, which can be separated into digital image processing and computer vision. Digital image processing involves examining image data to solve problems and typically outputs images for human consumption, covering topics like image restoration, enhancement, and compression. Computer vision is intended to analyze images for computer use, outputting attributes rather than images, and covers topics like segmentation, recognition, and 3D reconstruction. The document outlines several applications of computer imaging in fields like medicine, security, and robotics, and discusses the current state of the art in areas like object recognition, medical imaging, and vision-based human-computer interaction.
A Review on Various Forgery Detection Techniques (ijtsrd)
This document provides a review of various techniques for detecting digital image forgeries. It begins by discussing the importance of authenticating digital images and the rise of image manipulation. The main types of forgeries are described as copy-move, image splicing, and image retouching. Techniques for forgery detection are classified as active/intrusive or passive/non-intrusive. Key passive techniques discussed include pixel-based analysis of statistical anomalies, format-based analysis exploiting properties of file formats like JPEG, and camera-based analysis of camera-related features. The document then focuses on copy-move forgery detection, outlining the general workflow and reviewing block-based and key-point based matching approaches. Specific algorithms
This document summarizes a research paper that introduces a novel approach to bypass modern face authentication systems using virtual reality. Specifically, the researchers show that by leveraging a handful of publicly available photos of a target user, they can create a realistic 3D facial model that undermines the security of widely used face authentication solutions. Their framework uses VR systems to display the synthetic facial model, along with animations like smiling or eyebrow raising, in order to trick liveness detectors. They conduct experiments demonstrating that their approach is able to undermine the security of several commercial face authentication systems.
Broadcasting Forensics Using Machine Learning Approaches (ijtsrd)
Broadcasting forensics is the practice of using scientific methods and techniques to analyse and authenticate multimedia content. Over the past decade, consumer-grade imaging sensors have become increasingly prevalent, generating vast quantities of images and videos that are used for various public and private communication purposes, including publicity, advocacy, disinformation and deception. This paper aims to develop tools that can extract knowledge from these visuals and comprehend their provenance. However, many images and videos undergo modification and manipulation before public release, which can misrepresent the facts and deceive viewers. To address this issue, we propose a set of forensic and counter-forensic techniques that can help establish the authenticity and integrity of multimedia content. Additionally, we suggest ways to modify the content intentionally to mislead potential adversaries. Our proposed tools are evaluated using publicly available datasets and independently organized challenges. Our results show that the forensic and counter-forensic techniques can accurately identify manipulated content and can help restore the original image or video. Furthermore, this paper demonstrates that the modified content can successfully deceive potential adversaries while remaining undetected by state-of-the-art forensic methods. Amit Kapoor | Prof. Vinod Mahor, "Broadcasting Forensics Using Machine Learning Approaches", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-7, Issue-3, June 2023. URL: https://www.ijtsrd.com.com/papers/ijtsrd57545.pdf Paper URL: https://www.ijtsrd.com.com/engineering/computer-engineering/57545/broadcasting-forensics-using-machine-learning-approaches/amit-kapoor
Wireless network implementation is a viable option for building network infrastructure in rural communities, where people lack network infrastructure for information services and socio-economic development. The aim of this study was to develop a wireless network infrastructure architecture to deliver network services to rural dwellers. A user-centred approach was applied, and a wireless network infrastructure was designed and deployed to cover five rural locations. Data were collected and analysed to assess the performance of the network facilities. The results show that the system performed adequately without any downtime, with an average of 200 users per month, and the quality of service has remained high. The transmit/receive rate of 300 Mbps was three times the normal Ethernet transmit/receive specification, with an average throughput of 1 Mbps. The multiple-input/multiple-output (MIMO) point-to-multipoint network design increased the network throughput and the quality of service experienced by the users.
3D reconstruction is a technique used in computer vision with a wide range of applications in areas like object recognition, city modelling, virtual reality, physical simulation, video games and special effects. Previously, performing a 3D reconstruction required specialized hardware; such systems were often very expensive and only available for industrial or research purposes. With the rise in availability of high-quality, low-cost 3D sensors, it is now possible to design inexpensive, complete 3D scanning systems. The objective of this work was to design an acquisition and processing system that can perform 3D scanning and reconstruction of objects seamlessly. In addition, the goal included making the 3D scanning process fully automated by building and integrating a turntable alongside the software, meaning the user can perform a full 3D scan with only the press of a few buttons in a dedicated graphical user interface. Three main steps were followed to go from point cloud acquisition to the finished reconstructed 3D model. First, the system acquires point cloud data of a person or object using an inexpensive camera sensor. Second, the acquired point cloud data are aligned and converted into a watertight mesh of good quality (a sketch of this step is given after the key points below). Third, the reconstructed model is exported to a 3D printer to obtain a proper 3D print of the model.
The document describes the development of a low-cost 3D scanning system using an integrated turntable. Key points:
1) The system uses an inexpensive Kinect sensor and open-source Point Cloud Library to acquire 3D point cloud data of an object placed on an automated turntable.
2) The turntable is designed to be low-cost, using a modified twist board powered by a DC motor controlled via an Arduino microcontroller.
3) The software synchronizes point cloud acquisition with turntable rotation to automatically capture data from multiple angles and register them into a single aligned point cloud for surface reconstruction.
COMPLETE END-TO-END LOW COST SOLUTION TO A 3D SCANNING SYSTEM WITH INTEGRATED...ijcsit
3D reconstruction is a technique used in computer vision which has a wide range of applications in
areas like object recognition, city modelling, virtual reality, physical simulations, video games and
special effects. Previously, to perform a 3D reconstruction, specialized hardwares were required.
Such systems were often very expensive and was only available for industrial or research purpose.
With the rise of the availability of high-quality low cost 3D sensors, it is now possible to design
inexpensive complete 3D scanning systems. The objective of this work was to design an acquisition and
processing system that can perform 3D scanning and reconstruction of objects seamlessly. In addition,
the goal of this work also included making the 3D scanning process fully automated by building and
integrating a turntable alongside the software. This means the user can perform a full 3D scan only by
a press of a few buttons from our dedicated graphical user interface. Three main steps were followed
to go from acquisition of point clouds to the finished reconstructed 3D model. First, our system
acquires point cloud data of a person/object using inexpensive camera sensor. Second, align and
convert the acquired point cloud data into a watertight mesh of good quality. Third, export the
reconstructed model to a 3D printer to obtain a proper 3D print of the model.
International Journal of Computer Graphics & Animation (IJCGA) Vol.10, No.1, January 2020
DOI: 10.5121/ijcga.2020.10101
HUMAN PHOTOGRAMMETRY: FOUNDATIONAL
TECHNIQUES FOR CREATIVE PRACTITIONERS
Trendt Boe and Chris Carter
Film, Screen & Animation, Queensland University of Technology, Brisbane, Australia
ABSTRACT
Photogrammetry has emerged as a leading approach for photorealistic digital replication and 3D scanning
of real-world objects, particularly in areas of cinematic visual effects and interactive entertainment. While
the technique generally relies on simple photography methods, the foundational practices for the field of
human photogrammetry remain relatively undocumented. Human subjects are significantly more complex
than still life, both in terms of photogrammetric capture, and in digital reproduction. Without the
documentation of foundational practices for human subjects, there is a significant knowledge barrier for
new creative practitioners to operate in the field, stifling innovation and adoption of the
technique. Researchers and commercial practitioners currently working in this field continually distribute
learnings and research outcomes. These learnings tend to centralise more on advanced practices such as
capturing micro-geometry (skin pores), reflectance and skin distortion. However, the standard principles
for building capture systems, considerations for human subjects, processing considerations and technology
requirements remain elusive. The purpose of this research is to establish foundational practices for human
photogrammetry systems. These practices encapsulate the underlying architectures of capture systems,
through to necessary data processing for the 3D reconstruction of human subjects. Design-led research was
used to construct a scale 21-camera system, designed for high-quality data capture of the human head. Due
to its incredible level of surface complexity, the face was used to experiment with a variety of capture
techniques and system arrangements, using several human subjects. The methods used were a result of the
analysis of existing practitioners and research, refined through numerous iterations of system design. A
distinct set of findings were synthesised to form a foundational architecture and blueprint for a scale,
human photogrammetry multi-camera system. It covers the necessary knowledge and principles required to
construct a production-ready photogrammetry system capable of consistent, high-quality capture that
meets the needs of visual effects and interactive entertainment production.
KEYWORDS
Photogrammetry, 3D scan, body scan, photoscan, digital double, digital actor, face scan, photorealistic
human, photogrammetry pipeline, digital replication, multi-camera, foundations
1. INTRODUCTION AND BACKGROUND
Photogrammetry has recently emerged as a leading approach for the digital replication of real-
world objects in photorealistic quality. Through the use of consumer-grade photographic
equipment, creative practitioners can generate accurate Three Dimensional (3D) models for use in
any digital application; from 3D printing to cinematic visual effects sequences. This approach has
also risen as a method for capturing photorealistic human ‘digital-doubles’, generally through the
use of purpose-built multi-camera systems.
Despite some specialist practitioners operating within the domain of human photogrammetry,
very little information is readily available regarding the fundamental practices and techniques for
designing or building these systems. Without access to this information, practitioners looking to
enter the field face significant knowledge barriers and uncertainty. The purpose of this project is
to establish a foundational knowledge set for creating human photogrammetry systems, as well as
processing techniques, to make the field more accessible to new creative practitioners.
This project was conducted using a design-led approach, to iteratively construct a scale, 21-
camera human photogrammetry system that was informed by analysis of existing practitioners
and rigorous testing of various designs. This resulting system, in combination with findings
throughout the project, was used as a model to devise a foundational architecture and blueprint
intended for future practitioners to adopt as a starting point when entering the field.
The creation of photorealistic, computer-generated characters for cinema and interactive
entertainment has historically been a challenge. As technology provides us with the capability to
render characters with an increasing amount of human likeness, creations become more
susceptible to the ‘Uncanny Valley’ phenomenon. That is, when we realise something we initially
thought to be a real human form is artificial, we experience a feeling of discomfort and see the
subject with a negative sense of affinity [1]. Flueckiger [2] highlighted how detrimental this
discomfort could be to the success of a work, in her discussion about the shortcomings of
‘unlifelike’ characters in The Polar Express [3].
Recent advancement in technology and techniques for the creation of digital human characters, or
‘digital doubles’, has led to some works crossing the uncanny valley with increasing frequency.
One such example of this was Image Metrics’ Digital Emily [Figure 1] project in 2008 [4], which
was followed closely by ICT’s Digital Ira [5], and real-time versions of this work through
collaboration with Activision, Inc.
Figure 1. Still frame from the Digital Emily [6] project
Fundamental to each of these achievements was a modern process known as photogrammetry,
which involves photographing a subject from many angles, to generate a 3D reconstruction by
computationally solving correlations between each image [7]. For human subjects, this is
typically done through the use of multi-camera arrays [8], triggered simultaneously or in specific
sequences. The image below [Figure 2] features a contemporary multi-camera array used for
human photogrammetry. This system relies on 170 Digital SLR cameras to capture the human
form in its entirety in 1/10,000th of a second, with the resulting reconstruction resolving under
2mm [9].
Figure 2. Ten24’s [10]T170 Capture Stage
As several practitioners have advanced in this field, the process has become heavily utilised in
cinematic visual effects production processes [11] and more recently, interactive entertainment.
Captain America: Civil War [12] saw the technology used to create digital body doubles of actors
[13], while The Curious Case of Benjamin Button [14] utilised the technology more creatively for
photo-realistic ageing effects [15]. Interactive entertainment is progressively adopting the
technology for character creation pipelines in games [16] such as Infamous Second Son [17], and
emerging practices such as real-time cinematography and performance capture, as seen in Ninja
Theory’s collaborative work presented at SIGGRAPH 2016 [18].
Due to its accuracy and efficiency of capture, the use of photogrammetric assets is becoming more
prevalent across several fields in addition to cinematic visual effects and interactive
entertainment. Such applications include forensic documentation [19], anthropometry [20] and
biomedical applications [21]. Findings in this paper can apply to wider domains of practice;
however, this research remains focussed on determining foundational practices for human
photogrammetry for the needs of cinematic visual effects and interactive entertainment
productions. As such, this paper focussed on the design of a system that can capture detailed
geometry and accurate colour information for use in these production environments. As this
research focuses on establishing foundational practices, it is expected that outcomes may not
reach the same fidelity as the leading practitioners. However, the knowledge acquired and the
distilled set of practices should provide enough information for future practitioners to establish a
system, and continue to move forward with the technology with significantly lower budgets.
2. METHODOLOGY AND METHODS
The overarching intent of the research was to reduce the gaps in readily-available knowledge of
building human photogrammetry systems through the analysis of existing practices and the
development of a functioning capture system. This project adhered to a Design Research
methodology relying heavily on iterative design processes to achieve this outcome. Adopting
Simonsen’s [22] definition of Design Research, this project focused on changing the available
knowledge of human photogrammetry system design. Referring to Simonsen’s model, the design
process is described as fostering change from a particular situation, “A”, to another, “B” [22]. For
this project, “A”, refers to the current situation where available knowledge for human
photogrammetry system design is scarce; and the alternative, “B”, is the ideal situation where this
knowledge is more clearly defined and available for creative practitioners.
Project activities were partitioned into distinct, yet interdependent areas of focus, and conducted
using an iterative process. These partitions align with three objectives: design and development of
a capture system, examination of techniques for high-quality capture, and establishing a data
processing workflow. The crucial final goal, defining a foundational architecture, was the
culmination of all other objectives. Each of the three partitions followed Simonsen’s iterative
design process.
Starting with a critical problem as the current situation, ‘A’, an idea, vision or strategy to solve
each issue would be established. A plan to conduct activities would then be derived, and an
iterative process of work would be undertaken to reach a workable solution ‘B’. The resulting
solution would be deployed as intended, with observations of outcomes documented and any
relevant data stored for later analysis. The findings were then evaluated against the intent of the
design to determine the next appropriate steps for the project.As an example of this approach, one
of the first key activities was to design, plan and develop the first iteration of the scale system.
Shown below in Figure 3, the early stage of this cycle involved analysis of existing practitioners,
their methods and any relevant knowledge available [‘A1’]. A low-fidelity design was derived,
and a plan made to begin iteration. Iterations of development activities occurred until a
functioning capture system was created and put into use [‘B1’].
Figure 3. An adapted iterative design process used in this project.
The system behaviour, performance and characteristics were continually observed as it was in
use. As the captured data was processed, the quality of results was compared to that of model
practitioners to evaluate the next steps moving forward. In the first instance, this meant
conducting additional analysis and research to find ways to improve scan quality. In Figure 3, a
different circle, ‘A2’, demonstrates the pattern of continual iteration for the project.
Throughout the project, three significant versions of system design and building occurred. Within
each version, continual iteration was conducted to identify the best techniques for high-quality
subject capture and reconstruction. The capabilities of the system progressed from partial, low-
resolution facial capture, to medium resolution full head capture. These iterations continually
worked towards building a system that could achieve results comparable to leading practitioners,
with consideration given for differences in available hardware and resources. Each version’s
capture capability was tested against a variety of human subjects, with performance qualitatively
assessed against the resulting 3D reconstructions. It was important that any resulting solution was
also robust and efficient enough for rapid data acquisition and reconstructing subjects of varying
dimensions to meet the needs of cinematic visual effects and interactive entertainment production.
Table 1 and Figure 4 below demonstrate the technical specifications and physical configuration of
each significant version, referred to as ‘Build 1’, ‘Build 2’ and ‘Build 3’ according to the
chronological order of their development. The cameras in Figure 4 are coloured blue to emphasise
their physical configuration.
Table 1. Key technical specifications of the three main system variations.
Feature               | Build 1              | Build 2                   | Build 3
No. of Cameras        | 12-16                | 16                        | 21
Camera Model          | Canon 400D DSLR      | Canon 400D DSLR           | Canon 400D DSLR
Lens                  | 18-55mm, varied      | 18-55mm, varied           | 18-55mm, all 55mm
Lighting              | Continuous LED       | 2x Continuous Fluorescent | 3x 300W Strobe
Shutter Control       | Software via USB     | Software via USB          | Custom Hardware
Camera Positioning    | Even distribution    | Even distribution         | Paired, with reference cameras
Subject Stabilisation | None                 | None                      | Rod behind subject head
Subject Seating       | Fixed chair          | Fixed chair               | Adjustable height chair
Linked Computer       | 2x Laptop Computers  | 1x Desktop Computer       | 1x Desktop Computer
Figure 4. Diagrams and photographs of Build 1 (top left), Build 2 (top right) and Build 3 (bottom).
All three system builds adhered to the same general process for capture throughout the project. A
subject would perform a static facial pose inside a multi-camera array. Multiple images were then
captured and parsed through publicly accessible software for processing and 3D reconstruction.
Eight subjects participated in this process, with some performing up to 20 different expressions
for capture. Toward the end of the project, one sample progressed from acquisition to integration
in a real-time rendering engine for performance capture in under 6 hours.
3. SYSTEM DESIGN
This section presents findings from investigating vital elements of the physical system design that
influence both the quality of image capture and overlap of information captured from each
camera. These two aspects were prioritised to aid the feature-based matching process [7] typically
used by automated photogrammetry solutions, as this process requires clearly identifiable
features in multiple images of the same scene for 3D correlation. From a technical standpoint, this
meant acquiring sharp and evenly lit imagery, while positioning cameras in a way that balances
overlapping fields of view and subject coverage. Additionally, blurry images are hugely
detrimental to reconstruction processes [23], so a considerable depth of field and fast subject
exposure was pursued to minimise the risk of blurred images.
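Where the depth of field for a particular camera and lens configuration needs to be quantified, the standard hyperfocal-distance approximation gives a quick estimate of the in-focus range. The sketch below is illustrative only; the focal length, aperture and circle-of-confusion values are assumptions chosen to resemble the hardware described later in this paper, not measurements taken from the project.

    import math

    def depth_of_field(focal_mm, f_number, subject_mm, coc_mm=0.019):
        # Estimate near/far limits of acceptable focus (standard approximation).
        # coc_mm is the circle of confusion; 0.019 mm is a commonly quoted value
        # for APS-C sensors such as the Canon 400D (assumption, not from the paper).
        hyperfocal = focal_mm ** 2 / (f_number * coc_mm) + focal_mm
        near = hyperfocal * subject_mm / (hyperfocal + (subject_mm - focal_mm))
        far = hyperfocal * subject_mm / (hyperfocal - (subject_mm - focal_mm))
        return near, far

    # Example: 55 mm lens at f/8 with the subject roughly 1 m from the camera.
    near, far = depth_of_field(focal_mm=55, f_number=8, subject_mm=1000)
    print(f"In-focus range: {near:.0f} mm to {far:.0f} mm ({far - near:.0f} mm deep)")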
3.1. Lighting
Subject illumination proved to be critically important to achieving consistently high-quality data.
This project experimented with a varied use of continuous and strobe lighting to understand the
impacts on image and reconstruction quality. As discussed in Section 6, as a result of these
experiments, the most reliable and highest quality architecture relied on the use of strobe lighting,
triggered manually during a 2-second camera exposure.
3.2. Continuous Lighting
Builds 1 and 2 relied on variations of consumer-grade continuous lighting to achieve even lighting
across the subject. The first build used LED strip lighting mounted to the system frame. The
second build used two 2x600W 60Hz fluorescent battens positioned around the capture volume in a
butterfly pattern. Diagrams of these lighting configurations can be seen below in Figure 5.
Figure 5. Diagrams of LED strip lighting solution (top) and batten solution (bottom).
The essential advantage of continuous lighting was that it allowed asynchronous shutter
activation across the sensors, enabling the system to function without the need for any additional
hardware. The main drawback of these solutions was their limited light output. In turn, this
constrained depth of field and generally required higher ISO settings to increase image
brightness, which increased the noise in the images, as shown below in Figure 6.
Figure 6. Comparison of coloured noise present in high ISO imagery
3.3. Strobe Lighting
Build 3 used strobe lighting, which depended on a synchronised trigger. The setup used 3x300W
strobe lights at full power, evenly arranged around the subject and positioned for bounce-
diffusion off clean white walls. The project investigated two capture techniques: strobe activation
synchronised to a single sensor, and manual strobe activation during a long exposure. A diagram
of this setup is shown below in Figure 7.
Figure 7. Depictions of strobe lighting setup.
The first iteration synchronised the strobe flash to a single camera in the array. Triggering the
camera shutters and flash in this fashion was inadequate, as some images frequently appeared
underexposed. Delays occurring in each camera, known as ‘shutter lag’, caused the
underexposure. The Canon 400Ds used in this project were prone to lag times ranging from 66ms
to 116ms [24], which was a large enough variance for shutters to cycle out of time with the strobe
flash. An alternative capture technique was used to overcome this issue.
A wireless remote was used to trigger the system manually instead of synchronising the strobes,
and a 2-second exposure was used in a darkened environment to eliminate the impact of shutter
lag between the cameras. Similar to the approach used by Beeler et al. [25], roughly 1 second into
the exposure, the strobe units were triggered, briefly illuminating the subject for capture on each
camera. A graphical representation of these capture sequences is shown below in Figure 8.
Figure 8. A timeline of trigger execution using a 2-second exposure.
Using a long exposure ensured there was enough overlap of ‘open shutter’ time between all
cameras, eliminating the possibility of missing the strobe flash. A darkened environment was
necessary for capture to avoid unwanted visual artefacts during the long exposure.
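The viability of this long-exposure approach can be sanity-checked numerically: as long as the strobe fires after the slowest camera has opened its shutter and before the fastest camera has closed it, every sensor records the flash. A minimal sketch of that check, using the shutter-lag range cited above and an assumed strobe trigger time, follows.

    # Sanity check for the long-exposure strobe technique (illustrative values).
    shutter_lag_ms = (66, 116)      # lag range reported for the Canon 400D [24]
    exposure_ms = 2000              # 2-second exposure used in Build 3
    strobe_fire_ms = 1000           # strobe fired roughly 1 s into the exposure

    # Each camera is open from (its lag) until (its lag + exposure length).
    window_opens = max(shutter_lag_ms)                  # last camera to open
    window_closes = min(shutter_lag_ms) + exposure_ms   # first camera to close

    if window_opens <= strobe_fire_ms <= window_closes:
        print(f"Strobe at {strobe_fire_ms} ms is seen by all cameras "
              f"(common window: {window_opens}-{window_closes} ms).")
    else:
        print("Strobe fires outside the common open-shutter window.")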
3.4. Data Connectivity and Trigger Mechanism
Underlying the design of each build was a motivation to minimise the hands-on effort required to
operate the system, as this would reduce operational complexity. The two aspects focused on for
these builds were: establishing computer connectivity to the system for data acquisition and
enabling tight control of the camera shutters. The most effective method was to use a custom
hardware shutter release for synchronised shutter release and rely on camera controlling software
for automating the data acquisition step.
3.5. Computer Connectivity
During the development of Build 1, the controlling laptop computers had difficulty supporting
such a high number of connected USB peripherals. Build 3 overcame these connectivity issues by
using a desktop computer and three powered USB hubs connected to a 21 camera array.
3.6. Camera Shutter Control
This project explored two methods for activating the camera shutters. Builds 1 and 2 relied on
activation via the camera controlling software and USB connection. Build 3 used a custom
hardware trigger, activated with a physical push-button for synchronised shutter release. Each
method possessed unique characteristics which influenced overall system architecture.
It was found that using the camera controlling software to activate the camera shutters, while
convenient for system operation, had disadvantages for system synchronisation. When triggering
the shutter release, each camera shutter activated sequentially. Depending on the chosen shutter
speed and number of connected cameras, this could lead to extended capture times – creating a
risk that the subject may move or change pose during the capture process. A sequential shutter
activation also creates difficulty for using strobe flashes due to the lack of synchronisation.
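The practical cost of sequential triggering is easy to estimate. The per-camera overhead used below is a hypothetical figure for illustration; the paper does not report the actual USB trigger latency.

    # Rough estimate of total capture time under sequential USB triggering
    # (illustrative numbers only; per-camera overhead is an assumption).
    num_cameras = 21
    shutter_s = 1 / 200          # nominal exposure time per frame
    usb_overhead_s = 0.5         # assumed software/USB delay per camera

    sequential_s = num_cameras * (shutter_s + usb_overhead_s)
    synchronised_s = shutter_s   # a hardware trigger fires all shutters together

    print(f"Sequential capture window: ~{sequential_s:.1f} s")
    print(f"Synchronised capture window: ~{synchronised_s:.3f} s")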
The shutters of each camera could be activated almost simultaneously by using a physical push-
button and custom hardware (Figure 9). Labels 1 and 2 feature the in-progress construction of the
device. Labels 3 and 4 feature the device mounted into a sealed container with connections for
synchronising cameras. Label 5 features the equipment attached to a stand for easy access within
the system. Technical suggestions from Breeze Systems [26] and an existing product, Esper
Design’s Shuttercell [27] influenced the concept and design of this system.
Figure 9. The custom hardware for synchronised triggering.
Although this created additional steps for system operation, it enabled the use of strobe flashes for
lighting and mitigated the risk of subject movement as synchronisation could reach speeds of
1/200th of a second. A comparison of these two techniques is graphically represented below in
Figure 10.
Figure 10. Graphical representation of a sequential (top) shutter control via USB vs. Synchronised shutter
control (bottom) via custom hardware.
3.7. Camera Positioning
Several variations were used throughout each build, with continual iteration to improve overall
reconstruction fidelity and subject coverage. Build 3 used a side-by-side paired camera arrangement
to surround the subject. Some individual cameras also provided additional imagery from isolated
locations. A balance was found between the camera distances to subject, desirable depth of field,
and subject framing. Builds 1 and 2 used an even spacing method for the distribution of cameras
around the subject. There were noticeable inconsistencies in the processing quality and camera
alignment effectiveness across different test cases. Experimentation with camera positions revealed
that insufficient image overlap was causing the inconsistent performance.
Build 3 tested a variety of positioning options for reliability and reconstruction quality. Initially,
the same method of evenly distributed positioning was used to validate the impact of lighting
changes. Positioning patterns were then modified iteratively to improve outcomes. The Build 3
pattern relied on the use of close ‘pairs’ of cameras, at intervals of 45 degrees around the subject.
Additional individual cameras were positioned to provide imagery of occluded areas, or to
increase correlation with the paired cameras. The figures below illustrate example layouts of
camera positioning in each build (Figure 11), as well as side by side results of paired and
unpaired methodologies (Figure 12).
Figure 11. Example camera positions for Build 1 (top left), Build 2 (top right) and Build 3 (bottom).
Figure 12. Comparison of reconstruction quality using unpaired (left) and paired (right) camera positions.
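The paired arrangement can also be described programmatically: pairs placed at regular azimuth intervals around the subject, with the two cameras in each pair offset by a few degrees. The sketch below generates such a layout and estimates the horizontal field of view from the lens and sensor; the pair offset and sensor width are assumptions for illustration, not values specified in this paper.

    import math

    def paired_layout(pair_interval_deg=45, pair_offset_deg=6):
        # Return azimuth angles (degrees) for paired cameras around a subject.
        # Pairs sit every pair_interval_deg; the two cameras in a pair are
        # separated by pair_offset_deg (assumed value for illustration).
        angles = []
        for centre in range(0, 360, pair_interval_deg):
            angles.append((centre - pair_offset_deg / 2) % 360)
            angles.append((centre + pair_offset_deg / 2) % 360)
        return sorted(angles)

    def horizontal_fov_deg(focal_mm=55, sensor_width_mm=22.2):
        # Horizontal field of view; 22.2 mm approximates an APS-C sensor width.
        return math.degrees(2 * math.atan(sensor_width_mm / (2 * focal_mm)))

    print("Camera azimuths:", paired_layout())
    print(f"Approximate horizontal FOV at 55 mm: {horizontal_fov_deg():.1f} degrees")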
3.8. Subject Framing
For the duration of this project, each build aimed to ensure a fully framed subject was present
within each image. An example of this framing is seen below in Figure 13. Agisoft (2013) advises
that it is not necessary to place the entire object within every frame;
however, using a solution with few cameras meant it was essential to capture as much subject
surface information as possible in every image. In this scenario, failing to encompass the entire
subject manifested as subject stabilisation issues.
Figure 13. A fully framed subject within every image.
3.9. Subject Stabilisation
Having the subject entirely in-frame for each image captured was central to the design of each
build. In practice, this led to several situations where the subject would inadvertently move out
of the frame as a result of postural adjustments. To overcome this, Build 3 used a rigid plastic rod
to indicate the desired position for the subject. The subject would rest the rear of their head
against the rod, and this would stabilise their positioning for each capture. Figure 14 shows
a comparison of the resulting imagery.
Figure 14. Image taken without (left) stabilisation, and with a stabilisation rod (right).
3.10. Varying Subject Dimensions
Natural variation in body height required consideration in designing the system. In Builds 1 and
2, a rigid, non-configurable seat was used for the subject to sit in the capture volume. The lack of
adjustment was found to be problematic, as subjects who were taller or shorter would frequently
find their face and head out of frame. Use of a height-adjustable seat and the stabilising rod as an
indicator enabled the subject to freely adjust their vertical seat positioning to become framed
appropriately within the capture region - as illustrated below in Figure 15.
Figure 15. Diagram showing the use of an adjustable chair to cater to different body sizes. By adjusting
chair height, the subject can remain within the intended capture region.
Full-body systems demonstrate similar considerations, such as Ten24’s T170 Capture Stage
which features an adjustable floor with a 3-foot range of motion [9]. Providing a mechanism to
move the subject into an ideal framing minimises the repositioning and recalibration of the
cameras across different capture sessions.
3.11. Occluding Hair
Attempting to capture and reconstruct subjects’ hair proved problematic; typically, visual
information within the images was noisy and featureless, which was detrimental to the
reconstruction process (Figure 16). Furthermore, any successful reconstruction would suffer from
a loss in facial geometry due to occlusion caused by the hair.
Figure 16. Subject reconstruction from capture with visible loose hair.
Wearing a cap to occlude the hair from capture maintains the fidelity of the head reconstruction
and retains the general shape of a subject’s head (Figure 17).
Figure 17. Cleaner reconstructions achieved by occluding hair from the capture process.
Early iterations of this practice utilised a skin-tone material for the hair occlusion – however, this
provided few reference points for the reconstructive processes. Further analysis of Graham’s
captured imagery (2014) showed the use of a patterned material, likely for this reason. Also seen
in Figure 17 is the use of a patterned material as a reflection of these findings, which typically
provided higher quality reconstruction results.
4. DATA HANDLING FOR RECONSTRUCTION
This project pursued methods that aid in creating an optimised workflow, increasing
reconstruction efficiency, and maintaining reconstruction accuracy across subjects. Section 4.1
focuses on the consumer-ready software packages used during this project and presents a
quantitative comparison of capabilities, as well as a qualitative side-by-side comparison of
reconstruction results. Section 4.2 explains the process for scale, alignment and colour
calibration, and finally, Section 4.3 briefly outlines an efficient method used for automated data
management.
4.1. Software Capabilities and Configuration
This project relied on existing software solutions for processing the image data and computing the
resulting meshes. Two of these solutions, Reality Capture and Agisoft Photoscan (“Photoscan”),
were compared for quantitative and qualitative analysis. Both were found to provide high-quality
meshes, although Reality Capture generally appeared to more consistently solve camera
correlation and extract more detail from the data at a higher speed. Table 2 contains some side by
side results of reconstruction and alignment performance between software packages.
Table 2. Mesh processing performance comparison for Agisoft Photoscan and Reality Capture.
Cameras | Agisoft Photoscan: Matched | Agisoft Photoscan: Time* | Reality Capture: Matched | Reality Capture: Time
21      | 18/21                      | 00:05:47                 | 19/21                    | 00:05:16
21      | 21/21                      | 00:10:39                 | 21/21                    | 00:06:14
21      | 21/21                      | 00:08:41                 | 21/21                    | 00:07:34
21      | 21/21                      | 00:08:13                 | 21/21                    | 00:07:24
The most noticeable difference between Reality Capture and Photoscan was the way interpolation
was handled for floating points in the captured data. Generally, Reality Capture would attempt to
ensure any 3D reconstruction was ‘watertight’, i.e. there are no holes in the resulting mesh.
Photoscan, on the other hand, would leave holes visible. Figure 18 shows a comparison of
qualitative reconstructions.
Figure 18. Qualitative comparison of results from side-by-side testing for Agisoft Photoscan and Reality
Capture, rendered using different materials to highlight surface details.
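For practitioners who want to check the watertight behaviour described above on their own exports, open-source tools can report whether a reconstructed mesh is closed. The snippet below assumes the third-party trimesh library and a hypothetical file name; it is not part of the workflow used in this project.

    # Check whether an exported reconstruction is watertight (assumes trimesh).
    import trimesh

    mesh = trimesh.load("head_scan.obj", force="mesh")   # hypothetical export
    print("Watertight:", mesh.is_watertight)

    if not mesh.is_watertight:
        # Attempt a simple hole fill before passing the mesh downstream.
        trimesh.repair.fill_holes(mesh)
        print("After repair, watertight:", mesh.is_watertight)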
4.2. Image Processing
Once captured, the images required some manual processing to ensure scale and colour
correctness. Both software solutions had methods of saving some of these manual steps for reuse,
which aided in processing efficiency and mesh consistency for later use.
4.2.1. Scale and Alignment
For the resulting mesh to be scaled and oriented correctly, points of a known distance and
orientation needed to be defined in several images before processing. Reality Capture and Photoscan
provided tools for this activity. A measuring tape was mounted to the previously mentioned
stabilisation rod to give this visual reference for manual processing, as shown below in Figure 19.
Figure 19. Manual alignment in Reality Capture using known distances for computation. Two points are
defined, and a known range (10mm) is used to provide scale information.
Once this information is defined, the camera positions and reconstruction volume can then be
solved with the correct orientation and scale relative to their real-world counterparts. The metadata for the
camera positions, directions and other settings are then able to be exported for reuse with other
scans using the same capture system. This step decreases the processing time and solves any
previously misaligned cameras with greater accuracy.
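Both packages perform this scaling internally once control points are defined, but the underlying arithmetic is simply a uniform scale derived from the known distance. A minimal sketch, with hypothetical point coordinates standing in for the two marks on the measuring tape:

    import numpy as np

    # Two control points marked on the measuring tape, in reconstruction units
    # (hypothetical coordinates for illustration).
    point_a = np.array([0.112, 0.034, 0.250])
    point_b = np.array([0.119, 0.034, 0.251])
    known_distance_mm = 10.0                      # real-world distance between them

    measured = np.linalg.norm(point_b - point_a)  # distance in reconstruction units
    scale = known_distance_mm / measured          # units -> millimetres

    # Apply the uniform scale to every vertex of the reconstructed mesh.
    vertices = np.random.rand(100, 3)             # placeholder vertex array
    vertices_mm = vertices * scale
    print(f"Scale factor: {scale:.3f} mm per reconstruction unit")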
4.2.2. Colour Calibration
Images were captured using Canon RAW (.CR2) format with manual white balance settings
applied. Calibration of colours took place in Adobe Lightroom post-acquisition. At the beginning
of each session, images were taken of the subject holding a colour chart. Using this card as a
reference enabled the colour information to be corrected and the settings applied to all other
session data. The before and after imagery is shown below in Figure 20.
Figure 20. Before (left) calibration and after (right) using a colour chart (centre) as reference.
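In this project the correction itself was performed in Adobe Lightroom, but the principle generalises: per-channel gains are computed from a neutral patch on the chart and then applied to every image in the session. The sketch below is a simplified correction on an in-memory image array; the target value and patch coordinates are assumptions for illustration.

    import numpy as np

    def colour_gains_from_patch(image, patch_slice, target=200.0):
        # Compute per-channel gains so a neutral chart patch averages `target`.
        # image: H x W x 3 float array; patch_slice: (row_slice, col_slice)
        # covering a neutral grey patch (coordinates are assumptions).
        patch = image[patch_slice]
        return target / patch.reshape(-1, 3).mean(axis=0)

    def apply_gains(image, gains, white_level=255.0):
        # Apply per-channel gains and clip back into the valid range.
        return np.clip(image * gains, 0, white_level)

    # Usage: derive gains from the reference frame, then reuse them session-wide.
    reference = np.random.rand(600, 800, 3) * 255          # placeholder image
    gains = colour_gains_from_patch(reference, (slice(100, 140), slice(200, 240)))
    corrected = apply_gains(reference, gains)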
4.3. Data Workflow
Capturing images from 21 cameras for each session resulted in large volumes of data being stored
and processed. Some key steps were taken to optimise the data workflow and maintain production
efficiency. Using this workflow made it significantly easier to reuse the metadata mentioned in
Section 4.2 and globally apply any calibrations as needed.
The reuse of metadata in Reality Capture relied on naming conventions to match cameras with
resulting images. A similar constraint emerged when using some specific functions within Agisoft
Photoscan. Mismatching filenames were found to be detrimental to 3D reconstruction due to
misalignment of image data and camera positioning. Using the camera controlling software to
maintain a stream of data from each camera during capture enabled this problem to be
overcome. Images were stored directly on a computer’s hard drive for later use, using an
automated file hierarchy and naming convention, which is shown below in Figure 21.
Figure 21. The automated data storage hierarchy, determined by file path logic set in the camera controlling
software.
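The exact hierarchy used in the project is shown in Figure 21; the sketch below illustrates the general idea of generating session folders and camera-matched filenames up front, so that image names always align with the camera metadata. The folder layout and naming scheme here are hypothetical, not a copy of the project's convention.

    from pathlib import Path

    def build_session_hierarchy(root, session, subject, expressions, num_cameras=21):
        # Create a predictable folder layout and per-camera filenames.
        # Layout and names are illustrative; the point is that each image path
        # encodes the camera index so reconstruction software can match metadata.
        paths = []
        for expression in expressions:
            folder = Path(root) / session / subject / expression
            folder.mkdir(parents=True, exist_ok=True)
            for cam in range(1, num_cameras + 1):
                paths.append(folder / f"cam_{cam:02d}.CR2")
        return paths

    # Example: one subject performing three expressions in a single session.
    targets = build_session_hierarchy("captures", "2016-10-20", "subject_03",
                                      ["neutral", "smile", "frown"])
    print(f"{len(targets)} image paths prepared, e.g. {targets[0]}")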
5. DISCUSSION & OUTCOMES
5.1. Recommendations based on Findings
This project examined several key aspects that influence the overall efficiency and resulting
reconstruction quality of a scale photogrammetry system across three distinctively different
configurations. Based on these specific examinations, the highest quality, and most robust system
architecture and process were found in Build 3. On this basis, the recommended architecture for a
scale, foundational human photogrammetry system consists of:
- A multi-camera array with shutters synchronised using custom or dedicated hardware;
- Powerful strobe flash lighting for even, instantaneous illumination of the subject;
- A 2-second long shutter exposure in a dark environment to eliminate synchronisation issues;
- Cameras positioned in pairs around the subject, keeping the subject in the frame;
- A device for stabilising subject positioning, with a height-adjustable seat or platform;
- The use of a colour card or reference for colour correction;
- A measuring device, or scale bar, for calibration; and
- An automated data synchronisation process between the cameras and controlling computer.
A diagram of this system architecture and process is depicted below in Figure 22. This
architecture takes advantage of the methods found to produce the highest quality
results throughout this project.
Figure 22. Recommended system architecture (top) and layout (bottom). A timeline for the capture process
is depicted with the architecture, linking particular events into the system architecture to display the
procedural flow.
These recommendations are the culmination of numerous iterations of configurations, capture
techniques, and hardware arrangements. Below, Figure 23 illustrates the growth in quality of 3D
reconstructions gathered from the initial system version, through to the final configuration of
Build 3 during the project.
Figure 23. A collection of 3D reconstructions gathered throughout the project, illustrating the growth in
quality and system capability over time.
5.2. Limitations
The hardware used in this project was typically low-end or dated. For example, the 21 Canon
400D DSLRs were a set of decommissioned hire-cameras that were nine years old at the time of
this writing. It would be a reasonable expectation that newer hardware would possess higher
image capturing capabilities. As such, the drawbacks of using continuous lighting presented in
this work may not be as severe for practitioners using newer, or higher-end camera solutions.
Additionally, although this work recommends the use of strobe lighting for high-quality capture,
it should be noted that more advanced approaches to system architecture and image capture are
practised using continuous lighting. For example, practitioners at ICT Graphics Lab utilise
custom hardware and high-intensity LED lighting to capture their subjects in phenomenal detail.
However, the knowledge and resource requirements of such a system placed the
approach out of scope for this project.
The camera positioning used in Build 3 meant some typically occluded areas of subjects remained
challenging to capture, such as behind the ears or inside the nostrils. Occlusion can be reduced
by increasing subject coverage with additional cameras. Unfortunately, further cameras
were not available for this research, but the missing sections were considered to be low-impact
quality losses and could be corrected in modelling software after the reconstruction process
completes.
USB connectivity issues caused by a limitation on the maximum number of supported devices on
the computer’s USB bus were a significant challenge that would prove more challenging to
overcome as the system scales to a full-body system. Godse [28] states the upper limit should be
127, minus any number of device hubs used, which is consistent with the approach of Ten24 [9]
and Infinite Realities [29], who mention the use of two controlling computers in their
architecture due to the high volume of devices; this is significant for any practice that scales
beyond the findings in this research.
6. CONCLUSION
6.1. Summary of Outcomes
This project aimed to address the gap in available knowledge regarding the foundational practices
and design considerations for the construction of human photogrammetry systems. A design-led
approach was adopted to analyse the existing solutions and techniques of leading practitioners and
to design a scale, 21-camera human photogrammetry system. Several iterative design cycles were
used to build the final photogrammetry system. Throughout these iterations, varied capture
techniques were examined and tested in the pursuit of reaching a compromise between image
quality and resource availability.
An architecture for a scale, human photogrammetry system is presented alongside a simple
processing workflow for practitioners. The given architecture includes specific details on system
connectivity, capture procedures and arrangement patterns for equipment – including diagrams of
positional pairing layouts for other practitioners. The architecture and recommendations made are
grounded in the analytical findings from existing practice and results from the activities in this
project. By condensing these findings into a succinct architecture and design pattern, future
practitioners now have a foundational blueprint for building a basic human photogrammetry
system – lowering the knowledge requirements to get started in the field.
6.2. Future Research Opportunities
As an extension of the work conducted throughout this project, two critical opportunities for
future research have emerged. Firstly, as this project was conducted using a scale system, the next
step is to extend these findings and apply them to a full-body system. As an ongoing research
project, this has already presented new challenges as the system and capture complexity increases.
Creative Work C involved an artistic process for preparing the 3D reconstructions for animation
and posing, which was beyond the scope of this project. Developing a clearer understanding of
the methods and techniques of taking the reconstructed data from this system to a usable state
may be grounds for further research - for instance, explicitly examining the process of preparing
resulting assets for performance capture and representation as a digital double.
REFERENCES
[1] M. Mori, K. F. MacDorman, and N. Kageki, "The Uncanny Valley [From the Field]," IEEE Robotics
& Automation Magazine, vol. 19, no. 2, pp. 98-100, 2012.
[2] B. Flueckiger, "Digital Bodies," in Visual Effects. Filmbilder aus dem Computer., 2008, pp. 1-52.
[3] R. Zemeckis, "The Polar Express," ed, 2004.
[4] O. Alexander et al. (2009). The Digital Emily Project: Achieving a Photoreal Digital Actor.
Available: http://gl.ict.usc.edu/Research/DigitalEmily/
[5] J. v. d. Pahlen, J. Jimenez, E. Danvoye, P. Debevec, G. Fyffe, and O. Alexander, "Digital ira and
beyond: creating real-time photoreal digital actors," presented at the ACM SIGGRAPH 2014 Courses,
Vancouver, Canada, 2014. Available:
http://dl.acm.org.ezp01.library.qut.edu.au/citation.cfm?id=2615407
[6] I. Metrics, "The Digital Emily Project: Achieving a Photoreal Digital Actor," ed, 2011.
[7] G. Forlani, R. Roncella, and C. Nardinocchi, "Where is photogrammetry heading to? State of the art
and trends," Rendiconti Lincei, vol. 26, no. 1, pp. 85-96, 2015.
[8] P. R. Graham, "A Framework for High-Resolution, High-Fidelity, Inexpensive Facial Scanning,"
3644623 Ph.D., University of Southern California, Ann Arbor, 2014.
[9] Ten24. (20/10/2016). 3d-scanning. Available: http://ten24.info/3d-scanning/
[10] Ten24, "T170 Capture Stage," ed, 2016.
[11] S. Zwerman and C. Finance, The Visual Effects Producer. Routledge, 2015.
[12] A. Russo and J. Russo, "Captain America: Civil War," ed: Walt Disney Studios Motion Pictures,
2016.
[13] V. Frei. (2016). Captain America - Civil War: Russell Earl - VFX Supervisor - Industrial Light &
Magic. Available: http://www.artofvfx.com/captain-america-civil-war-russell-earl-vfx-supervisor-lm/
[14] D. Fincher, "The Curious Case of Benjamin Button," ed: Warner Bros. Pictures, 2008.
[15] B. Flueckiger, "Computer-Generated Characters in Avatar and Benjamin Button," Available:
https://www.zauberklang.ch/AvatarButtonFlueckiger.pdf
[16] M. Seymour. (2015, 21/8/2016). Facing Second Son. Available:
https://www.fxguide.com/featured/facing-second-son/
[17] (2014). Infamous Second Son.
[18] E. Games. (2016, 15/05/2016). Unreal Engine 4 Powers Real-Time Cinematography at SIGGRAPH.
Available: https://www.unrealengine.com/blog/unreal-engine-4-powers-real-time-cinematography-at-
siggraph
[19] Leipner, R. Baumeister, M. J. Thali, M. Braun, E. Dobler, and L. C. Ebert, "Multi-camera system for
3D forensic documentation," Forensic Sci Int, vol. 261, pp. 123-8, Apr 2016.
[20] K. Han, H. J. Kwon, T. H. Choi, J. H. Kim, and D. Son, "Comparison of anthropometry with
photogrammetry based on a standardized clinical photographic technique using a cephalostat and
chair," J Craniomaxillofac Surg, vol. 38, no. 2, pp. 96-107, Mar 2010.
[21] M. Pesce, L. M. Galantucci, and F. Lavecchia, "A 12-camera body scanning system based on close-
range photogrammetry for precise applications," Virtual and Physical Prototyping, vol. 11, no. 1, pp.
49-56, 2015.
[22] J. O. Bærenholdt, M. Büscher, J. D. Scheuer, and J. Simonsen, Design Research. Routledge, 2010, p.
237.
[23] T. Sieberth, R. Wackrow, and J. H. Chandler, "Motion blur disturbs – the influence of motion-blurred
images in photogrammetry," The Photogrammetric Record, vol. 29, no. 148, pp. 434-453, 2014.
[24] Luk. (3/10/2016). Shutter lag measurements - Canon EOS 400D / Digital Rebel XTi. Available:
http://www.doc-diy.net/photo/shutter_lag/
[25] T. Beeler, B. Bickel, P. Beardsley, B. Sumner, and M. Gross, "High-Quality Single-Shot Capture of
Facial Geometry," in ACM SIGGRAPH, Los Angeles, USA, 2010, vol. 29, no. 3, pp. 1-9.
[26] B. Systems. (2016, 3/10/2016). Firing multiple cameras at the same time. Available:
https://www.breezesys.com/MultiCamera/release.htm
[27] E. Design. (2014, 20/10/2016). ESPER Bright Ideas in Black Boxes. Available:
http://www.esperdesign.co.uk/
[28] A. P. Godse and D. A. Godse, Advance Microprocessors. Technical Publications, 2009.
[29] Infinite-Realities. (2016, 20/10/2016). What We do. Available: http://ir-ltd.net/#what-we-do