Neural Radiance Fields (NeRF) represent a scene as a continuous radiance field that can be used for novel view synthesis. NeRF learns this field from a sparse set of input views using a multi-layer perceptron that maps 5D coordinates (3D position plus 2D viewing direction) to RGB color and volume density. It uses volume rendering to integrate these values along camera rays and optimizes the network through differentiable rendering with a reconstruction loss. NeRF produces high-fidelity novel views and has inspired extensions such as handling dynamic scenes and reconstructing scenes from unstructured internet photos.
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis - taeseon ryu
This paper presents a 3D-aware model. With StyleGAN, editing a single feature meant finding the latent vector corresponding to the input and modifying that latent vector, which could change, for example, the feature corresponding to the mouth. Building directly on this concept, the GANSpace paper tried to edit spatial information as well for a given input. In the results, rotation appears to be learned reasonably well, but the output is sometimes perceived as a different person. This problem is described as a lack of disentanglement: only the desired feature should change, yet other features change too. This paper was written to make the model understand 3D better and more efficiently.
PR-302: NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis - Hyeongmin Lee
PR12 Season 4 has finally begun! The first paper I am presenting this season is "NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis". View synthesis is the task of taking images of a subject captured from a few viewpoints and synthesizing images of that subject as seen from positions and directions that were not given. To do this, the paper has a neural network memorize the subject's 3D information wholesale. This approach is becoming well known under the name Implicit Neural Representation, and attempts to apply it to 2D images are also increasing.
Video link: https://youtu.be/zkeh7Tt9tYQ
Paper link: https://arxiv.org/abs/2003.08934
Covers image restoration techniques such as denoising, deblurring and super-resolution for 3D images and models.
From classical computer vision techniques to contemporary deep-learning-based processing for both ordered and unordered point clouds, depth maps and meshes.
Depth estimation: do we need to throw old things away? - NAVER Engineering
Talk outline: CNNs for depth estimation based on the human visual system, and CNNs inspired by conventional methods
Case 1: Cross-channel stereo matching
Case 2: Depth from light field
Case 3: Multiview stereo
Conclusion
In this project, we propose methods for semantic segmentation using state-of-the-art deep learning models. Moreover, we want to restrict the segmentation to a specific object for a specific application. Instead of spending effort on unnecessary objects, we can focus on the relevant ones, making the system more specialized and efficient for its purpose. Furthermore, we leverage models that are suitable for face segmentation. The models used in this project are Mask-RCNN and DeepLabv3. The experimental results indicate that the illustrated approach is efficient and robust in the segmentation task compared with previous work in the field. These models reach mean Intersection over Union (mIoU) scores of 74.4 and 86.6. The visual results of the models are shown in the Appendix.
Super resolution in deep learning era - Jaejun Yoo
Abstract:
Image restoration (IR) is one of the fundamental problems in low-level vision and includes denoising, deblurring, super-resolution, and other image processing tasks. In today's talk, I will focus on the super-resolution task. There are two main streams in super-resolution research: traditional model-based optimization and discriminative (deep) learning methods. I will present the pros and cons of both and their recent developments in the research field. Finally, I will provide a mathematical view that explains both methods in a single holistic framework while achieving the best of both worlds, review the related work, and summarize the problems that are yet to be solved in the field.
Filmic Tonemapping for Real-time Rendering - Siggraph 2010 Color Course - hpduiker
Filmic Tonemapping for Real-time Rendering, a presentation from the Siggraph 2010 Course on Color, on a technique developed from film that became very applicable to games with the addition of support for HDR lighting and rendering in graphics cards.
Slides for a study session given by Ryosuke Sasaki at Arithmer Inc.
They summarize recent methods for object pose estimation in robotics using deep learning.
He entered the Ph.D. course at the Univ. of Tokyo in April 2020.
Arithmer Inc. is a mathematics company that began at the University of Tokyo Graduate School of Mathematical Sciences. We apply modern mathematics to bring new, advanced AI systems to solutions in a wide range of fields. At Arithmer we believe our job is to work out how to use AI well to make work more efficient and to produce results that are useful to people and society.
Single image dehazing based on efficient transmission estimation - AVVENIRE TECHNOLOGIES
We propose a novel haze imaging model for single image haze removal. The haze imaging model is formulated using the dark channel prior (DCP), scene radiance, intensity, atmospheric light and the transmission medium. The dark channel prior is based on the statistics of outdoor haze-free images. We find that, in most local regions that do not cover the sky, some pixels (called dark pixels) very often have very low intensity in at least one color (RGB) channel. In hazy images, the intensity of these dark pixels in that channel is mainly contributed by the air light. Therefore, these dark pixels can directly provide an accurate estimate of the haze transmission. Combining a haze imaging model and an interpolation method, we can recover a high-quality haze-free image and produce a good depth map.
Single Image Depth Estimation using frequency domain analysis and Deep learning - Ahan M R
Using machine learning and deep learning techniques, we train a ResNet CNN model to estimate depth using discrete Fourier domain analysis, and present results together with an explanation of the loss function and code snippets.
WEBINAR ON FUNDAMENTALS OF DIGITAL IMAGE PROCESSING DURING COVID LOCK DOWN by K.Vijay Anand, Associate Professor, Department of Electronics and Instrumentation Engineering, R.M.K Engineering College, Tamil Nadu, India
Neural Radiance Fields & Neural Rendering.pdf
1. Neural Radiance Fields & Neural Rendering
Mildenhall, Ben, et al. "NeRF: Representing scenes as neural radiance fields for view synthesis." Communications of the ACM 65.1 (2021): 99-106.
Navneet Paul
PlayerUnknown Productions
2. Rendering
● Rendering is the process of generating an image from a 2D or 3D model using a computer program. The resulting image is called a render.
● A rendering application takes inputs such as the model (2D/3D), texture, shading, lighting and viewpoints into account during the rendering process.
● In other words, each scene file contains multiple features that the rendering algorithm or application must understand and process to generate the final image.
3. Rendering Equation
● A rendering algorithm that generates an image from all of the given features is, in most cases, approximating a solution to the rendering equation.
● At a high level, the rendering equation computes radiance, i.e. the illumination (reflected, refracted and emitted light) that travels from a source, via an object, to an observer in a given space.
● NeRF relies on one particular formulation of this problem: volume rendering.
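For reference, a common textbook form of the rendering equation is sketched below. The slide does not show the formula, so the notation here is an assumption rather than the presenter's own.

```latex
L_o(\mathbf{x}, \omega_o) = L_e(\mathbf{x}, \omega_o)
  + \int_{\Omega} f_r(\mathbf{x}, \omega_i, \omega_o)\,
    L_i(\mathbf{x}, \omega_i)\, (\omega_i \cdot \mathbf{n})\, d\omega_i
```

Here L_o is the radiance leaving point x in direction ω_o, L_e is the emitted radiance, f_r is the surface reflectance function (BRDF), L_i is the incoming radiance from direction ω_i, and n is the surface normal.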
4. Volume Rendering
● Volume rendering (as per Wikipedia) is a set of techniques used to display a 2D projection of a 3D discretely sampled dataset.
● To render a 2D projection (output) of a 3D dataset, we first define the camera position in space relative to the volume, and then define the RGBα value (Red, Green, Blue, Alpha, where alpha is the opacity channel) of every voxel.
● The primary objective in volume rendering is to obtain a transfer function that defines the RGBα value for every possible voxel value in a given space.
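As a minimal sketch of how per-voxel RGBα values turn into a pixel, the snippet below blends the voxels hit by one ray from front to back. The function name `composite_front_to_back` and the sample layout are hypothetical and chosen only for illustration.

```python
def composite_front_to_back(samples):
    """Blend (r, g, b, alpha) samples ordered from nearest to farthest voxel."""
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light that still reaches the camera
    for r, g, b, alpha in samples:
        weight = transmittance * alpha  # contribution of this voxel
        color[0] += weight * r
        color[1] += weight * g
        color[2] += weight * b
        transmittance *= 1.0 - alpha    # light blocked by this voxel
    return tuple(color), 1.0 - transmittance


# A half-transparent red voxel in front of an opaque green voxel.
print(composite_front_to_back([(1.0, 0.0, 0.0, 0.5), (0.0, 1.0, 0.0, 1.0)]))
# → ((0.5, 0.5, 0.0), 1.0)
```

Voxels behind an opaque one contribute nothing, because the transmittance has already dropped to zero.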
5. View Synthesis
● Take photos of an object from multiple camera angles, then use those images to view the same object from camera angles and positions that were not photographed.
● For NeRF, we are trying to predict the missing third axis, depth (the first two being length and breadth).
● Core application of NeRF: to predict a function that determines depth at various points in the image plane with respect to the object itself.
6. Neural Radiance Fields
● Generate novel views of complex scenes by optimizing an underlying continuous volumetric scene function using a sparse set of input views.
● The input views can come from a Blender model or from a static set of images.
● The scene is represented as a continuous 5D function that outputs the radiance emitted in each direction (θ, Φ) at each point (x, y, z) in space, together with a density at each point that acts like a differential opacity controlling how much radiance is accumulated by a ray passing through (x, y, z).
7. ● A continuous scene can be described as a 5D vector-valued function whose input is a 3D location x = (x, y, z) and a 2D viewing direction (θ, Φ), and whose output is an emitted color c = (r, g, b) and a volume density 𝜎.
MLP network F𝚯: (x, y, z, d) → (RGB, 𝜎)
[Figure: MLP outputs passed through volume rendering to produce the final rendering]
8. Process overview
To render a neural radiance field from a particular viewpoint, the following steps are performed:
● March camera rays through the scene to generate a sampled set of 3D points (camera poses and viewing directions can be estimated with a Structure-from-Motion tool such as COLMAP*).
● Use those points and their corresponding 2D viewing directions as input to the neural network to produce an output set of colors (RGB) and densities (𝜎).
● Use classical volume rendering to accumulate those colors and densities into a 2D image.
* COLMAP: a general-purpose Structure-from-Motion (SfM) and Multi-View Stereo (MVS) pipeline with a graphical and command-line interface
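The first step, marching camera rays and sampling 3D points along them, can be sketched roughly as follows. This assumes a pinhole camera at the origin looking down the negative z axis with a known focal length in pixels. The names `pixel_ray`, `direction_to_angles` and `sample_along_ray` are hypothetical, and evenly spaced samples are used in place of the random sampling within bins that NeRF describes.

```python
import math


def pixel_ray(i, j, width, height, focal):
    """Unit direction of the ray through pixel column i, row j (assumed pinhole camera)."""
    x = (i + 0.5 - width / 2) / focal
    y = -(j + 0.5 - height / 2) / focal
    z = -1.0  # assumption: the camera looks down the negative z axis
    norm = math.sqrt(x * x + y * y + z * z)
    return (x / norm, y / norm, z / norm)


def direction_to_angles(direction):
    """Convert a unit direction into the 2D viewing direction (theta, phi)."""
    x, y, z = direction
    theta = math.acos(max(-1.0, min(1.0, z)))  # polar angle
    phi = math.atan2(y, x)                     # azimuth
    return theta, phi


def sample_along_ray(origin, direction, near, far, n_samples):
    """Return 3D points at the centers of n_samples equal depth bins in [near, far]."""
    ts = [near + (far - near) * (k + 0.5) / n_samples for k in range(n_samples)]
    points = [tuple(o + t * d for o, d in zip(origin, direction)) for t in ts]
    return points, ts


direction = pixel_ray(2, 2, width=5, height=5, focal=10.0)
points, depths = sample_along_ray((0.0, 0.0, 0.0), direction, near=2.0, far=6.0, n_samples=4)
```

Each sampled point, paired with the ray's viewing direction, is what gets fed to the network in the second step.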
9. Network Architecture
● NeRF is an implicit representation based on a multi-layer perceptron (MLP): a deep fully connected network maps a 5D vector (3D coordinates plus a 2D viewing direction) to an RGB color vector (c) and a volume density (𝜎) at that spatial location.
Figure legend: → : layers with ReLU activation; → : layer with no activation; ⇢ : layers with sigmoid activation; + : vector concatenation; 𝛾(x) : positional encoding; 𝛾(d) : directional encoding
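A heavily shrunk sketch of this layout is shown below: density is predicted from position alone, while color is predicted after concatenating the viewing direction. The class `TinyRadianceField`, its layer sizes and the random, untrained weights are all assumptions for illustration. The network described in the paper is much larger and also uses a skip connection, which is omitted here.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(layer, x, activation=None):
    """Apply a fully connected layer followed by an optional activation."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    if activation == "relu":
        return [max(0.0, v) for v in out]
    if activation == "sigmoid":
        return [1.0 / (1.0 + math.exp(-v)) for v in out]
    return out


class TinyRadianceField:
    """Toy stand-in for F: (encoded position, encoded direction) -> (RGB, sigma)."""

    def __init__(self, pos_dim, dir_dim, width=16, seed=0):
        rng = random.Random(seed)
        self.hidden1 = make_layer(pos_dim, width, rng)
        self.hidden2 = make_layer(width, width, rng)
        self.sigma_head = make_layer(width, 1, rng)
        self.feature = make_layer(width, width, rng)
        self.color_hidden = make_layer(width + dir_dim, width // 2, rng)
        self.rgb_head = make_layer(width // 2, 3, rng)

    def __call__(self, encoded_pos, encoded_dir):
        h = dense(self.hidden1, encoded_pos, "relu")
        h = dense(self.hidden2, h, "relu")
        sigma = dense(self.sigma_head, h, "relu")[0]  # non-negative, position only
        feat = dense(self.feature, h)                 # layer with no activation
        h = dense(self.color_hidden, list(feat) + list(encoded_dir), "relu")  # "+" = concatenation
        rgb = dense(self.rgb_head, h, "sigmoid")      # colors in [0, 1]
        return rgb, sigma


field = TinyRadianceField(pos_dim=3, dir_dim=3)
rgb, sigma = field([0.1, 0.2, 0.3], [0.0, 0.0, -1.0])
```

Keeping the direction out of the density branch means the geometry stays the same from every viewpoint, while the color is allowed to vary with the view.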
10. Volume Rendering
● The authors use discrete samples along each camera ray r(t) to estimate its expected color C(r), applying the quadrature rule from classical volume rendering.
[Equation annotations: predicted colors, volume density, opacity]
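The quadrature rule can be sketched as below, using the form given in the NeRF paper: each sample i gets weight T_i · (1 − exp(−σ_i · δ_i)), where δ_i is the distance to the next sample and T_i is the transmittance accumulated in front of it. The function name `render_ray` and the choice to close the last interval at `far` are assumptions made for this illustration.

```python
import math


def render_ray(colors, sigmas, ts, far):
    """Estimate a ray's color from per-sample colors and densities.

    colors: list of (r, g, b); sigmas: list of densities;
    ts: increasing sample depths; far: depth that closes the last interval.
    """
    rendered = [0.0, 0.0, 0.0]
    weights = []
    transmittance = 1.0  # T_i: probability the ray reaches sample i unoccluded
    for i, (color, sigma) in enumerate(zip(colors, sigmas)):
        next_t = ts[i + 1] if i + 1 < len(ts) else far
        delta = next_t - ts[i]                  # distance to the next sample
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha
        weights.append(weight)
        for k in range(3):
            rendered[k] += weight * color[k]
        transmittance *= 1.0 - alpha
    return rendered, weights


# Empty space (sigma = 0) in front of a very dense green sample.
color, weights = render_ray([(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)], [0.0, 1e9], [1.0, 2.0], far=3.0)
```

The returned weights are also what the hierarchical sampling step reuses to decide where the fine samples should go.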
11. NeRF Optimization - Positional Encoding
● Previous studies show that mapping the inputs to a higher-dimensional space using high-frequency functions before passing them to the network enables better fitting of data that contains high-frequency variation.
● Positional & Directional Encoding: a Fourier-based feature mapping that encodes the position and direction inputs from a lower-dimensional space into a higher-dimensional space.
[Equation: positional encoding function]
Tancik, Srinivasan, Mildenhall et al., Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains, NeurIPS 2020
[Figure: results with no positional encoding vs. with positional encoding]
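A minimal sketch of such a Fourier-based mapping is given below, following the form used in the NeRF paper, where each input coordinate p is expanded into sin(2^k π p) and cos(2^k π p) for k = 0 … L−1. The function name `positional_encoding` is hypothetical. The paper reports L = 10 for positions and L = 4 for viewing directions.

```python
import math


def positional_encoding(values, num_frequencies):
    """Map each coordinate to sin/cos features at geometrically increasing frequencies."""
    encoded = []
    for p in values:
        for k in range(num_frequencies):
            angle = (2.0 ** k) * math.pi * p
            encoded.append(math.sin(angle))
            encoded.append(math.cos(angle))
    return encoded


# A 3D position with L = 10 becomes a 60-dimensional feature vector.
features = positional_encoding([0.1, 0.2, 0.3], num_frequencies=10)
print(len(features))  # → 60
```

Two nearby positions that differ only slightly in raw coordinates end up far apart in the higher-frequency features, which is what lets the MLP fit fine detail.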
12. NeRF Optimization - Hierarchical Sampling
● During the volume rendering phase, the model simultaneously optimizes two networks: "coarse" and "fine".
● We first sample a set of Nc locations along the ray using stratified sampling and evaluate the "coarse" network at these locations to obtain their RGB and density [σ(t)] outputs.
● The main function of the coarse network is to compute the rendered color of the ray from the coarse samples.
● A second set of Nf locations is then drawn, using inverse transform sampling, from the distribution defined by the coarse network's normalized rendering weights, and the "fine" network is evaluated.
● All samples, i.e. (Nc + Nf), are used at the fine network stage to compute the final rendered ray color.
This ensures that more samples are allocated to regions we expect to contain visible content.
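The second sampling pass can be sketched as inverse transform sampling from a piecewise-constant distribution over the coarse depth bins. The helper name `sample_fine`, the bin layout and the uniform fallback for all-zero weights are assumptions for illustration, not the authors' code.

```python
import bisect
import random


def sample_fine(bin_edges, weights, n_fine, rng):
    """Draw n_fine depths so that bins with larger coarse weights receive more samples.

    bin_edges has len(weights) + 1 entries; weights are the coarse rendering weights.
    """
    total = sum(weights)
    if total <= 0.0:
        weights = [1.0] * len(weights)  # assumption: fall back to uniform sampling
        total = float(len(weights))
    cdf = []
    running = 0.0
    for w in weights:
        running += w / total
        cdf.append(running)
    cdf[-1] = 1.0  # guard against floating-point round-off

    samples = []
    for _ in range(n_fine):
        u = rng.random()                    # uniform in [0, 1)
        idx = bisect.bisect_right(cdf, u)   # first bin whose CDF exceeds u
        lower = cdf[idx - 1] if idx > 0 else 0.0
        frac = (u - lower) / (cdf[idx] - lower)
        left, right = bin_edges[idx], bin_edges[idx + 1]
        samples.append(left + frac * (right - left))
    return sorted(samples)


rng = random.Random(0)
# Only the middle bin has weight, so every fine sample lands in [1, 2).
fine_depths = sample_fine([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 0.0], n_fine=8, rng=rng)
```

The fine network would then be evaluated at the union of the coarse depths and these new depths.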
13. Final Rendering & Loss Function
● Optimize a separate neural continuous volume representation network for each scene.
● At each optimization iteration, we randomly sample a batch of camera rays from the set of all pixels in the dataset, and then follow the hierarchical sampling procedure.
● This queries Nc samples from the coarse network and Nc + Nf samples from the fine network.
● We then use the volume rendering procedure to render the color of each ray from both sets of samples.
● The loss function is the total squared error between the rendered and true pixel colors for both the coarse and fine renderings.
ℛ: set of rays in each batch; C(r): ground truth; Ĉc(r): coarse volume prediction; Ĉf(r): fine volume prediction of the RGB color for ray r
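Written out with the symbols above, the loss sums, over every ray r in the batch ℛ, the squared error of the coarse prediction plus the squared error of the fine prediction against the ground truth C(r). A minimal sketch follows; the function names are hypothetical.

```python
def squared_error(predicted, target):
    """Squared L2 distance between two RGB colors."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target))


def nerf_loss(coarse_colors, fine_colors, true_colors):
    """Total squared error of coarse and fine renderings over a batch of rays."""
    return sum(
        squared_error(coarse, truth) + squared_error(fine, truth)
        for coarse, fine, truth in zip(coarse_colors, fine_colors, true_colors)
    )


# One ray: coarse prediction is off by 1.0 in red, fine prediction by 0.5.
loss = nerf_loss([(1.0, 0.0, 0.0)], [(0.5, 0.0, 0.0)], [(0.0, 0.0, 0.0)])
print(loss)  # → 1.25
```

Including the coarse term keeps the coarse network trained, which matters because its weights decide where the fine samples are placed.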
14. Performance of NeRF
Comparison to other view synthesis techniques
● Neural Volumes, Local Light Field Fusion (LLFF) & Scene Representation Networks (SRN)
[Figure: comparison results; "Ours" = NeRF]
15. Performance of NeRF
Ablation Studies
● To validate the model’s performance with respect to different parameters.
16. Summary
● Learn the radiance field of a scene based on a collection of calibrated images
○ Use an MLP to learn continuous geometry and view-dependent appearance
● Use fully differentiable volume rendering with reconstruction loss
● Combines hierarchical sampling and Fourier-based encoding of 5D inputs to produce high-fidelity novel view synthesis results
Some associated challenges
● Handling dynamic scenes when acquiring calibrated views
● One network trained per scene - no generalization
17. Related NeRF Research
● NeRF in the Wild: a novel approach for 3D scene reconstruction of complex environments from unstructured internet photo collections that adds transient and latent scene embeddings on top of the conventional NeRF model.
● The model captures lighting and photometric variations in a low-dimensional latent embedding space when rendering appearance, without affecting 3D geometry.
Martin-Brualla, Ricardo, et al. "NeRF in the Wild: Neural radiance fields for unconstrained photo collections." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021.
18. Related NeRF Research
● Neural Radiance Fields for Dynamic Scenes (D-NeRF): synthesizes novel views, at an arbitrary point in time, of dynamic scenes with complex non-rigid geometries.
● Optimizes an underlying deformable volumetric function (using a deformation network) from a sparse set of input monocular views, without the need for ground-truth geometry or multi-view images.
Pumarola, Albert, et al. "D-NeRF: Neural radiance fields for dynamic scenes." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021.