Motion Estimation in h.264 encoder

•Download as PPTX, PDF•

2 likes•283 views

Talal Khaliq

Discusses different algorithms for Motion Estimation and Compensation

Engineering

NS Talal Khaliq
Project Supervisor:
Dr. Shoiab A Khan

Motivation
‘A picture is worth a thousands words.’
If this holds true, how a moving picture (video) which
contains so much information is transmitted so efficiently?

Problem background
Example an single video frame with 720x576 pixels with color depth of 24
bits per pixel with 29.97 frames per second uses approximately 200Mbs,
thus for a two hour program at this rate takes over 200 GB which is
practically impossible to store.

What is Video Compression?
It refers to reducing the quantity of data, and is
a combination of spatial image compression and
temporal motion compensation.

Temporal
Correlation
Spatial Correlation

Temporal Model
 It reduces redundancy between transmitted frames by
forming a predicted frame and subtracting this from the
current frame.
 The resulting residual (difference) frame contains less
energy.
 The residual frame is then encoded.

Block-based Motion Estimation
This method is used to ‘compensate’ for motion of
rectangular frames or ‘blocks’ in current frame.
 It involves finding a 4x4 sample region in a reference frame
that closely matches the current macroblock.
 Macroblock with minimum energy is chosen as ‘best match.’

Cost Function
Mean Absolute Difference(MAD),
Mean Squared Error(MSE),
where N is the side of macroblock, Cij and Rij are the pixels being compared.






1
0
1
0
2
||
1 N
i
N
j
ijij RC
N
MAD






1
0
1
0
2
2
)(
1 N
i
N
j
ijij RC
N
MSE

Motion Compensation
 The selected best matching region in the reference frame is
subtracted from the current macroblock to produce a residual
macroblock.
 This residual macroblock is encoded and transmitted together with a
motion vector describing the position of the best matching
macroblock.
 Motion vector is the offset between the current block and the position
of the candidate region.

Past Frame Current Frame
Frame Segmentation Blocks
Search Threshold
Block Matching
Motion Vectors
Motion vector Correction
Blocks
Prediction Error
Transmission

Adaptive Rood Pattern Search Algorithm
 General motion in the frame is usually coherent.
 It uses the motion vector of macro block to its immediate left to
predict its own motion vector.
 It directly puts the search in an area where there is a high probability
of finding a good matching block.

Predicted motion vector
is (3,-2) and step size S,
S=max(3,-2)=> 3.

Frames
Macro block area defined
Frame Scan
S=max(|X|,|Y|)
SDSP
Calculate min cost
LDSP
Start
loop
again
Motion vectors

Advantages
 We do not have to compute whole frame like in Exhaustive Search.
 It does not waste time doing LDSP. It starts with SDSP unlike in
Diamond Search.
 It does not always start from centre or extreme left and thus saves
computation time.

Video compression has two major issues to be handled, one is video compression rate and other one is quality. There is always a trade-off between speed and quality. Full search block matching algorithm (FSBMA) is most popular motion estimation algorithm. But high computational complexity is the major challenge of FSBM. This makes FSBM to be very difficult to use for real time video processing with the low power batteries. Other algorithm gives better speed on the expense of quality of video. The proposed algorithm i.e. modified full search block matching algorithm (MFSBMA) reduces the computational complexity by keeping the PSNR same as of FSBMA. MFSBMA skips the SAD calculations for a current background macroblock and it does SAD calculations for foreground current microblock. This method reduces SAD calculations drastically. This work presents apipelined architecture forMFSBMA which can work on real time HDTV video processing. The proposed algorithm reduces computational complexity by 50% keeping PSNR same with the full search algorithm.

The motion estimation

sakshij91

Efficient Architecture for Variable Block Size Motion Estimation in H.264/AVC

IDES Editor

This paper proposes an efficient VLSI architecture for the implementation of variable block size motion estimation (VBSME). To improve the performance video compression the Variable Block Size Motion Estimation (VBSME) is the critical path. Variable Block Size Motion Estimation feature has been introduced in to the H.264/AVC. This feature induces significant complexities into the design of the H.264/AVC video codec. This paper we compare the existing architectures for VBSME. An efficient architecture to improve the performance of Spiral Search for Variable Size Motion Estimation in H.264/AVC is proposed. Among various architectures available for VBSME spiral search provides hardware friendly data flow with efficient utilization of resources. The proposed implementation is verified using the MATLAB on foreman, coastguard and train sequences. The proposed Adaptive thresholding technique reduces the average number of computations significantly with negligible effect on the video quality. The results are verified using hardware implementation on Xilinx Virtex 4 it was able to achieve real time video coding of 60 fps at 95.56 MHz CLK frequency.

NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis

taeseon ryu

해당 논문은 3D Aware 모델입니다 StyleGAN 같은 경우에는 어떤 하나의 피처에 대해서 Editing 하고 싶을 때 입력에 해당하는 레이턴트 백터를 찾아서 레이턴트 백터를 수정함으로써 입에 해당하는 피쳐를 바꿀 수 있었는데 이런 컨셉을 그대로 착안해서 GAN 스페이스 논문에서는 인풋이 들어왔을 때 어떤 공간적인 정보까지도 에디팅하려고 시도했습니다 결과를 봤을 때 로테이션 정보가 어느 정도 잘 학습된 것 같지만 같은 사람이 아닌 것 같이 인식되기도 합니다 이러한 문제를 이제 disentangle 되지 않았다라고 하는 게 원하는 피처만 변화시켜야 되는 것과 달리 다른 피처까지도 모두 학습 모두 변했다는 것인데 이를 좀 더 효율적으로 3D를 더 잘 이해시키기 위해서 탄생한 논문입니다.

3D reconstruction

Jorge Leandro, Ph.D.

DeepLab V3+: Encoder-Decoder with Atrous Separable Convolution for Semantic I...

Joonhyung Lee

Convolutional neural network

MojammilHusain

드디어 PR12 Season 4가 시작되었습니다! 제가 이번 시즌에서 발표하게 된 첫 논문은 ""NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis"라는 논문입니다. View Synthesis라는 Task는 몇 개의 시점에서 대상을 찍은 영상이 주어지면 주어지지 않은 위치와 방향에서 바라본 대상의 영상을 합성해내는 기술입니다. 이를 위해서 본 논문에서는 대상의 3D 정보를 통째로 Neural Network가 외우게 하는 방법을 선택했는데요, 이 방식은 Implicit Neural Representation이라는 이름으로 유명해지고 있는 추세고, 2D 이미지에 대해서도 적용하려는 접근들이 늘고 있습니다. 영상 링크: https://youtu.be/zkeh7Tt9tYQ 논문 링크: https://arxiv.org/abs/2003.08934

PR-317: MLP-Mixer: An all-MLP Architecture for Vision

Jinwon Lee

Computer Vision 분야에서 CNN은 과연 살아남을 수 있을까요? 안녕하세요 TensorFlow Korea 논문 읽기 모임 PR-12의 317번째 논문 리뷰입니다. 이번에는 Google Research, Brain Team의 MLP-Mixer: An all-MLP Architecture for Vision을 리뷰해보았습니다. Attention의 공격도 버거운데 이번에는 MLP(Multi-Layer Perceptron)의 공격입니다. MLP만을 사용해서 Image Classification을 하는데 성능도 좋고 속도도 빠르고.... 구조를 간단히 소개해드리면 ViT(Vision Transformer)의 self-attention 부분을 MLP로 변경하였습니다. MLP block 2개를 사용하여 하나는 patch(token)들 간의 연산을 하는데 사용하고, 하나는 patch 내부 연산을 하는데 사용합니다. 사실 MLP를 사용하긴 했지만 논문에도 언급되어 있듯이, 이 부분을 일종의 convolution이라고 볼 수 있는데요... 그래도 transformer 기반의 network이 가질 수밖에 없는 quadratic complexity를 linear로 낮춰주고 convolution의 inductive bias 거의 없이 아주아주 simple한 구조를 활용하여 이렇게 좋은 성능을 보여준 점이 멋집니다. 반면에 역시나 data를 많이 써야 한다거나, MLP의 한계인 fixed length의 input만 받을 수 있다는 점은 단점이라고 생각하는데요, 이 연구를 시작으로 MLP도 다시한번 조명받는 계기가 되면 좋을 것 같네요 비슷한 시점에 나온 비슷한 연구들도 마지막에 간략하게 소개하였습니다. 재미있게 봐주세요. 감사합니다! 논문링크: https://arxiv.org/abs/2105.01601 영상링크: https://youtu.be/KQmZlxdnnuY

Review-image-segmentation-by-deep-learning

Trong-An Bui

Background Subtraction Algorithm for Moving Object Detection Using Denoising ...

International Journal of Science and Research (IJSR)

Currently, in both market and the academic communities have required applications based on image and video processing with several real-time constraints. On the other hand, detection of moving objects is a very important task in mobile robotics and surveillance applications. In order to achieve this, we are using a alternative means for real time motion detection systems. This paper proposes hardware architecture for motion detection based on the background subtraction algorithm, which is implemented on FPGAs (Field Programmable Gate Arrays). For achieving this, the following steps are executed: (a) a background image (in gray-level format) is stored in an external SRAM memory, (b) a low-pass filter is applied to both the stored and current images, (c) a subtraction operation between both images is obtained, and (d) a morphological filter is applied over the resulting image. Afterward, the gravity center of the object is calculated and sent to a PC (via RS-232 interface).

Tree structured partitioning into transform blocks and units and interpicture...

LainAcarolu

"Semantic Segmentation for Scene Understanding: Algorithms and Implementation...

Edge AI and Vision Alliance

For the full video of this presentation, please visit: http://www.embedded-vision.com/platinum-members/auvizsystems/embedded-vision-training/videos/pages/may-2016-embedded-vision-summit For more information about embedded vision, please visit: http://www.embedded-vision.com Nagesh Gupta, Founder and CEO of Auviz Systems, presents the "Semantic Segmentation for Scene Understanding: Algorithms and Implementations" tutorial at the May 2016 Embedded Vision Summit. Recent research in deep learning provides powerful tools that begin to address the daunting problem of automated scene understanding. Modifying deep learning methods, such as CNNs, to classify pixels in a scene with the help of the neighboring pixels has provided very good results in semantic segmentation. This technique provides a good starting point towards understanding a scene. A second challenge is how such algorithms can be deployed on embedded hardware at the performance required for real-world applications. A variety of approaches are being pursued for this, including GPUs, FPGAs, and dedicated hardware. This talk provides insights into deep learning solutions for semantic segmentation, focusing on current state of the art algorithms and implementation choices. Gupta discusses the effect of porting these algorithms to fixed-point representation and the pros and cons of implementing them on FPGAs.

Different Approach of VIDEO Compression Technique: A Study

Editor IJCATR

The main objective of video compression is to achieve video compression with less possible losses to reduce the transmission bandwidth and storage memory. This paper discusses different approach of video compression for better transmission of video frames for multimedia application. Video compression methods such as frame difference approach, PCA based method, accordion function, fuzzy concept, and EZW and FSBM were analyzed in this paper. Those methods were compared for performance, speed and accuracy and which method produces better visual quality.

DICTA 2017 poster

Ashek Ahmmed

Poster for our conference paper titled "4K Ultra High Definition Video Coding using Homogeneous Motion Discovery Oriented Prediction" published in the Digital Image Computing: Techniques and Applications (DICTA) 2017 conference. Abstract: State of the art video compression techniques use the motion model to approximate geometric boundaries of moving objects where motion discontinuities occur. Motion hints based inter-frame prediction paradigm moves away from this redundant approach and employs an innovative framework consisting of motion hint fields that are continuous and invertible, at least, over their respective domains. However, estimation of motion hint is computationally demanding, in particular for high resolution video sequences. Discovery of homogeneous motion models and their associated masks over the current frame and then use these models and masks to form a prediction of the current frame, provides a computationally simpler approach to video coding compared to motion hint. In this paper, the potential of this coherent motion model based approach, equipped with bigger blocks, is investigated for coding 4K Ultra High Definition (UHD) video sequences. Experimental results show a savings in bit rate of 4.68% is achievable over standalone HEVC.

Deep Learning Fast MRI Using Channel Attention in Magnitude Domain

Joonhyung Lee

My presentation on how we participated in the fastMRI Challanege in 2019. Aside from theoretical considerations, it also explains key implementation issues that arise in all deep learning for MRI such as disk I/O and CPU/GPU load balancing. Used for presentation at ISBI 2020 Oral session. Accidentally wrote the title as "Deep Learning Sum-of-Squares Images in Accelerated Parallel MRI". Sorry for the mistake!

PR-366: A ConvNet for 2020s

Jinwon Lee

#PR12 #PR366 안녕하세요 논문 읽기 모임 PR-12의 366번째 논문리뷰입니다. 올해가 AlexNet이 나온지 10주년이 되는 해네요. AlexNet이 2012년에 혜성처럼 등장한 이후, Solve computer vision problem = Use CNN이 공식처럼 사용되던 2010년대가 가고 2020년대 들어서 ViT의 등장을 시작으로 Transformer 기반의 network들이 CNN의 자리를 위협하고 상당부분 이미 뺏어간 상황입니다. 2020년대에 CNN의 가야할 길은 어디일까요? Inductive bias가 적은 Transformer가 대용량의 데이터로 학습하면 항상 CNN보다 더 낫다는 건 진실일까요? 이 논문에서는 2020년대를 위한 CNN이라는 제목으로 ConvNeXt라는 새로운(?) architecture를 제안합니다. 사실 새로운 건 없고 그동안 있었던 것들과 Transformer에서 적용한 것들을 copy해와서 CNN에 적용해보았는데요, Transformer보다 성능도 좋고 속도도 빠른 결과가 나왔다고 합니다. 결과에 대해서 약간의 논란이 twitter 상에서 나오고 있는데 이 부분 포함해서 자세한 내용은 영상을 통해서 보실 수 있습니다. 늘 재밌게 봐주시고 좋아요 댓글 구독 해주시는 분들께 감사드립니다 :) 논문링크: https://arxiv.org/abs/2201.03545 영상링크: https://youtu.be/Mw7IhO2uBGc

A Low Hardware Complex Bilinear Interpolation Algorithm of Image Scaling for ...

arpublication

In this brief, a low-complexity, low-memoryrequirement, and high-quality algorithm is proposed for VLSI implementation of an image scaling processor. The proposed image scaling algorithm consists of a sharpening spatial filter, a clamp filter, and a bilinear interpolation. To reduce the blurring and aliasing artifacts produced by the bilinear interpolation, the sharpening spatial and clamp filters are added as pre-filters. To minimize the memory buffers and computing resources for the proposed image processor design, a T-model and inversed T-model convolution kernels are created for realizing the sharpening spatial and clamp filters. Furthermore, two T-model or inversed T-model filters are combined into a combined filter which requires only a one-line-buffer memory. Moreover, a reconfigurable calculation unit is invented for decreasing the hardware cost of the combined filter. Moreover, the computing resource and hardware cost of the bilinear interpolator can be efficiently reduced by an algebraic manipulation and hardware sharing techniques. The VLSI architecture in this work can achieve 280 MHz with 6.08-K gate counts, and its core area is 30 378 μm2 synthesized by a 0.13-μm CMOS process. Compared with previous low-complexity techniques, this work reduces gate counts by more than 34.4% and requires only a one-line-buffer memory.

ImageNet classification with deep convolutional neural networks(2012)

WoochulShin10

Efficient Neural Architecture Search via Parameter Sharing

Jinwon Lee

HRNET : Deep High-Resolution Representation Learning for Human Pose Estimation

taeseon ryu

안녕하세요 딥러닝 논문읽기 모임 입니다! 오늘 소개 드릴 논문은 Deep High-Resolution Representation Learning for Human Pose Estimation 라는 제목의 논문입니다. 오늘 소개드릴 논문은 Pose Estimation에 관련된 논문 입니다. 기존 Pose Estimation 모델의 경우 직렬적인 네트워크 구조를 지녔지만, 직렬적인 구조는 압축하는 과정에서 지엽적인 정보들의 손실을 가져오게 되고 모든 프로세스가 upsampling에 과도하게 의존하고 있다는 한계점을 가지고 있습니다.그래서 이러한 한계점을 극복하고자HRNet은 이러한 직렬 구조에서 벗어나 병렬 구조로 subnetwork를 구성했습니다.

PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...

Jinwon Lee

안녕하세요 TensorFlow Korea 논문 읽기 모임 PR-12의 330번째 논문 리뷰입니다. 오늘은 무려 5만개의 학습된 ViT model을 제공하는 구글스러운 논문을 리뷰해보았습니다. ViT가 CNN을 조금씩 대체해가고 있는데요, ViT는 CNN과 달리 inductive bias가 적은 관계로 좋은 성능을 위해서는 굉장히 많은 data가 필요하거나, augmentation과 regularization을 많이 써줘야 합니다. 그런데 이렇게 다양한 경우 즉 다양한 data, 다양한 model size, 다양한 augmentation 방법, 다양한 regularization, 다양한 data size 등등에 따른 ViT의 성능과 속도 등의 비교 분석 실험이 지금까지는 없었죠. 이 논문에서는 그 어려운 걸(?) 해냈습니다. 그리고 수많은 ViT를 이용해 실험을 하면서 몇가지 중요한 finding들을 찾았습니다. 요약하면 다음과 같습니다. 1. augmentation과 regularization을 잘 쓰면 1/10의 data로도 전체 data 다 쓴거랑 대부분 비슷한 성능을 낼 수 있다. 그런데 항상 그런건 아니다. 반대로 말하면 data가 10배 있으면 augmentation이나 regularization안 쓰고도 좋은 성능을 낼 수 있다. 2. downstream task 학습할 때 scratch부터 학습하는거랑 large dataset으로 pre-trained한 걸 이용해서 transfer learning하는 건 후자가 좋다. 3. transfer learning 할 때도 pre-trained model 중에 data 많이 써서 학습한게 더 좋다. 4. augmentation/regularization은 data가 많으면 별 도움이 안되고 둘 중에는 augmenation이 더 좋다. 5. pre-trained model이 많을 때 model을 고르는 방법은 그냥 upstream에서 제일 잘됐던 걸 고르면 얼추 잘된다. 6. 속도를 빠르게 하고 싶을 때는 model을 작은거 쓰지말고 patch size를 키워라. 그래야 성능이 별로 안떨어진다. 입니다. 흥미로운 결과들이 많으니 자세한 내용은 아래 영상을 참고해주세요! 감사합니다! 영상링크: https://youtu.be/A3RrAIx-KCc 논문링크: https://arxiv.org/abs/2106.10270

2021 05-04-u2-net

JAEMINJEONG5

Optimization of Macro Block Size for Adaptive Rood Pattern Search Block Match...

IJERA Editor

In area of video compression, Motion Estimation is one of the most important modules and play an important role to design and implementation of any the video encoder. It consumes more than 85% of video encoding time due to searching of a candidate block in the search window of the reference frame. Various block matching methods have been developed to minimize the search time. In this context, Adaptive Rood Pattern Search is one of the less expensive block matching methods, which is widely acceptable for better Motion Estimation in video data processing. In this paper we have proposed to optimize the macro block size used in adaptive rood pattern search method for improvement in motion estimation.

Dsc

melisha monteiro

What's hot

Rethinking Attention with Performers

Joonhyung Lee

A Novel Background Subtraction Algorithm for Dynamic Texture Scenes

IJMER

Vision Transformer(ViT) / An Image is Worth 16*16 Words: Transformers for Ima...

changedaeoh

[Paper] Multiscale Vision Transformers(MVit)

Susang Kim

PR-302: NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis

Hyeongmin Lee

PR-317: MLP-Mixer: An all-MLP Architecture for Vision

Jinwon Lee

Review-image-segmentation-by-deep-learning

Trong-An Bui

Background Subtraction Algorithm for Moving Object Detection Using Denoising ...

International Journal of Science and Research (IJSR)

Tree structured partitioning into transform blocks and units and interpicture...

LainAcarolu

"Semantic Segmentation for Scene Understanding: Algorithms and Implementation...

Edge AI and Vision Alliance

Different Approach of VIDEO Compression Technique: A Study

Editor IJCATR

DICTA 2017 poster

Ashek Ahmmed

Deep Learning Fast MRI Using Channel Attention in Magnitude Domain

Joonhyung Lee

PR-366: A ConvNet for 2020s

Jinwon Lee

A Low Hardware Complex Bilinear Interpolation Algorithm of Image Scaling for ...

arpublication

ImageNet classification with deep convolutional neural networks(2012)

WoochulShin10

Efficient Neural Architecture Search via Parameter Sharing

Jinwon Lee

HRNET : Deep High-Resolution Representation Learning for Human Pose Estimation

taeseon ryu

PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...

Jinwon Lee

2021 05-04-u2-net

JAEMINJEONG5

What's hot (20)

Rethinking Attention with Performers

A Novel Background Subtraction Algorithm for Dynamic Texture Scenes

Vision Transformer(ViT) / An Image is Worth 16*16 Words: Transformers for Ima...

[Paper] Multiscale Vision Transformers(MVit)

PR-302: NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis

PR-317: MLP-Mixer: An all-MLP Architecture for Vision

Review-image-segmentation-by-deep-learning

Background Subtraction Algorithm for Moving Object Detection Using Denoising ...

Tree structured partitioning into transform blocks and units and interpicture...

"Semantic Segmentation for Scene Understanding: Algorithms and Implementation...

Different Approach of VIDEO Compression Technique: A Study

DICTA 2017 poster

Deep Learning Fast MRI Using Channel Attention in Magnitude Domain

PR-366: A ConvNet for 2020s

A Low Hardware Complex Bilinear Interpolation Algorithm of Image Scaling for ...

ImageNet classification with deep convolutional neural networks(2012)

Efficient Neural Architecture Search via Parameter Sharing

HRNET : Deep High-Resolution Representation Learning for Human Pose Estimation

PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...

2021 05-04-u2-net

Similar to Motion Estimation in h.264 encoder

Optimization of Macro Block Size for Adaptive Rood Pattern Search Block Match...

Dsc

Dsc

Fast Motion Estimation for Quad-Tree Based Video Coder Using Normalized Cross...

CSCJournals

Motion estimation is the most challenging and time consuming stage in block based video codec. To reduce the computation time, many fast motion estimation algorithms were proposed and implemented. This paper proposes a quad-tree based Normalized Cross Correlation (NCC) measure for obtaining estimates of inter-frame motion. The measure operates in frequency domain using FFT algorithm as the similarity measure with an exhaustive full search in region of interest. NCC is a more suitable similarity measure than Sum of Absolute Difference (SAD) for reducing the temporal redundancy in video compression since we can attain flatter residual after motion compensation. The degrees of homogeneous and stationery regions are determined by selecting suitable initial fixed threshold for block partitioning. An experimental result of the proposed method shows that actual numbers of motion vectors are significantly less compared to existing methods with marginal effect on the quality of reconstructed frame. It also gives higher speed up ratio for both fixed block and quad-tree based motion estimation methods.

IRJET- Comparison and Simulation based Analysis of an Optimized Block Mat...

IRJET Journal

I0341042048

inventionjournals

International Journal of Engineering and Science Invention (IJESI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJESI publishes research articles and reviews within the whole field Engineering Science and Technology, new teaching methods, assessment, validation and the impact of new technologies and it will continue to provide information on the latest trends and developments in this ever-expanding subject. The publications of papers are selected through double peer reviewed to ensure originality, relevance, and readability. The articles published in our journal can be accessed online.

40120140503006IAEME Publication

557 480-486

idescitation

Super-resolution (SR) is the process of obtaining a high resolution (HR) image or a sequence of HR images from a set of low resolution (LR) observations. The block matching algorithms used for motion estimation to obtain motion vectors between the frames in Super-resolution. The implementation and comparison of two different types of block matching algorithms viz. Exhaustive Search (ES) and Spiral Search (SS) are discussed. Advantages of each algorithm are given in terms of motion estimation computational complexity and Peak Signal to Noise Ratio (PSNR). The Spiral Search algorithm achieves PSNR close to that of Exhaustive Search at less computation time than that of Exhaustive Search. The algorithms that are evaluated in this paper are widely used in video super-resolution and also have been used in implementing various video standards like H.263, MPEG4, H.264.

An Effective Implementation of Configurable Motion Estimation Architecture fo...

ijsrd.com

This project introduces configurable motion estimation architecture for a wide range of fast block-matching algorithms (BMAs). Contemporary motion estimation architectures are either too rigid for multiple BMAs or the flexibility in them is implemented at the cost of reduced performance. .In block-based motion estimation, a block-matching algorithm (BMA) searches the best matching block for the current macro block from the reference frame. During the searching procedure, the checking point yielding the minimum block distortion (MBD) determines the displacement of the best matching block.

B0441418IOSR Journals

Fast Full Search for Block Matching Algorithms

ijsrd.com

This project introduces configurable motion estimation architecture for a wide range of fast block-matching algorithms (BMAs). Contemporary motion estimation architectures are either too rigid for multiple BMAs or the flexibility in them is implemented at the cost of reduced performance. In block-based motion estimation, a block-matching algorithm (BMA) searches for the best matching block for the current macro block from the reference frame. During the searching procedure, the checking point yielding the minimum block distortion (MBD) determines the displacement of the best matching block.

Multimedia basic video compression techniques

Mazin Alwaaly

538 207-219

idescitation

Internet data almost double every year. The need of multimedia communication is less storage space and fast transmission. So, the large volume of video data has become the reason for video compression. The aim of this paper is to achieve temporal compression for three-dimensional (3D) videos using motion estimation-compensation and wavelets. Instead of performing a two-dimensional (2D) motion search, as is common in conventional video codec’s, the use of a 3D motion search has been proposed, that is able to better exploit the temporal correlations of 3D content. This leads to more accurate motion prediction and a smaller residual. The discrete wavelet transform (DWT) compression scheme has been added for better compression ratio. The DWT has a high-energy compaction property thus greatly impacted the field of compression. The quality parameters peak signal to noise ratio (PSNR) and mean square error (MSE) have been calculated. The simulation results shows that the proposed work improves the PSNR from existing work.

Be36338341

IJERA Editor

International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.

Motion detection in compressed video using macroblock classification

acijjournal

n this paper, to detect the moving objects between frames in compressed video and to obtain the bes t compression video and the noiseless video. We describe a video in which frames by classifying macroblocks (MB), and describe motion estimation (ME), motion vector field (MV) and motion compensation (MC). we propose to classify Macroblocks of each video frame into different classes and use this class information to describe the frame content based on the motion vector. MB class informatio n video applications such as shot change detection, motion discontinuity detection, Outlier rejection for global motion estimation. To reduc e the noise and to improve the clarity of the compressed video by using contrast limited adaptive histogram equalization (CLAHE) Algorithm.

Motion Estimation - umit 5 (II).pdf

HeenaSyed6

L0936775

IOSR Journals

Background Estimation Using Principal Component Analysis Based on Limited Mem...

IJECEIAES

Given a video of 푀 frames of size ℎ × 푤. Background components of a video are the elements matrix which relative constant over 푀 frames. In PCA (principal component analysis) method these elements are referred as “principal components”. In video processing, background subtraction means excision of background component from the video. PCA method is used to get the background component. This method transforms 3 dimensions video (ℎ × 푤 × 푀) into 2 dimensions one (푁 × 푀), where 푁 is a linear array of size ℎ × 푤 . The principal components are the dominant eigenvectors which are the basis of an eigenspace. The limited memory block Krylov subspace optimization then is proposed to improve performance the computation. Background estimation is obtained as the projection each input image (the first frame at each sequence image) onto space expanded principal component. The procedure was run for the standard dataset namely SBI (Scene Background Initialization) dataset consisting of 8 videos with interval resolution [146 150, 352 240], total frame [258,500]. The performances are shown with 8 metrics, especially (in average for 8 videos) percentage of error pixels (0.24%), the percentage of clustered error pixels (0.21%), multiscale structural similarity index (0.88 form maximum 1), and running time (61.68 seconds).

Similar to Motion Estimation in h.264 encoder (20)

Optimization of Macro Block Size for Adaptive Rood Pattern Search Block Match...

Dsc

Fast Motion Estimation for Quad-Tree Based Video Coder Using Normalized Cross...

IRJET- Comparison and Simulation based Analysis of an Optimized Block Mat...

I0341042048

40120140503006

557 480-486

An Effective Implementation of Configurable Motion Estimation Architecture fo...

B0441418

Fast Full Search for Block Matching Algorithms

Multimedia basic video compression techniques

538 207-219

Be36338341

Motion detection in compressed video using macroblock classification

Motion Estimation - umit 5 (II).pdf

L0936775

Background Estimation Using Principal Component Analysis Based on Limited Mem...

Recently uploaded

Railway Signalling Principles Edition 3.pdf

TeeVichai

NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...

Amil Baba Dawood bangali

Contact with Dawood Bhai Just call on +92322-6382012 and we'll help you. We'll solve all your problems within 12 to 24 hours and with 101% guarantee and with astrology systematic. If you want to take any personal or professional advice then also you can call us on +92322-6382012 , ONLINE LOVE PROBLEM & Other all types of Daily Life Problem's.Then CALL or WHATSAPP us on +92322-6382012 and Get all these problems solutions here by Amil Baba DAWOOD BANGALI #vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore#blackmagicformarriage #aamilbaba #kalajadu #kalailam #taweez #wazifaexpert #jadumantar #vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore #blackmagicforlove #blackmagicformarriage #aamilbaba #kalajadu #kalailam #taweez #wazifaexpert #jadumantar #vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore #Amilbabainuk #amilbabainspain #amilbabaindubai #Amilbabainnorway #amilbabainkrachi #amilbabainlahore #amilbabaingujranwalan #amilbabainislamabad

Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...

AJAYKUMARPUND1

H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf

MLILAB

ethical hacking-mobile hacking methods.ppt

Jayaprasanna4

J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf

MLILAB

MCQ Soil mechanics questions (Soil shear strength).pdf

Osamah Alsalih

CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx

R&R Consult

CFD analysis is incredibly effective at solving mysteries and improving the performance of complex systems! Here's a great example: At a large natural gas-fired power plant, where they use waste heat to generate steam and energy, they were puzzled that their boiler wasn't producing as much steam as expected. R&R and Tetra Engineering Group Inc. were asked to solve the issue with reduced steam production. An inspection had shown that a significant amount of hot flue gas was bypassing the boiler tubes, where the heat was supposed to be transferred. R&R Consult conducted a CFD analysis, which revealed that 6.3% of the flue gas was bypassing the boiler tubes without transferring heat. The analysis also showed that the flue gas was instead being directed along the sides of the boiler and between the modules that were supposed to capture the heat. This was the cause of the reduced performance. Based on our results, Tetra Engineering installed covering plates to reduce the bypass flow. This improved the boiler's performance and increased electricity production. It is always satisfying when we can help solve complex challenges like this. Do your systems also need a check-up or optimization? Give us a call! Work done in cooperation with James Malloy and David Moelling from Tetra Engineering. More examples of our work https://www.r-r-consult.dk/en/cases-en/

Vaccine management system project report documentation..pdf

Kamal Acharya

Automobile Management System Project Report.pdf

Kamal Acharya

The proposed project is developed to manage the automobile in the automobile dealer company. The main module in this project is login, automobile management, customer management, sales, complaints and reports. The first module is the login. The automobile showroom owner should login to the project for usage. The username and password are verified and if it is correct, next form opens. If the username and password are not correct, it shows the error message. When a customer search for a automobile, if the automobile is available, they will be taken to a page that shows the details of the automobile including automobile name, automobile ID, quantity, price etc. “Automobile Management System” is useful for maintaining automobiles, customers effectively and hence helps for establishing good relation between customer and automobile organization. It contains various customized modules for effectively maintaining automobiles and stock information accurately and safely. When the automobile is sold to the customer, stock will be reduced automatically. When a new purchase is made, stock will be increased automatically. While selecting automobiles for sale, the proposed software will automatically check for total number of available stock of that particular item, if the total stock of that particular item is less than 5, software will notify the user to purchase the particular item. Also when the user tries to sale items which are not in stock, the system will prompt the user that the stock is not enough. Customers of this system can search for a automobile; can purchase a automobile easily by selecting fast. On the other hand the stock of automobiles can be maintained perfectly by the automobile shop manager overcoming the drawbacks of existing system.

road safety engineering r s e unit 3.pdf

VENKATESHvenky89705

Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B

Sreedhar Chowdam

WATER CRISIS and its solutions-pptx 1234

AafreenAbuthahir2

Democratizing Fuzzing at Scale by Abhishek Arya

abh.arya

Presented at NUS: Fuzzing and Software Security Summer School 2024 This keynote talks about the democratization of fuzzing at scale, highlighting the collaboration between open source communities, academia, and industry to advance the field of fuzzing. It delves into the history of fuzzing, the development of scalable fuzzing platforms, and the empowerment of community-driven research. The talk will further discuss recent advancements leveraging AI/ML and offer insights into the future evolution of the fuzzing landscape.

Cosmetic shop management system project report.pdf

Kamal Acharya

Buying new cosmetic products is difficult. It can even be scary for those who have sensitive skin and are prone to skin trouble. The information needed to alleviate this problem is on the back of each product, but it's thought to interpret those ingredient lists unless you have a background in chemistry. Instead of buying and hoping for the best, we can use data science to help us predict which products may be good fits for us. It includes various function programs to do the above mentioned tasks. Data file handling has been effectively used in the program. The automated cosmetic shop management system should deal with the automation of general workflow and administration process of the shop. The main processes of the system focus on customer's request where the system is able to search the most appropriate products and deliver it to the customers. It should help the employees to quickly identify the list of cosmetic product that have reached the minimum quantity and also keep a track of expired date for each cosmetic product. It should help the employees to find the rack number in which the product is placed.It is also Faster and more efficient way.

Student information management system project report ii.pdf

Kamal Acharya

Immunizing Image Classifiers Against Localized Adversary Attacks

gerogepatton

This paper addresses the vulnerability of deep learning models, particularly convolutional neural networks (CNN)s, to adversarial attacks and presents a proactive training technique designed to counter them. We introduce a novel volumization algorithm, which transforms 2D images into 3D volumetric representations. When combined with 3D convolution and deep curriculum learning optimization (CLO), itsignificantly improves the immunity of models against localized universal attacks by up to 40%. We evaluate our proposed approach using contemporary CNN architectures and the modified Canadian Institute for Advanced Research (CIFAR-10 and CIFAR-100) and ImageNet Large Scale Visual Recognition Challenge (ILSVRC12) datasets, showcasing accuracy improvements over previous techniques. The results indicate that the combination of the volumetric input and curriculum learning holds significant promise for mitigating adversarial attacks without necessitating adversary training.

Standard Reomte Control Interface - Neometrix

Neometrix_Engineering_Pvt_Ltd

About Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol. • Remote control: Parallel or serial interface. • Compatible with MAFI CCR system. • Compatible with IDM8000 CCR. • Compatible with Backplane mount serial communication. • Compatible with commercial and Defence aviation CCR system. • Remote control system for accessing CCR and allied system over serial or TCP. • Indigenized local Support/presence in India. • Easy in configuration using DIP switches. Technical Specifications Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol. Key Features Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol. • Remote control: Parallel or serial interface • Compatible with MAFI CCR system • Copatiable with IDM8000 CCR • Compatible with Backplane mount serial communication. • Compatible with commercial and Defence aviation CCR system. • Remote control system for accessing CCR and allied system over serial or TCP. • Indigenized local Support/presence in India. Application • Remote control: Parallel or serial interface. • Compatible with MAFI CCR system. • Compatible with IDM8000 CCR. • Compatible with Backplane mount serial communication. • Compatible with commercial and Defence aviation CCR system. • Remote control system for accessing CCR and allied system over serial or TCP. • Indigenized local Support/presence in India. • Easy in configuration using DIP switches.

addressing modes in computer architecture

ShahidSultan24

Quality defects in TMT Bars, Possible causes and Potential Solutions.

PrashantGoswami42

Maintaining high-quality standards in the production of TMT bars is crucial for ensuring structural integrity in construction. Addressing common defects through careful monitoring, standardized processes, and advanced technology can significantly improve the quality of TMT bars. Continuous training and adherence to quality control measures will also play a pivotal role in minimizing these defects.

Recently uploaded (20)

Railway Signalling Principles Edition 3.pdf

NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...

Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...

H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf

ethical hacking-mobile hacking methods.ppt

J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf

MCQ Soil mechanics questions (Soil shear strength).pdf

CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx

Vaccine management system project report documentation..pdf

Automobile Management System Project Report.pdf

road safety engineering r s e unit 3.pdf

Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B

WATER CRISIS and its solutions-pptx 1234

Democratizing Fuzzing at Scale by Abhishek Arya

Cosmetic shop management system project report.pdf

Student information management system project report ii.pdf

Immunizing Image Classifiers Against Localized Adversary Attacks

Standard Reomte Control Interface - Neometrix

addressing modes in computer architecture

Quality defects in TMT Bars, Possible causes and Potential Solutions.

Motion Estimation in h.264 encoder

1. NS Talal Khaliq Project Supervisor: Dr. Shoiab A Khan

2. Outline

3. Motivation ‘A picture is worth a thousands words.’ If this holds true, how a moving picture (video) which contains so much information is transmitted so efficiently?

4. Problem background Example an single video frame with 720x576 pixels with color depth of 24 bits per pixel with 29.97 frames per second uses approximately 200Mbs, thus for a two hour program at this rate takes over 200 GB which is practically impossible to store.

5. What is Video Compression? It refers to reducing the quantity of data, and is a combination of spatial image compression and temporal motion compensation.

6. Temporal Correlation Spatial Correlation

7. Temporal Model  It reduces redundancy between transmitted frames by forming a predicted frame and subtracting this from the current frame.  The resulting residual (difference) frame contains less energy.  The residual frame is then encoded.

8. Block-based Motion Estimation This method is used to ‘compensate’ for motion of rectangular frames or ‘blocks’ in current frame.  It involves finding a 4x4 sample region in a reference frame that closely matches the current macroblock.  Macroblock with minimum energy is chosen as ‘best match.’

9. Cost Function Mean Absolute Difference(MAD), Mean Squared Error(MSE), where N is the side of macroblock, Cij and Rij are the pixels being compared.       1 0 1 0 2 || 1 N i N j ijij RC N MAD       1 0 1 0 2 2 )( 1 N i N j ijij RC N MSE

10. Motion Compensation  The selected best matching region in the reference frame is subtracted from the current macroblock to produce a residual macroblock.  This residual macroblock is encoded and transmitted together with a motion vector describing the position of the best matching macroblock.  Motion vector is the offset between the current block and the position of the candidate region.

11. Past Frame Current Frame Frame Segmentation Blocks Search Threshold Block Matching Motion Vectors Motion vector Correction Blocks Prediction Error Transmission

12. Example Video Frame 10 Frame 11

13.

14. Adaptive Rood Pattern Search Algorithm  General motion in the frame is usually coherent.  It uses the motion vector of macro block to its immediate left to predict its own motion vector.  It directly puts the search in an area where there is a high probability of finding a good matching block.

15. Predicted motion vector is (3,-2) and step size S, S=max(3,-2)=> 3.

16. Frames Macro block area defined Frame Scan S=max(|X|,|Y|) SDSP Calculate min cost LDSP Start loop again Motion vectors

17. Advantages  We do not have to compute whole frame like in Exhaustive Search.  It does not waste time doing LDSP. It starts with SDSP unlike in Diamond Search.  It does not always start from centre or extreme left and thus saves computation time.

18. Video 2 Frame 110 Frame 113

19.

20. Video 3 Frame 220 Frame 222

Motion Estimation in h.264 encoder

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to Motion Estimation in h.264 encoder

Similar to Motion Estimation in h.264 encoder (20)

Recently uploaded

Recently uploaded (20)

Motion Estimation in h.264 encoder