CDVS/CDVA

Jongmin Park / jmpark@rcv.sejong.ac.kr

CDVS/CDVA
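● The spread of mobile devices and advances in wireless networking create a need for mobile visual search
○ Limited computing power
○ Limited wireless network bandwidth
● Unlike other video compression standards, the encoding side is what is standardized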

MPEG
● Moving Picture Experts Group
○ MPEG-4
■ Coding of audio-visual objects
○ MPEG-7
■ Multimedia content description

CDVS/CDVA
● CDVS
○ Compact Descriptors for Visual Search
○ MPEG-7, Part 13
○ Reference software: MPEG-7, Part 14
● CDVA
○ Compact Descriptors for Video Analysis
○ MPEG-7, Part 15
○ Reference software(WIP): MPEG-7, Part 16
○ Compression of neural networks: MPEG-7, Part 17

CDVS
● Image retrieval technology based on feature descriptors
○ Local descriptor
■ ALP (A Low-degree Polynomial detector), which operates similarly to SIFT
■ The descriptor is generated by compressing the local features and their location information
○ Global descriptor
■ Aggregates the local descriptors
Source: http://www.tnt.uni-hannover.de/staff/cordes/AVSS2014-poster.pdf

CDVA
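● Extends the technology from image search to video search
● Compresses the feature descriptors of video frames, which are highly redundant with one another
● Builds on CDVS
○ Descriptors are not computed for the video as a whole; they are computed on each image decoded from an individual frame
○ In other words, CDVS is applied frame by frame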

Deep feature descriptor
● VGG16
● NIP (Nested Invariance Pooling)

Encode segment descriptor
1. Select the representative frame
2. Determine the coding order
3. Compute the global descriptor differences
4. Encode the global descriptors
5. Select and filter the local descriptors
6. Encode the local descriptors
7. Compute the deep feature descriptor differences
8. Encode the deep feature descriptors
9. Generate the header bitstream
10. Generate the descriptor block

CDVA Encoding
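[Figure: CDVA encoding flowchart; decision label "<= verTh"]
● Threshold names in the flowchart and in the code
○ segTh -> shot_cut_th
○ verTh -> shot_ver_th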

Decode Frame
● extract (L1753) in CdvaImpl.cpp
● (L1857) isVideo
● (L1859) Creates an instance through VideoCapture from the given path
● (L1908) Reads a frame and stores it in colorimage

Spatial subsampling
● (L1907) skip_nframes: only a subset of the frames is selected
○ (L1427) Defined as 8 | (L1491) changed according to the query bitrate
● (L1554) skipFrames(), the function that skips frames
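The frame skipping above can be sketched as follows. This is a minimal illustration, not the reference code: it assumes that skip_nframes frames are skipped after each analyzed frame, so every (skip_nframes + 1)-th frame is kept. The function name subsample_indices is hypothetical.

```python
def subsample_indices(total_frames, skip_nframes=8):
    """Return the indices of the frames that would be analyzed.

    Assumption: after each analyzed frame, skip_nframes frames are skipped,
    so the step between analyzed frames is skip_nframes + 1.
    """
    step = skip_nframes + 1
    return list(range(0, total_frames, step))


# With the default of 8, a 30-frame clip keeps 4 frames.
kept = subsample_indices(30)
```

If the reference software instead keeps every skip_nframes-th frame, only the step computation changes.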

Temporal subsampling (key frames)

Compute color histogram
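● Called at (L1912); computeColorHistogram() is defined at (L1650)
○ Uses OpenCV's calcHist() to compute a histogram for each of the B, G, and R channels
○ Normalizes each channel to the range [0, 1]
● (L1914~1971) isFirst: for the first frame, the whole pipeline runs regardless of the threshold values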
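A per-channel histogram of this kind could look like the sketch below. It is written in plain Python rather than with OpenCV's calcHist(), and two details are assumptions, since the slides do not state them: the number of bins (32 here) and the use of min-max scaling to reach the [0, 1] range. The function names are hypothetical.

```python
def channel_histogram(values, bins=32, value_range=256):
    """Histogram of one 8-bit channel, min-max normalized to [0, 1].

    Assumption: bin count and min-max scaling are illustrative choices.
    """
    hist = [0] * bins
    bin_width = value_range / bins
    for v in values:
        hist[min(int(v / bin_width), bins - 1)] += 1
    lo, hi = min(hist), max(hist)
    if hi == lo:
        return [0.0] * bins  # flat histogram: nothing to scale
    return [(h - lo) / (hi - lo) for h in hist]


def compute_color_histogram(pixels, bins=32):
    """pixels: iterable of (b, g, r) tuples; returns (hist_b, hist_g, hist_r)."""
    pixels = list(pixels)
    return tuple(
        channel_histogram([p[c] for p in pixels], bins) for c in range(3)
    )


hist_b, hist_g, hist_r = compute_color_histogram(
    [(0, 0, 0), (255, 128, 0), (255, 128, 0)]
)
```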

diff() > kfTh?
● (L1977) distance = diffColorHistogram()
○ diffColorHistogram() is defined at (L1679)
● (L1978) distance > drop_frame_th
○ drop_frame_th is used instead of kfTh
○ (L1427) drop_frame_th=0.7
○ Otherwise the value changes according to the query bitrate
● (FALSE) Skip frame
● (TRUE) Store color histogram
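The key frame test can be sketched as a loop that compares each histogram with the last stored one. The distance used by diffColorHistogram() is not described in the slides, so l1_distance below is a hypothetical stand-in (summed absolute bin difference over the three channels); only the threshold name and its default of 0.7 come from the slides.

```python
def l1_distance(hist_a, hist_b):
    """Hypothetical stand-in for diffColorHistogram().

    Sums the absolute bin differences over all channels.
    """
    return sum(
        abs(x - y)
        for chan_a, chan_b in zip(hist_a, hist_b)
        for x, y in zip(chan_a, chan_b)
    )


def select_key_frames(histograms, drop_frame_th=0.7, distance=l1_distance):
    """Return the indices of frames kept as key frames."""
    kept = []
    last = None  # plays the role of the *_last histograms
    for idx, hist in enumerate(histograms):
        # The first frame is always processed, regardless of the threshold.
        if last is None or distance(hist, last) > drop_frame_th:
            kept.append(idx)
            last = hist  # store the histogram for the next comparison
        # Otherwise the frame is skipped and `last` stays unchanged.
    return kept


h0 = ([1.0, 0.0],) * 3
h1 = ([0.9, 0.0],) * 3   # close to h0: skipped
h2 = ([0.5, 0.5],) * 3   # far from h0: kept
key_frames = select_key_frames([h0, h1, h2, h2])
```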

Skip frame
● diff() <= kfTh
● There is no separate else branch for the condition above
● Because the video (L1857) is read in a do-while loop, execution simply proceeds to skipFrames(), as in Spatial subsampling

Store color histogram
● diff() > kfTh
● (L2037) Each channel's histogram is saved as *_last and compared with the next histogram

Extract SCFV descriptor
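● (L1989) cdvsclient->encode() fills the desc instance with the CDVS global descriptor
○ The CDVS result contains both global descriptor and local descriptor information
● After this point, no check of a local descriptor or deep feature descriptor flag can be found
○ CDVA was compiled with the HAVE_NIP option, so deep feature descriptors can be extracted, but the -D option must be passed when running CDVA
● Does the dotted line indicate that the step is optional?
○ The document (contribution) says that either the local or the deep feature descriptor is chosen for each segment, but this could not be found in the code
○ The choice is simply made by a flag when the program is first launched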

Extract CDVS local descriptor
● No separate local descriptor extraction could be found in the code
○ It is performed inside CDVS

Extract NIP descriptor
● (L2042 ~ L2053)
● An option is required to enable calDeepSig at (L215) of cdva.cpp
○ cdva extract FILE_LIST.txt 16 -D
○ Once calDeepSig is enabled, the deep feature descriptor is always computed
● (L2042) colorimage resize
○ width: 640
○ height: 480
● (L2052) Calls extract_deep_feature2(), implemented at h_tensor.cpp (L183)

● extract_deep_feature2()
○ height: 480, width: 640, feature_dim: 512
○ (L218) Rotation in 90-degree steps
○ (L233) Generates the VGG pool5 feature map for each rotation
■ emb_map, emb_90_map, …
○ (L256) xm_data, eigvec_data, eigval_data for PCA
● (L277) region_sample(), defined at (L102)
○ (height, width) / 32 = (15, 20)
○ (L144 ~ L171) Performs the sampling predefined in Appendix B
○ (L175) Applies normalization_2d() to vec; the function is defined at (L60)
■ square-root pooling
○ (L176) Runs PCA and then normalization_2d(); PCA is defined at (L92)

● (L277) region_sample(), defined at (L102)
○ (L178) OpenCV's reduce(), column(0) sum = average pooling
● extract_deep_feature2()
○ (L282) Concatenates the four rotation results with memcpy()
○ (L289) reduce(), column(0) max = max pooling
○ The desc argument passed by the caller is received as frame inside the function
■ Up to 512 values of nip_feat_max are written to (frame).deepSignature.deep_feature(i, 0)
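The last two pooling stages described above (a sum over sampled regions, then an element-wise max over the four rotations) can be sketched in plain Python. This is a simplified illustration under stated assumptions: PCA is omitted, and l2_normalize is a hypothetical stand-in for normalization_2d(), whose exact behavior the slides do not describe.

```python
import math


def l2_normalize(vec):
    """Assumed stand-in for normalization_2d(): scale to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def pool_regions(region_vectors):
    """Normalize each region vector, then sum element-wise over regions."""
    normalized = [l2_normalize(v) for v in region_vectors]
    return [sum(col) for col in zip(*normalized)]


def pool_rotations(per_rotation_vectors):
    """Element-wise max over the per-rotation vectors."""
    return [max(col) for col in zip(*per_rotation_vectors)]


# Two rotations, two regions each, feature_dim = 2 for readability.
rot_0 = pool_regions([[3.0, 4.0], [0.0, 2.0]])
rot_90 = pool_regions([[1.0, 0.0], [1.0, 0.0]])
descriptor = pool_rotations([rot_0, rot_90])
```

In the slides the real vectors have feature_dim 512 and there are four rotations; the small sizes here only keep the arithmetic easy to follow.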

diff() > segTh?
● (L1981) distshot = diffColorHistogram()
○ diffColorHistogram() is defined at (L1679)
● (L1993) distshot > shot_cut_th
○ shot_cut_th is used instead of segTh
○ (L1478) shot_cut_th = 1.98
● (FALSE) Store descriptor for segment
● (TRUE) Compute SCFV similarity

Store descriptor for segment
● diff() <= segTh
● diff() > verTh
● (L2056) shot.keyframes.push_back(desc);
● Accumulates the keyframe descriptors of the segment (= shot)

Compute SCFV similarity
● diff() > segTh
● (L1996) pp = cdvsserver->match(desc, prevDesc, …, 3);
○ Uses the CDVS match function
○ CdvsPoint.h (L63) MATCH_TYPE_GLOBAL = 3
■ compute only global matching score

diff() <= verTh?
● (L1997) pp.global <= shot_ver_th
○ shot_ver_th is used instead of verTh
○ Unlike in the contribution, the inequality points in the opposite direction
○ (L1479) shot_ver_th = 18
● (FALSE) Store descriptor for segment
● (TRUE) Store histogram and SCFV for new segment

Store histogram and SCFV for new segment
● (L2000 ~ L2002) Stores the current histogram, which is needed for the diff() > segTh computation
● (L2008) Threshold for ignoring short shots, minShotLen = 0
● (L2060) prevDesc = desc;
○ Saves the current desc so that it can be compared with the next keyframe
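Putting the two tests together, the segment decision can be sketched as below. The thresholds and their defaults (shot_cut_th = 1.98, shot_ver_th = 18) are from the slides; the input format is an assumption made for illustration: each key frame is given as a precomputed pair (distshot, global_score) instead of calling diffColorHistogram() and cdvsserver->match(). The function name split_into_segments is hypothetical.

```python
def split_into_segments(key_frames, shot_cut_th=1.98, shot_ver_th=18):
    """Group key frame indices into segments (shots).

    key_frames: list of (distshot, global_score) pairs, one per key frame.
    A new segment starts only when the histogram distance exceeds
    shot_cut_th AND the global matching score is at most shot_ver_th.
    """
    segments = []
    for idx, (distshot, global_score) in enumerate(key_frames):
        is_cut = distshot > shot_cut_th and global_score <= shot_ver_th
        if not segments or is_cut:
            segments.append([idx])      # start a new segment
        else:
            segments[-1].append(idx)    # store descriptor for current segment
    return segments


frames = [(0.0, 0.0), (0.5, 40.0), (2.5, 30.0), (2.5, 5.0), (0.1, 50.0)]
segments = split_into_segments(frames)
```

Frame 2 exceeds the histogram threshold but still matches the previous key frame well, so it stays in the first segment; frame 3 fails both checks and opens a new one.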

Segment descriptor encoding
● (L2505 ~ L2858)

Segment descriptor encoding
1. (L2525 ~ L2542) Select the representative frame
2. (L2555 ~ L2591) Determine the coding order
3. (L2597 ~ L2618) Compute the global descriptor differences
4. (L2620 ~ L2659) Encode the global descriptors
5. (L2669 ~ L2713) Select and filter the local descriptors
6. (L2721 ~ L2743) Encode the local descriptors
7. (L2756 ~ L2812) Compute the deep feature descriptor differences
8. (L2815 ~ L2853) Encode the deep feature descriptors
9. (L1946) Generate the header bitstream
10. (L2070 ~ L2090) Generate the descriptor block
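The idea behind steps 1 and 3 (pick a representative frame, then code the other frames as differences against it) can be illustrated with a toy example. The slides do not say how the reference software selects the representative frame or computes the difference, so everything below is an assumption: descriptors are modeled as integers used as bit strings, the representative is the medoid under Hamming distance, and the difference is a bitwise XOR. All function names are hypothetical.

```python
def hamming(a, b):
    """Number of differing bits between two integer-coded descriptors."""
    return bin(a ^ b).count("1")


def choose_representative(descriptors):
    """Assumed rule: the medoid, i.e. the smallest total Hamming distance."""
    return min(
        range(len(descriptors)),
        key=lambda i: sum(hamming(descriptors[i], d) for d in descriptors),
    )


def encode_differences(descriptors):
    """Return (rep_index, rep_descriptor, XOR residuals of the other frames)."""
    rep = choose_representative(descriptors)
    residuals = [
        d ^ descriptors[rep] for i, d in enumerate(descriptors) if i != rep
    ]
    return rep, descriptors[rep], residuals


def decode_differences(rep, rep_desc, residuals):
    """Invert encode_differences()."""
    out = [r ^ rep_desc for r in residuals]
    out.insert(rep, rep_desc)
    return out


descs = [0b1100, 0b1110, 0b0110]
rep, rep_desc, residuals = encode_differences(descs)
restored = decode_differences(rep, rep_desc, residuals)
```

When neighboring key frames are similar, the residuals contain few set bits, which is what makes them cheaper to entropy-code than the raw descriptors.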
