Paper introduction: Learn2Augment: Learning to Composite Videos for Data Augmentation in Action Recognition

Shreyank N Gowda, Marcus Rohrbach, Frank Keller, Laura Sevilla-Lara, "Learn2Augment: Learning to Composite Videos for Data Augmentation in Action Recognition" ECCV2022 https://www.ecva.net/papers/eccv_2022/papers_ECCV/papers/136910234.pdf https://arxiv.org/abs/2206.04790

Learn2Augment: Learning to Composite Videos for Data Augmentation in Action Recognition
Shreyank N Gowda, Marcus Rohrbach, Frank Keller, and Laura Sevilla-Lara, ECCV2022
2022/11/25

■ Data Augmentation
• ActorCut [Zou+, arXiv2021], VideoMix [Yun+, arXiv2020]
• [Zhang+, arXiv2019]
• GAN
• self-paced selection
■ Semi-supervised Video Action Recognition
• VideoSSL [Jing+, WACV2021]
• Temporal Contrastive Learning (TCL) [Singh+, CVPR2021]

■ Sample selection
• [Huang+, CVPR2018]
• SMART [Gowda+, arXiv2020]
• SCSampler [Korbar+, ICCV2019]
• RL [Yoon+, PMLR2020]

Learn2Augment
1. Semantic Match
2. Selector
   1. Selector 𝜔
   2. Video Composite
   3. Classifier & Selector Reward

Semantic Matching
■ [Choi+, NIPS2019]
■ Sen2vec [Pagliardini+, arXiv2018]
• 𝑐! 𝑐" 𝑉!, 𝑉"
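As a rough illustration of semantic matching, the sketch below ranks classes by cosine similarity between class-name embeddings. The toy 3-dimensional vectors, the class names, and the helpers `cosine` and `most_similar_class` are hypothetical; the slide uses Sen2vec embeddings, which are not reproduced here. Picking the most similar class is also an assumption, since the matching rule itself is not visible in the surviving slide text.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar_class(query, embeddings):
    """Return the class other than `query` whose embedding is closest to it."""
    candidates = [name for name in embeddings if name != query]
    return max(candidates, key=lambda name: cosine(embeddings[query], embeddings[name]))


# Hypothetical toy vectors standing in for sentence embeddings of class names.
toy_embeddings = {
    "playing guitar": [0.9, 0.1, 0.0],
    "playing violin": [0.8, 0.2, 0.1],
    "surfing": [0.0, 0.1, 0.9],
}

match = most_similar_class("playing guitar", toy_embeddings)
```

Only class labels are compared here, so candidate pairs can be filtered before any video is decoded.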

Selector
■ Selector Architecture
• 3D ResNet-18 [He+, arXiv2016] + MLP
• 3D ResNet-18 validation loss

Training Selector
■ Selector
• RL
1.
• 𝐷_val: validation set ℒ_cls:
• 𝑓(: 𝑉): 𝑦):
• 𝛿:
• 𝑆:

Training Selector
2. REINFORCE [Williams+, Machine Learning 1992]
• 𝐷*:
• 𝐷+:
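A minimal sketch of a REINFORCE update for a selector that makes a binary keep/skip decision per candidate pair. It assumes a logistic policy over a hand-made feature vector; the names `select_prob`, `reinforce_step`, `train`, and the reward function are hypothetical. In the slides the selector is a 3D ResNet-18 + MLP and the reward is tied to the classifier's loss on the validation set, neither of which is reproduced here.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def select_prob(weights, features):
    """Probability that the policy selects a candidate pair."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)))


def reinforce_step(weights, features, action, reward, baseline, lr):
    """One REINFORCE update for a Bernoulli (select / skip) policy.

    For a sigmoid policy, grad log pi(action) = (action - p) * features.
    """
    p = select_prob(weights, features)
    advantage = reward - baseline
    return [w + lr * advantage * (action - p) * x for w, x in zip(weights, features)]


def train(pairs, reward_fn, steps=200, lr=0.5, seed=0):
    """Sample actions from the policy and update it with a moving-average baseline."""
    rng = random.Random(seed)
    weights = [0.0] * len(pairs[0])
    baseline = 0.0
    for _ in range(steps):
        features = rng.choice(pairs)
        action = 1 if rng.random() < select_prob(weights, features) else 0
        reward = reward_fn(features, action)
        weights = reinforce_step(weights, features, action, reward, baseline, lr)
        baseline = 0.9 * baseline + 0.1 * reward
    return weights


# Hypothetical reward: selecting pairs with a positive second feature is "good".
def toy_reward(features, action):
    return 1.0 if action == (1 if features[1] > 0 else 0) else -1.0


learned = train([[1.0, 1.0], [1.0, -1.0]], toy_reward)
```

The baseline only reduces the variance of the update; it does not change the expected gradient direction.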

Video Compositing
1.
• MaskRCNN [He+, ICCV2017]
• MaskRCNN COCO [Lin+, ECCV2014]
2.
• [Liu+, ECCV2018]
3.
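The final compositing step can be sketched as a per-pixel mask blend. This assumes, as a reading of the three steps above, that a binary actor mask has already been obtained from the foreground clip (e.g. with MaskRCNN) and that the background clip has already been cleaned up by inpainting. Frames are plain nested lists here, and `composite_frame` and `composite_video` are hypothetical helper names.

```python
def composite_frame(foreground, background, mask):
    """Take the foreground pixel where mask is 1, the background pixel elsewhere."""
    return [
        [fg if m else bg for fg, bg, m in zip(fg_row, bg_row, m_row)]
        for fg_row, bg_row, m_row in zip(foreground, background, mask)
    ]


def composite_video(fg_frames, bg_frames, masks):
    """Composite frame by frame; clips are assumed to match in length and size."""
    return [
        composite_frame(fg, bg, m)
        for fg, bg, m in zip(fg_frames, bg_frames, masks)
    ]


# Toy 2x2 single-channel frame: 1 = actor pixel, 0 = background pixel.
fg = [[1, 1], [1, 1]]
bg = [[0, 0], [0, 0]]
mask = [[1, 0], [0, 1]]
frame = composite_frame(fg, bg, mask)
```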

Training Classifier
• 𝑦
• 𝛾 = ∑ […], 𝛼 = 4
• 𝑦/ 𝑦0

• Few-shot
• n+k k
• Novel-class
• 1~5
• Seen-class
• Standard split [Zhang+, arXiv2020]
• Truze split [Gowda+, arXiv2021]
• Sports1M [Karpathy+, CVPR2014]
■ Datasets
• HMDB51 [Kuehne+, ICCV2011]
• UCF101 [Soomro+, arXiv2012]
• Kinetics-400 [Kay+, arXiv2017]
• Kinetics-100

2
• Top-1 accuracy
• 5~50%
• L2A Pre-training
• Kinetics-400 Selector

3
■ Few-shot
• Top-1 accuracy
• S: Standard split, T: Truze split
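For reference, top-1 accuracy counts a prediction as correct only when the highest-scoring class equals the ground-truth label. A minimal sketch follows; the function name `top1_accuracy` and the toy scores are illustrative and not taken from the paper.

```python
def top1_accuracy(scores, labels):
    """Fraction of samples whose highest-scoring class equals the true label."""
    if not labels:
        raise ValueError("labels must be non-empty")
    correct = 0
    for row, label in zip(scores, labels):
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)


# Three toy samples over two classes; the third prediction is wrong.
toy_scores = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]
toy_labels = [1, 0, 0]
accuracy = top1_accuracy(toy_scores, toy_labels)
```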

■ Learn2Augment
• 8.6%
• Few-shot: 3.7%
• 17.4%

1
• 13.4%

4
• L2A
