Cpgan content-parsing generative


KyuYeolJung


Kyonggi Univ. AI Lab.
Index
• Background
• CP-GAN
  • Coarse-to-fine Generative Framework
  • Memory-Attended Text Encoder
  • Fine-grained Conditional Discriminator
• Experiments
• Conclusion

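Background
• Characteristics of previously proposed text-to-image models
  • Most proposals were architectural changes for converting text into an image.
  • This is quite difficult because text and image must be interpreted across modalities.
• CP-GAN
  • Focuses on the parsed content of both the text and the synthesized image.
  • Uses a memory structure.
  • Tailors the conditional discriminator to model fine-grained relationships between words and image sub-regions.
Source code: https://github.com/dongdongdong666/CPGAN
Training code is not included (it seems unlikely to be released).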

Background
• Overall architecture
  • 1: Aligns each word with a variety of visual contexts.
  • 2: Generates the image from a semantic perspective.
  • 3: Checks the consistency between the sentence and the generated image.

Background
• At the time of this presentation (January 2021), CP-GAN is among the algorithms with a high Inception Score.
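For readers unfamiliar with the metric, the Inception Score is defined as exp(E_x[KL(p(y|x) || p(y))]). The sketch below computes it from class probabilities that are already given. In practice p(y|x) comes from a pretrained Inception classifier applied to generated images; that step is omitted here, and the function name `inception_score` is only illustrative.

```python
import math

def inception_score(probs):
    """probs: list of per-image class-probability lists p(y|x), each summing to 1."""
    n = len(probs)
    k = len(probs[0])
    # Marginal class distribution p(y), averaged over all images.
    marginal = [sum(p[c] for p in probs) / n for c in range(k)]
    # Mean KL(p(y|x) || p(y)) over images; terms with p(y|x) = 0 contribute 0.
    mean_kl = 0.0
    for p in probs:
        mean_kl += sum(pc * math.log(pc / mc)
                       for pc, mc in zip(p, marginal) if pc > 0)
    mean_kl /= n
    return math.exp(mean_kl)
```

A higher score needs both confident per-image predictions and a diverse spread of predicted classes across images; identical predictions for every image give the minimum value of 1.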

CP-GAN
• CP-GAN: Coarse-to-fine Generative Framework
[Figure: architecture diagrams of CP-GAN and AttnGAN, side by side]
Elements added on top of AttnGAN:
1. Residual connections are applied -> this eases information transfer between the generators.
2. The discriminator is split -> unconditional and conditional.
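As a rough illustration of the unconditional/conditional split, the sketch below combines the two discriminator terms into one loss. The slides do not give the loss formula, so the binary cross-entropy form, the equal 0.5 weighting, and the names `bce` and `discriminator_loss` are all assumptions made for this example.

```python
import math

def bce(pred, target, eps=1e-12):
    """Binary cross-entropy for a single probability `pred` in [0, 1]."""
    return -(target * math.log(pred + eps)
             + (1.0 - target) * math.log(1.0 - pred + eps))

def discriminator_loss(real_uncond, fake_uncond, real_cond, fake_cond):
    """Each argument is the discriminator's 'real' probability for one sample.

    uncond: judges the image alone (is it realistic?).
    cond:   judges the image together with the text (do they match?).
    """
    uncond = bce(real_uncond, 1.0) + bce(fake_uncond, 0.0)
    cond = bce(real_cond, 1.0) + bce(fake_cond, 0.0)
    # Equal weighting of the two terms is an assumption, not from the slides.
    return 0.5 * uncond + 0.5 * cond
```

A perfect discriminator gives a loss near 0, and one that outputs 0.5 everywhere gives 2·ln 2.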

CP-GAN
• CP-GAN: Coarse-to-fine Generative Framework
  • Generator
  • Discriminator
Notation
𝐼 : image produced by the generator
X : textual description
Note: the encoding technique differs from that of the existing AttnGAN.

CP-GAN
• CP-GAN: Memory-Attended Text Encoder
  • Existing encoding approach
    • Can attend only to the image and sentence currently being trained on.
  • Proposed approach
    • Also takes previously seen images and sentences into account.

CP-GAN
• CP-GAN: Memory-Attended Text Encoder
  • Memory Construction
    • Aligns each word with its visual context (parsing).
Visual feature: (shown as a figure on the original slide)
m : built by selecting the visual features with the highest attention scores and then processing them.
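The memory construction step can be sketched as follows for a single word. The slides only say that the highest-attention visual feature is selected and then processed. The dot-product attention score, the mean over images as the "processing" step, and the name `build_memory` are assumptions for illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def build_memory(word_vec, images):
    """Build the memory vector m for one word.

    word_vec: feature vector of the word.
    images:   list of images whose captions contain the word; each image
              is a list of region feature vectors (same size as word_vec).
    """
    picked = []
    for regions in images:
        # Attention score of every region for this word (dot product, assumed).
        scores = [dot(word_vec, r) for r in regions]
        best = max(range(len(regions)), key=scores.__getitem__)
        picked.append(regions[best])
    dim = len(picked[0])
    # "Processing" is assumed here to be a mean over the selected regions.
    return [sum(v[d] for v in picked) / len(picked) for d in range(dim)]
```

Because the memory aggregates regions from many images, a word's representation reflects visual contexts seen earlier in the dataset, not just the current training pair.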

CP-GAN
• CP-GAN: Memory-Attended Text Encoder
  • Text Encoding with Memory
    • Encodes the text using the memory m built in the previous step.
    • The word embedding (e) is applied together with it.
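A minimal sketch of combining m and e per word is shown below. Fusing by concatenation is an assumption (the slides do not state how the two are combined), and `fuse_words`, `memory`, and `embedding` are hypothetical names. The sequence encoder that would consume the fused vectors is omitted.

```python
def fuse_words(tokens, memory, embedding):
    """Return one fused vector per token: [m_w ; e_w].

    memory:    dict mapping word -> memory vector m (visual context).
    embedding: dict mapping word -> learned word embedding e.
    """
    # Concatenation is assumed; other fusions (e.g. a sum) are possible.
    return [[*memory[t], *embedding[t]] for t in tokens]
```

For example, with `memory = {"bird": [0.0, 1.0]}` and `embedding = {"bird": [0.25]}`, `fuse_words(["bird"], memory, embedding)` yields a single three-dimensional vector carrying both the visual and the textual signal.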

CP-GAN
• CP-GAN: Fine-grained Conditional Discriminator
  • Semantically aligns the input natural-language description with the synthesized image.
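One way to picture word-level alignment is to score each word against the image sub-regions it attends to. The sketch below is an illustration under assumptions, not the discriminator from the paper: it uses softmax attention over dot products, cosine similarity between each word and its attended visual context, and a plain mean over words. All function names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    # Guard against zero-length vectors.
    denom = max(math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)), 1e-12)
    return dot(a, b) / denom

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def word_region_consistency(words, regions):
    """Mean cosine similarity between each word and its attended regions."""
    sims = []
    for w in words:
        # Attention of this word over all image sub-regions.
        weights = softmax([dot(w, r) for r in regions])
        # Attention-weighted visual context for the word.
        context = [sum(a * r[d] for a, r in zip(weights, regions))
                   for d in range(len(w))]
        sims.append(cosine(w, context))
    return sum(sims) / len(sims)
```

An image whose regions cover every word scores close to 1, while an image missing the content of some words scores lower, which is the kind of signal a conditional discriminator can use.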

Experiments
• Quantitative evaluation
CP-GAN outperforms the compared models on all of the evaluation metrics.

Experiments
• Quantitative evaluation
It performs well even with a comparatively lightweight network.

Experiments
• Results from running the model myself
Input: "Several airplanes are parked on an airport runway."
Input: "The room is situated on the dark side of the house."

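Conclusion
• CP-GAN tries to match text and image semantically by parsing both.
  • It modifies the text and image encoders of AttnGAN.
• It tries to strengthen the association between words and image sub-regions.
  • Fine-grained conditional discriminator
• Personal opinion
  • Performance is much improved over previous models.
  • It is also relatively lightweight compared with previous models.
  • However, the quality of the generated images still leaves something to be desired.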

Title slide: CPGAN: Content-Parsing Generative Adversarial Networks for Text-to-Image Synthesis
2021.1.18, 정규열, Artificial Intelligence Lab, Kyonggi University
