SSII2021 [SS2] Deepfake Generation and Detection – An Overview
June 10 (Thu), 14:30–15:00
Speaker: Huy H. Nguyen (SOKENDAI / National Institute of Informatics)
Abstract: Advances in machine learning and their interplay with computer graphics allow us to easily generate high-quality images and videos. State-of-the-art manipulation methods enable the real-time manipulation of videos obtained from social networks, and it is even possible to generate videos from a single portrait image. By combining these methods with speech synthesis, attackers can create a realistic video of a person saying something they never said and distribute it on the internet. This erodes social trust, causes confusion, and harms people's reputations. Several countermeasures have been proposed to tackle this problem, ranging from hand-crafted features to convolutional neural networks. Some countermeasures take images as input, while others leverage temporal information in videos. Their output can be binary (bona fide or fake), multi-class (deepfake detection), or a segmentation mask (manipulation localization). Since deepfake methods evolve rapidly, dealing with unseen ones remains challenging; some solutions have been proposed, but the problem is not completely solved. In this talk, I will provide an overview of both deepfake generation and deepfake detection/localization. I will focus mainly on the image and video domains and also introduce some audiovisual methods on both sides. Open discussion and future directions are also included.
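The abstract distinguishes three detector output types: binary, multi-class, and a segmentation mask. A minimal sketch of that interface, with a stand-in scoring function rather than a real detector; all names (`DetectionResult`, `detect`) and the thresholds are hypothetical:

```python
# Illustrative sketch of the three detector output types listed above:
# binary (bona fide vs. fake), multi-class (which manipulation family), and
# a per-pixel segmentation mask (manipulation localization).
from dataclasses import dataclass
from typing import List

@dataclass
class DetectionResult:
    binary_label: str          # "bona fide" or "fake"
    method_label: str          # e.g. "face-swap", "reenactment", "bona fide"
    mask: List[List[int]]      # per-pixel 0/1 manipulation mask

def detect(scores: List[List[float]], threshold: float = 0.5) -> DetectionResult:
    """Turn a (hypothetical) per-pixel manipulation score map into all three outputs."""
    mask = [[1 if s >= threshold else 0 for s in row] for row in scores]
    manipulated = sum(sum(row) for row in mask)
    total = sum(len(row) for row in mask)
    is_fake = manipulated > 0
    # A real system would classify the manipulation family with a trained model;
    # here we just guess from the manipulated-area ratio for illustration.
    method = "bona fide" if not is_fake else (
        "face-swap" if manipulated / total > 0.25 else "reenactment")
    return DetectionResult("fake" if is_fake else "bona fide", method, mask)
```

The point is only that one score map can back all three task formulations the abstract names; real detectors produce these outputs with separate heads or models.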
The Bicycle: A Human-Powered Vehicle – Adam V
This integrated media series is about the bicycle. Throughout time, bikes have provided a means of transportation, recreation and physical fitness. There are three common types of biking which include cross country, downhill/free-ride and street riding, all of which are strongly reliant on endurance, handling/control and self-reliance.
Renewable Sources of Energy – Dynamo in a Bicycle – adithebest15
How can we use a renewable energy source so that a bicycle can emit light at night, with no money spent?
This presentation depicts the production of electricity by simply pedaling your bicycle.
Of course it is a little expensive, but surely it is better than the battery system...
You can apply it too in your bicycle!
To learn more, download the PowerPoint presentation.
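A back-of-the-envelope sketch of the dynamo-vs-battery comparison the presentation makes. The figures (a 3 W dynamo output, roughly 2.5 Wh per alkaline AA cell) are typical published values, assumed here purely for illustration:

```python
# Rough energy comparison: bicycle dynamo lighting vs. disposable batteries.
# Both constants are assumptions chosen to be in the typical range.
DYNAMO_POWER_W = 3.0        # typical bottle/hub dynamo rating at 6 V
AA_CELL_WH = 2.5            # approximate usable energy in one alkaline AA cell

def dynamo_energy_wh(hours_ridden: float) -> float:
    """Electrical energy generated while riding with the dynamo engaged."""
    return DYNAMO_POWER_W * hours_ridden

def aa_cells_equivalent(hours_ridden: float) -> float:
    """How many AA cells the same lighting energy would consume."""
    return dynamo_energy_wh(hours_ridden) / AA_CELL_WH

# One hour of night riding per day for a month:
monthly_wh = dynamo_energy_wh(30)
cells = aa_cells_equivalent(30)
```

Under these assumptions a month of nightly riding generates about 90 Wh, the energy of a few dozen AA cells, which is the sense in which the dynamo beats the battery system over time despite its up-front cost.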
Natalie shares how irrationality, timing and escalating consumer expectations are three important concepts that impact how we think about (and practice) innovation and consumer experience design.
Crowd Agents: Interactive Crowd-Powered Systems in the Real World – Jeffrey Bigham
In this talk, I discuss several interactive crowd-powered systems that help people address real-world problems. For instance, VizWiz sends questions blind people have about their visual environment to the crowd, Legion allows outsourcing of desktop tasks to the crowd, and Scribe allows the crowd to caption audio in real time. Thousands of people have engaged with these systems, providing an interesting look at how end users want to interact with crowd work. Collectively, these systems illustrate a new approach to human computation in which the dynamic crowd is provided the computational support needed to act as a single, high-quality agent. The classic advantage of the crowd has been its wisdom, but our systems are beginning to show how crowd agents can surpass even expert individuals on motor and cognitive performance tasks.
This slide deck introduces the latest status of the SIGVerse project, as presented at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2016.
Final lecture from the COMP 4010 course on Virtual and Augmented Reality. This lecture was about research directions in Augmented Reality. Taught by Mark Billinghurst on November 1st, 2016 at the University of South Australia.
Top 5 most viewed articles from academia in 2019 - gerogepatton
The International Journal of Artificial Intelligence & Applications (IJAIA) is a bimonthly open-access peer-reviewed journal that publishes articles contributing new results in all areas of artificial intelligence and its applications. It is an international journal intended for professionals and researchers in all fields of AI, as well as programmers and software and hardware manufacturers. The journal also publishes special issues on emerging areas in artificial intelligence and its applications.
Assignment of Design Research Method (Chen Mengdie) – cocoachen1992
This assignment for the Design Research Method course is based on the book "Delft Design Guide". Prof. Tan, the tutor of this course, asked each student to choose a method from the book and give a presentation about the selected method. We were also asked to find at least three academic papers related to the method. My focus is three-dimensional models, which are also called prototypes. The presentation is divided into six parts, as follows.
Part 1: What Are Three-dimensional Models?
Part 2: When Can We Use Three-dimensional Models?
Part 3: How to Use Three-dimensional Models?
Part 4: Why Do We Use Three-dimensional Models during the Design Process?
Part 5: Three related papers/case studies: FingerReader, LaserOrigami, and FaBrickation. All three papers come from the CHI conference.
Part 6: Design thinking and my own design brief by using this method.
People often use computers other than their own to access web content, but blind users are restricted to computers equipped with expensive, special-purpose screen-reading programs. WebAnywhere is a web-based, self-voicing web browser that enables blind web users to access the web from almost any computer that can produce sound, without installing new software. The system could serve as a convenient, low-cost solution for blind users on the go, for blind users unable to afford a full screen reader, and for web developers targeting accessible design. This paper overviews existing solutions for mobile web access for blind users and presents the design of the WebAnywhere system. WebAnywhere generates speech remotely and uses prefetching strategies designed to reduce perceived latency. A user evaluation of the system is presented, showing that blind users can use WebAnywhere to complete tasks representative of what they might want to do on computers that are not their own. A survey of public computer terminals shows that WebAnywhere can run on most.
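The paper describes WebAnywhere's actual prefetching strategies; the following is only a minimal sketch of the general idea, assuming an LRU cache keyed by element text, with `synthesize` standing in for the remote text-to-speech request:

```python
# Sketch of prefetching to hide speech-synthesis latency: while the user
# listens to the current element, speculatively synthesize the elements they
# are likely to visit next, so moving the cursor plays cached audio instantly.
from collections import OrderedDict

def synthesize(text: str) -> bytes:
    """Stand-in for a slow server-side text-to-speech request."""
    return text.encode("utf-8")  # pretend this is audio

class SpeechPrefetcher:
    def __init__(self, max_entries: int = 64):
        self.cache = OrderedDict()  # text -> audio, LRU order
        self.max_entries = max_entries

    def prefetch(self, likely_next: list) -> None:
        """Synthesize ahead of need, e.g. the next few elements in reading order."""
        for text in likely_next:
            if text not in self.cache:
                self._put(text, synthesize(text))

    def speak(self, text: str):
        """Return (audio, was_cached). A cache hit avoids the round trip."""
        if text in self.cache:
            self.cache.move_to_end(text)
            return self.cache[text], True
        audio = synthesize(text)
        self._put(text, audio)
        return audio, False

    def _put(self, text: str, audio: bytes) -> None:
        self.cache[text] = audio
        if len(self.cache) > self.max_entries:
            self.cache.popitem(last=False)  # evict least recently used
```

The perceived-latency win comes entirely from `prefetch` being called during idle listening time, so that the subsequent `speak` is a cache hit.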
Read| The latest issue of The Challenger is here! We are thrilled to announce that our school paper has qualified for the NATIONAL SCHOOLS PRESS CONFERENCE (NSPC) 2024. Thank you for your unwavering support and trust. Dive into the stories that made us stand out!
Unit 8 – Information and Communication Technology (Paper I) – Thiyagu K
These slides describe the basic concepts of ICT, the basics of email, emerging technology, and digital initiatives in education. This presentation aligns with the UGC Paper I syllabus.
Macroeconomics – Movie Location
This will be used as part of your Personal Professional Portfolio once graded.
Objective:
Prepare a presentation or a paper using research, basic comparative analysis, data organization and application of economic information. You will make an informed assessment of an economic climate outside of the United States to accomplish an entertainment industry objective.
How to Add Chatter in the Odoo 17 ERP Module – Celine George
In Odoo, the chatter is like a chat tool that helps you collaborate on records. You can leave notes and track activities, making it easier to communicate with your team and partners. Inside the chatter, all communication history, activities, and changes are displayed.
Delivering Micro-Credentials in Technical and Vocational Education and Training – AG2 Design
Explore how micro-credentials are transforming Technical and Vocational Education and Training (TVET) with this comprehensive slide deck. Discover what micro-credentials are, their importance in TVET, the advantages they offer, and the insights from industry experts. Additionally, learn about the top software applications available for creating and managing micro-credentials. This presentation also includes valuable resources and a discussion on the future of these specialised certifications.
For more detailed information on delivering micro-credentials in TVET, visit https://tvettrainer.com/delivering-micro-credentials-in-tvet/
A review of the growth of the Israel Genealogy Research Association Database Collection over the last 12 months. Our collection has now passed the 3 million mark and is still growing. See which archives have contributed the most, the different types of records we have, and which years have had records added. You can also see what we have planned for the future.
Thinking of getting a dog? Be aware that breeds like Pit Bulls, Rottweilers, and German Shepherds can be loyal and dangerous. Proper training and socialization are crucial to preventing aggressive behaviors. Ensure safety by understanding their needs and always supervising interactions. Stay safe, and enjoy your furry friends!
Group Presentation 2 – Economics – Ariana Busciglio
The Design of Human-Powered Access Technology
1. The Design of Human-Powered Access Technology
Jeffrey P. Bigham, University of Rochester
Richard E. Ladner, University of Washington
Yevgen Borodin, Stony Brook University
2. Introduction History Examples Dimensions Application
Human-Powered Access Technology – technology that facilitates and, ideally, improves interactions between disabled people and human assistants
University of Rochester Human-Computer Interaction Jeffrey P. Bigham
3. Human Power in History
• People Rely on Assistance from Others
– to overcome small accessibility problems
– prevent small challenges from becoming bigger
4. Managing Expectations
• Structures Around Assistance
– sign language interpreters
– volunteer training / accountability
5. Remote Services
• What has changed is Connectivity
6. Remote Assistance
Video Relay Services
Real-time Captioning
7. Crowdsourcing / Human Computation
For an overview see: [1] Quinn and Bederson. "Human Computation: A Survey and Taxonomy of a Growing Field." CHI 2011.
8. Bigham et al. Nearly Real-Time Answers to Visual Questions. UIST 2010.
9. Examples
ESP Game
VizWiz
Social Accessibility Project
Solona
IQ Engines / oMoby
10. Examples
MAP Lifeline
ESP Game
Tactile Graphics Project
VizWiz
GoBraille
Social Accessibility Project
Respeaking
Remote Real-Time Reading Service
Solona
ASL-STEM Forum
IQ Engines / oMoby
Remote Real-Time Captioning
Bookshare
Scribe4Me
Video Relay Services
11. Design Dimensions
Initiative: who initiates help?
• End User
• Worker
• Organization
12. Design Dimensions
Latency: how long does it take to get help?
• Interactive
• Short Delay
• Undetermined
13. Design Dimensions
Confidentiality: user expectations
• Trusted Worker Pools
• User Feedback
• No Guarantees
14. Design Dimensions
Broader Context:
• User
• Worker
• Community
15. Social Accessibility vs. Solona
-- two systems that have sighted people describe web images for blind people --
Similarities: Functionality; Target Disability: Blind
Differences: Experts vs. Crowd; Latency
16. VizWiz vs. Scribe4Me
-- different target disabilities but similar goal --
Similarities: Latency; User Initiative
Differences: Target Disability; Accuracy; Source
17. Areas for Future Research
18. Areas for Future Research
• Latency
19. Areas for Future Research
• Latency
• Broader Context
20. Areas for Future Research
• Latency
• Broader Context
• Other Disabilities
21. Conclusion
• Human-Powered Access Technology
• Identified 15 Examples
• Isolated 13 Design Dimensions
• Useful for Evaluating, Comparing, Motivating
22. TACCESS Special Issue: Crowdsourcing Accessibility
Submissions due 12/21/2011
http://www.gccis.rit.edu/taccess
23. crowdability.org
Hi everyone, I’m Jeff Bigham from the University of Rochester. Today, I’m going to be talking about the design of what we call “Human-Powered Access Technology.” This paper is joint work with Richard Ladner from the University of Washington, and Yevgen Borodin from Stony Brook University.
First, we define human-powered access technology as “technology that facilitates and, ideally, improves interactions between disabled people and human assistants.” The goal of our paper was to motivate the idea of human-powered access technology, and to put forth a taxonomy that would help those of us working on technologies in this space to more easily (1) talk about our work, (2) evaluate and compare new work with what has come before, and (3) suggest directions within this area that might be good targets for future research.
It turns out that people with disabilities have relied for centuries on the assistance of people in their communities to overcome small accessibility problems experienced in everyday life. A blind person may ask a fellow traveler for the number of an approaching bus, or a person with a motor impairment may ask for assistance with small physical tasks. Individually, these are small challenges, but the assistance provided helps to prevent these small problems from becoming bigger ones.
Initially, this help was provided informally – for instance, members of a religious congregation may provide informal sign language interpreting. But, far from being passive recipients of this help, people with disabilities have formed organizing structures around the assistance they receive in order to ensure that their expectations are met. For instance, sign language interpreters agree to strict confidentiality agreements that prevent them from injecting their own comments and from repeating conversations. Volunteers are often trained, and held accountable for the assistance they provide. Almost all professional organizations providing services to people with disabilities have a code of ethics that requires confidentiality, respect for customers, and the responsibility of assistants to only take on jobs for which they are qualified.
What has changed is connectivity – no longer must an assistant be co-located to provide support.
Leading the way in remote human assistance were people with disabilities – in the form of such technologies as video relay services, which connect sign language speakers to hearing people on the phone, and remote real-time captionists who can transcribe live events remotely.
To us, this sort of technology presaged the recent popularity of crowdsourcing and human computation in computing. The main idea behind these areas is that there are still some things that people can do better than computers, and that crowds can do some things even better than individuals. I won’t talk extensively about these areas, but for a survey see the really great paper by Quinn and Bederson from this year’s CHI conference.
As an example, consider VizWiz, an application that my group developed. VizWiz is an iPhone application that lets blind people take a picture, speak a question they’d like to know about it, and receive answers from multiple people (aka, the crowd) quickly (generally in less than a minute). As we were creating this application, we made a number of design decisions. We knew we wanted to make it fast – a lot of the time, when blind people need to know something about their environment, they need to know quickly – think of reading a menu in a restaurant. To make it work quickly and cheaply, we made some tradeoffs – we solicited answers from several non-skilled workers (we first used primarily Mechanical Turk, and now we also employ volunteers). Some answers they provide may not be correct (or ideal), but the user is likely to be able to make sense of them. These workers are not professionals, and so we cannot guarantee that they will treat the photo confidentially, so each VizWiz user sees a disclaimer warning them of this to help align their expectations with the reality of what the tool does. To give further context, VizWiz is up on the App Store and has now been used to answer over 20,000 questions.
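The fan-out-and-stream pattern described here can be sketched as a toy simulation. This is not the real VizWiz backend; the worker identifiers, latencies, and answers below are made up, and arrival order is simulated rather than measured:

```python
# Toy sketch of the VizWiz answer flow: fan a question out to several
# (simulated) crowd workers and surface answers in the order they arrive,
# rather than waiting for all of them.
import heapq

def ask_crowd(question, workers):
    """workers: list of (worker_id, simulated_latency_s, answer).
    Returns the answers ordered by (simulated) arrival time."""
    arrivals = [(latency, wid, answer) for wid, latency, answer in workers]
    heapq.heapify(arrivals)
    ordered = []
    while arrivals:
        _latency, _wid, answer = heapq.heappop(arrivals)
        ordered.append(answer)  # in a real system this would stream to the user
    return ordered

answers = ask_crowd(
    "What denomination is this bill?",
    [("w1", 42.0, "twenty dollars"), ("w2", 18.5, "$20"), ("w3", 60.0, "20")],
)
```

The design point is that the user starts hearing answers as soon as the fastest worker responds; later answers refine or corroborate the first, which is how multiple non-expert answers can still be useful.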
As we were designing and implementing VizWiz, we did so aware of what had come before – and many tools seemed to do something similar to VizWiz. For instance, one of the goals of the ESP Game was to label web images for blind people. But it wasn’t quite the same – users were unable to take the initiative to directly submit an image for labeling, and the crowd provided a label, not an answer to a question. Social Accessibility gives blind users the initiative to, among other things, request a description for web images. In some ways this is similar to VizWiz, but latency is not guaranteed, and so you might have to wait quite a while to have your image described. Solona is much closer – with Solona, you can take a screenshot of your desktop, and an expert drawn from a small pool answers it. Solona advertised a latency of 30 minutes, but because the service relied on a small pool, oftentimes no answerer was available… and they have since shut down. IQ Engines is a start-up company that provides an image description API that uses both humans and computer vision. For their crowd they employ a small call center of workers whose job is to describe images extremely quickly.
As you can see, there is a lot of work – our claim is that it’s somewhat difficult to really talk about and compare these related technologies because we don’t have a good framework in which to do so. Especially when we broaden to other technologies: Scribe4Me was a research project that provides auditory information for deaf people with about 5 minutes of latency. Video relay services, as we’ve seen, engage a remote interpreter in real time, and real-time captionists transcribe audio. Real-time Reading Services was a concept out of the Smith-Kettlewell Eye Institute in which blind people could fax pictures for description. MAP Lifeline is a technology for people with cognitive impairments that allows caregivers to inject prompts in real time based on sensory information. And then there’s a whole host of more asynchronous tools that use people – community-driven resources like GoBraille, the ASL-STEM Forum, and Bookshare, in which disabled people themselves share information to help make the world more accessible for themselves and, at the same time, each other. And tools like Respeaking and the Tactile Graphics project, in which technology helps human workers create more accessible information.
Starting with a large number of examples of human-powered access technologies, we isolated the 13 design dimensions you see here that we believe can be useful in talking about technology in this space. The dimensions and values provided here are not necessarily meant to be comprehensive, but rather to serve as a starting point. I don’t have time to go through them all, but I’ll go through a few. For instance, Initiative refers to who instigates assistance. The end user often decides when to solicit help from human supporters; examples include services like remote real-time captioning and relay services, and crowdsourcing systems like Social Accessibility and VizWiz. Systems like Bookshare and GoBraille allow the human supporters to decide when and what information they will provide – for instance, which books will be scanned as part of Bookshare or which landmarks they will label in GoBraille. Groups of people will sometimes decide to solicit the help of human workers or to guide their efforts; for instance, workers are often recruited to contribute specific signs in the ASL-STEM Forum.
As another example, latency refers to how long it takes to get assistance. Some tools are designed to be interactive, and others asynchronous. While it may seem that low latency would always be best, I should point out here that there is no single best answer for each dimension, and that different tools make different trade-offs. For some tools latency might be a primary target, as it is with VizWiz. In others it may be viewed as less important – for instance, with Bookshare it matters more that a book is made available in an accessible form eventually; it doesn’t necessarily have to happen right away.
Confidentiality. Primarily we see this one as an example of setting appropriate user expectations – relay services manage this by using trusted pools who have agreed to confidentiality agreements, applications like VizWiz instead tell users what to expect, and other tools simply make no guarantees.
Most of the focus in human-powered access technology is rightly on the end user. For instance, technology may go to great lengths to help protect their identity or ensure an appropriate user experience. Often, the effects on others in the broader context in which the technology is used are ignored. For instance, in the case of VizWiz, bystanders may unwittingly find themselves in the lens of a blind user. Or workers may be asked to answer a question with consequences – for instance, VizWiz workers may be asked to decipher a medicine bottle.
Our framework allows us to more easily compare different human-powered access technologies. For instance, both Social Accessibility and Solona describe web images for blind people – their functionality is similar and their target disability is the same, but they differ in where the workers come from and the expected latency of getting back an answer.
VizWiz and Scribe4Me are similar in their expected latency and the fact that users initiate the service, but they differ in the targeted disability, the method for ensuring accuracy (VizWiz recruits multiple answers, Scribe4Me uses experts), and the source of workers.
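The kind of comparison made in the last two paragraphs can be sketched as data plus a diff. The dimension names and values below are a hypothetical encoding that paraphrases the slides, not the paper's full 13-dimension taxonomy:

```python
# Encode a few design-dimension values per system, then compare two systems
# mechanically, mirroring the Social Accessibility vs. Solona comparison.
SYSTEMS = {
    "Social Accessibility": {
        "functionality": "describe web images",
        "target_disability": "blind",
        "workers": "crowd volunteers",
        "latency": "undetermined",
    },
    "Solona": {
        "functionality": "describe web images",
        "target_disability": "blind",
        "workers": "small expert pool",
        "latency": "short delay",
    },
}

def compare(a, b):
    """Return (similarities, differences): dimension names where the two
    systems' values agree vs. disagree."""
    da, db = SYSTEMS[a], SYSTEMS[b]
    sims = sorted(d for d in da if da[d] == db.get(d))
    diffs = sorted(d for d in da if da[d] != db.get(d))
    return sims, diffs

sims, diffs = compare("Social Accessibility", "Solona")
```

Encoding each example technology this way is one concrete use of the taxonomy: evaluating and comparing systems becomes a table lookup rather than a prose argument.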
Plotting our example technologies along with the values for these dimensions also reveals possibilities for future work.
There is a big opportunity to make human-powered access technology faster. Many of the technologies in this space with low latency require workers to be pre-recruited – for instance, real-time remote captioning.
There is also an opportunity to consider the broader context in which a technology will be used – especially the workers and the broader community.
And, finally, most of our examples were drawn from sensory disabilities (with the exception of MAP Lifeline). We believe that there is an opportunity to expand human-backed access technology to people with other types of disabilities.
In conclusion, I have introduced the idea of human-powered access technology – technology that facilitates and, ideally, improves interactions between disabled people and human assistants. I’ve described 15 example technologies, and isolated 13 design dimensions from these examples that may help us to discuss technology in this space. My view is that human power has the potential to greatly improve access for a large number of people with disabilities today. Our hope in writing this paper, and my hope in presenting this talk, is that articulating these dimensions may improve research in this area by helping us to better evaluate and compare new technologies, and motivate research into technologies insufficiently covered by existing work.
If you’re doing work in this area, as I think many of you are, I encourage you to submit to the special issue of TACCESS that I’m guest editing on “Crowdsourcing Accessibility.” Despite the title, I’m interested in any technology that broadly meets the definition of human-powered access technology that I’ve outlined in this talk.
I thank you for your time, and am happy to answer any questions now or when I see you around the conference over the next few days.