My final seminar on the course "Fundamentals and Trends in Vision and Image Processing" (IMPA). In this presentation, I focused on deep learning applications for image processing and creative goals.
How to use transfer learning to bootstrap image classification and question a... (Wee Hyong Tok)
#theaiconf SFO 2018
Session by Danielle Dean and Wee Hyong Tok
Transfer learning enables you to use pretrained deep neural networks trained on various large datasets (ImageNet, CIFAR, WikiQA, SQUAD, and more) and adapt them for various deep learning tasks (e.g., image classification, question answering, and more).
Wee Hyong Tok and Danielle Dean share the basics of transfer learning and demonstrate how to use the technique to bootstrap the building of custom image classifiers and custom question-answering (QA) models. You’ll learn how to use the pretrained CNNs available in various model libraries to custom-build a convolutional neural network for your use case. In addition, you’ll discover how to use transfer learning for question-answering tasks, with models trained on large QA datasets (WikiQA, SQUAD, and more), and adapt them for new question-answering tasks.
https://conferences.oreilly.com/artificial-intelligence/ai-ca/public/schedule/detail/68527
Topics include:
An introduction to convolutional neural networks and question-answering problems
Using pretrained CNNs and the last fully connected layer as a featurizer (Once the features are extracted, any existing classifier can be used for image classification, using the extracted features as inputs.)
Fine-tuning the pretrained models and adapting them for the new images
Using pretrained QA models trained on large QA datasets (WikiQA, SQUAD) and applying transfer learning for QA tasks
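The featurizer idea in the topics above can be sketched in a few lines. In this illustrative sketch, a frozen random projection stands in for a pretrained CNN's last fully connected layer, and a dependency-free nearest-centroid classifier plays the role of "any existing classifier"; all names, shapes, and values are assumptions, not the session's actual code.

```python
import numpy as np

# Stand-in for a pretrained CNN: in practice you would load an ImageNet
# model and read off the activations of its last fully connected layer.
# Here a frozen random projection plays that role.
rng = np.random.default_rng(0)
W = rng.normal(size=(3072, 128))  # "pretrained" weights, kept frozen

def featurize(images):
    """Map flattened images to 128-d feature vectors (frozen backbone)."""
    return np.maximum(images @ W, 0.0)  # ReLU features

# Once features are extracted, any existing classifier can consume them.
# A nearest-centroid classifier keeps this sketch dependency-free.
def fit_centroids(features, labels):
    return {c: features[labels == c].mean(axis=0) for c in np.unique(labels)}

def predict(centroids, features):
    classes = list(centroids)
    dists = np.stack([np.linalg.norm(features - centroids[c], axis=1)
                      for c in classes])
    return np.array(classes)[dists.argmin(axis=0)]

# Toy data: two "image classes" drawn from shifted pixel distributions.
X = np.vstack([rng.normal(0.0, 1.0, (20, 3072)),
               rng.normal(1.0, 1.0, (20, 3072))])
y = np.array([0] * 20 + [1] * 20)

F = featurize(X)
model = fit_centroids(F, y)
print("training accuracy:", (predict(model, F) == y).mean())
```

Only the classifier head is trained here; the "backbone" never changes, which is what makes the approach cheap when labeled data is scarce.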
Chen Sagiv, co-founder and co-CEO of SagivTech, gave an introductory talk on computer vision at the She Codes branch at Google Campus TLV.
The talk gave an overview of what computer vision is, where it is used, some basic notions and algorithms, and the AI revolution.
Prototyping of a Robot Arm Controller: getting the hands dirty to learn new t... (EnriqueLlerenaDomngu)
Nowadays we can find many exciting technologies to work with, letting us build complex systems far faster than a few years ago. As soon as we learn of their existence, we ask ourselves: How can we learn to use them? How can we get a feel for making these tools work? How can we combine them to achieve something real? Of course, we can read documentation… but isn’t it more exciting to get our hands dirty?
In this presentation, I tell the story of my most recent hobby: prototyping a controller for a simple simulated robotic arm. With the help of a hand sensor, I train a machine learning model to identify the position of a person's fingers and use that to direct a movement of the robotic arm. The main focus of this exercise is learning the different technologies used for it: Gazebo, Eclipse Deeplearning4j, TensorFlow, Apache Kafka, and Java. At the end of the session, I show the prototype working.
We used a neural network trained on a multi-labelled image set to extract regions from the query image. A label probability is then computed for each region to create an intermediate feature vector. A weighted average of these vectors is computed to generate multiple labels. This representation is then hashed for fast retrieval.
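The pipeline in that summary can be sketched as follows. The region label probabilities are stubbed with toy values, and random-hyperplane hashing (a common LSH scheme) stands in for whatever hashing method was actually used; everything here is illustrative, not the original system.

```python
import numpy as np

rng = np.random.default_rng(1)
LABELS = ["car", "person", "tree", "dog"]

# Label-probability vector per extracted region (stub for the CNN output).
region_probs = np.array([
    [0.7, 0.2, 0.05, 0.05],   # region 1
    [0.1, 0.8, 0.05, 0.05],   # region 2
    [0.3, 0.1, 0.5,  0.1 ],   # region 3
])
# Region weights, e.g. proportional to region area or detector confidence.
weights = np.array([0.5, 0.3, 0.2])

# Weighted average of the per-region vectors -> one multi-label descriptor.
descriptor = weights @ region_probs / weights.sum()

# Labels whose averaged probability clears a threshold.
predicted = [l for l, p in zip(LABELS, descriptor) if p > 0.25]
print(predicted)

# Hash the descriptor with random hyperplanes (LSH) for fast retrieval:
# nearby descriptors tend to land in the same bucket.
hyperplanes = rng.normal(size=(8, len(LABELS)))
bits = (hyperplanes @ descriptor > 0).astype(int)
code = int("".join(map(str, bits)), 2)
print(f"{len(bits)}-bit hash code: {code}")
```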
This talk discusses Google Brain's Magenta, a project using TensorFlow to generate art and music.
The slides are adapted from Douglas Eck's talk at Google I/O 2017 (https://www.youtube.com/watch?v=2FAjQ6R_bf0)
CERTH @ MediaEval 2014 Social Event Detection Task (multimediaeval)
This paper describes the participation of CERTH in the Social Event Detection Task of MediaEval 2014. For Challenge 1, we use a "same event model" to construct a graph on which we perform community detection to obtain the final clustering. Importantly, we tune the model to have a higher true positive rate than true negative rate, leading to significantly improved performance. The F1 score and NMI for our best run are 0.9161 and 0.9818, respectively. For Challenge 2, we developed probabilistic language models to classify events according to the criteria of the different queries. Our best run on Challenge 2 achieved an average F-score of 0.4604.
http://ceur-ws.org/Vol-1263/mediaeval2014_submission_47.pdf
Deep learning is an area of machine learning and one of the most talked-about trends in business and computer science today.
In this talk, I will give a review of Deep Learning explaining what it is, what kinds of tasks it can do today, and what it probably could do in the future.
We test whether modern computer-vision algorithms can predict, from users' eye-movement patterns, whether they are reading relevant information. The slides accompany the video presentation at https://youtu.be/ZebBgUhL-EU
The full research paper is available at:
https://dl.acm.org/doi/10.1145/3343413.3377960
and also at
https://arxiv.org/abs/2001.05152
Presentation of a web-based service developed within REVEAL and InVID at the Experts’ Meeting on Digital Image Authentication and Classification, December 6, 2017.
Generative adversarial network and its applications to speech signal and natu... (宏毅 李 / Hung-yi Lee)
Generative adversarial network (GAN) is a new idea for training models, in which a generator and a discriminator compete against each other to improve the generation quality. Recently, GAN has shown amazing results in image generation, and a large number and wide variety of new ideas, techniques, and applications have been developed based on it. Although there are only a few successful cases so far, GAN has great potential to be applied to text and speech generation to overcome the limitations of conventional methods.
There are three parts in this tutorial. In the first part, we give an introduction to generative adversarial networks and provide a thorough review of the technology. In the second part, we focus on the applications of GAN to speech signal processing, including speech enhancement, voice conversion, speech synthesis, and the applications of domain adversarial training to speaker recognition and lip reading. In the third part, we describe the major challenge of sentence generation with GAN and review a series of approaches to dealing with it. We also present algorithms that use GAN to achieve text style transformation, machine translation, and abstractive summarization without paired data.
Searching Images: Recent research at Southampton (Jonathon Hare)
Information Retrieval group seminar series. The University of Glasgow. 21st February 2011.
Southampton has a long history of research in the areas of multimedia information analysis. This talk will focus on some of the recent work we have been involved with in the area of image search. The talk will start by looking at how image content can be represented in ways analogous to textual information and how techniques developed for indexing text can be adapted to images. In particular, the talk will introduce ImageTerrier, a research platform for image retrieval that is built around Glasgow's Terrier software. The talk will also cover some of our recent work on image classification and image search result diversification.
Large Scale Image Forensics using Tika and Tensorflow [ICMR MFSec 2017] (Thamme Gowda)
This paper describes the applications of deep learning-based image recognition in the DARPA Memex program and its repository of 1.4 million weapons-related images collected from the Deep web. We develop a fast, efficient, and easily deployable framework for integrating Google’s TensorFlow framework with Apache Tika for automatically performing image forensics on the Memex data. Our framework and its integration are evaluated qualitatively and quantitatively, and our work suggests that automated, large-scale, and reliable image classification and forensics can be widely used and deployed in bulk analysis for answering domain-specific questions.
These hand-crafted slides present a hands-on activity on Data Visualization, part of the Big Data and GIS specialization track in Thought for Food Academy Program hosted at Escola Eleva in July 2018. I co-hosted this track with Brittany Dahl, from ESRI Australia, and Vinicius Filier, from Imagem Soluções de Inteligência Geográfica. A special thanks to Leandro Amorim, Henrique Ilidio and Erlan Carvalho, from Café Design Studio, who helped to line up this activity. See my website for further information: http://juliagiannella.com/tff/
Perspectives for integrating Design into the Digital Humanities in the face of... (Júlia Rabetti Giannella)
Paper presented at the 1st International Congress on Digital Humanities (session: Social networks and visualizations), held April 9–13 at Fundação Getúlio Vargas (FGV), Rio de Janeiro.
ObservatóR!O2016: intersections between art and deep learning techniques (Júlia Rabetti Giannella)
Work presented at the 4th Meeting of Researchers of the Graduate Programs in Visual Arts of the State of Rio de Janeiro (Indisciplinas: a arte frente ao urgente).
Rio 2016 Contestation Images: a critical perception from content collected on... (Júlia Rabetti Giannella)
This work has been presented at the International Seminar "Places: Designing and Belonging", 2017, organized by ESDI (Escola Superior de Desenho Industrial / UERJ) and Centro Carioca de Design.
Campus Party Brasil 2017: OBSERVATÓR!O2016: perceptions of the Olympics through d... (Júlia Rabetti Giannella)
In this talk we discuss the conception and development of ObservatóR!O2016, a project aimed at collecting, structuring, and visualizing the heterogeneous debate around the Olympic Games through data collected from social networks.
The first part of the talk covers our creation process, from the infrastructure for collecting and storing tweets and images to the design decisions behind the site (http://oo.impa.br) and its data visualizations.
In the second part, we present an offshoot of our research involving innovative computer vision techniques: how we used deep learning to analyze and classify the images of Rio 2016 and then remix them into new audiovisual products.
This project was carried out by a multidisciplinary team that works at VISGRAF-IMPA and pursues research at the frontier between Design, Computing, Art, and Mathematics.
* Correction: the data visualization presented on page 41 is from Galileu magazine.
This talk arises from the interest in gathering and expanding the discussions and reflections raised in lectures on data visualization at the eighth edition of Campus Party Brazil, an event in which the authors of this text took part both in its organization and in the communication of its content. This paper thus aims, initially, to make a terminological and conceptual review of data visualization. It then highlights and deepens some emerging topics raised by Brazilian researchers in their lectures, such as visualization in physical interfaces, collaborative mapping, storytelling in journalistic infographics, environmental information systems, interdisciplinary teaching and practice of data visualization, and the understanding of economic data through visual schemes.
Design and cartographic interfaces: advances for research and professional practice (Júlia Rabetti Giannella)
In this talk for Campus Party Brasil, delivered on February 5, 2015, I discussed advances in the production of cartographic interfaces from the point of view of design and communication, and the emergence of collaborative mapping as a tool for civic engagement and social innovation.
Dispositivo infovis: interfaces between information visualization, journalistic infographics... (Júlia Rabetti Giannella)
The research seeks new contributions to studies of journalistic infographics by investigating the qualities, professional activities, and technologies involved in the production of an emerging infographic modality, the infovis, contextualized by the potential of the digital, online medium and attuned to contemporary communication practices, which in certain respects surpasses earlier infographic models. To reach our investigative goal we adopted a two-stage research methodology: 1) a conceptual review of three fundamental theoretical references (information visualization, journalistic infographics, and interactivity); and 2) a proposal for analyzing infovis based on the content-analysis research technique. The empirical corpus consists of 270 infovis pieces with political, mainly electoral, content from four news sites (The New York Times, The Guardian, Folha de S. Paulo, and O Estado de S. Paulo) published between January 2010 and December 2013. Through a systematic analysis procedure organized into analytical units and subunits guided by the three theoretical dimensions of the infovis dispositif (input, interface, and output), we obtained coded answers for every piece in the corpus. Reading the completed analysis forms produced results and inferences about our sample that point to pioneering production practices in infographics, such as greater communicational openness to the audience, the use of free online tools to produce infovis, and the use of technology for continuously updating the visualized data.
2. APPLICATIONS
• Colorization of Black and White Images
• Adding Sounds To Silent Movies
• Object Classification in Photographs
• Automatic Handwriting Generation
• Character Text Generation
• Image Caption Generation
• Automatic Game Playing
• Artistic style transfer
Source: http://machinelearningmastery.com/inspirational-applications-deep-learning/
3. 1) Colorization of Black and White Images
• problem of adding color to black and white photographs
• traditionally, this was done by hand with human effort
• CV task attacked by different approaches
• topic of relative importance in SIGGRAPH and EUROGRAPH
• DL approach involves the use of very large CNNs and supervised layers that recreate the image with the addition of color
4. Paper Colorful Image Colorization (ECCV, 2016)
Source: http://richzhang.github.io/colorization/
15. 2) Object Classification in Photographs
• task requires the classification of objects within a photograph as one of a set of previously known objects
• state-of-the-art results have been achieved on benchmark examples of this problem using very large CNNs
• derives from image classification task
• breakthrough: ImageNet Classification with Deep Convolutional Neural Networks (Krizhevsky et al., 2012)
• AlexNet won ILSVRC-2012 challenge
Source: http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf
16. Classification with localization
• more complex variation of this task involves specifically identifying one or more objects within the scene of the photograph and drawing a box around them
• GoogLeNet won ILSVRC-2014 challenge in this task
Source: https://research.googleblog.com/2014/09/building-deeper-understanding-of-images.html
17. 2.1) DL and RIO2016
• VISGRAF project (Oct 2016)
• task: automatically classify and cluster images by subject features related to the Olympic Games, Olympic Torch
• CNN model and supervised learning
• TensorFlow (open source software library)
• Inception-v3 (Going Deeper with Convolutions, 2015)
• transfer learning (manually labeled 100 examples)
Source: http://lvelho.impa.br/dl_rio2016/index.html
Source: https://arxiv.org/abs/1409.4842
21. 2.2) Twitter Facial Analysis Reveals Demographics of Presidential Campaign Followers
• (MIT Technology Review, March 2016)
• in: Conference on Web and Social Media
• understand follower demographics of Trump and Clinton by crossing Twitter metadata and facial features
• a CNN model on followers’ profile images extracts information on gender, race and age
Source: https://www.technologyreview.com/s/601074/twitter-facial-analysis-reveals-demographics-of-presidential-campaign-followers/
Source: https://arxiv.org/abs/1603.03097
22. A Comparison of the Trumpists and Clintonists
Source: https://arxiv.org/abs/1603.03097
Clintonists in the Twitter Sphere
23. 2.3) NVIDIA DRIVENet Demo - Visualizing a Self-Driving Car
Source: https://www.youtube.com/watch?v=HJ58dbd5g8g
24. 3) Artistic style transfer
• task: separate and recombine content and style of arbitrary images, providing a neural algorithm for the creation of artistic images
• A Neural Algorithm of Artistic Style (Gatys et al., 2015)
Source: https://arxiv.org/abs/1508.06576
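The core of the Gatys et al. algorithm is a content loss on CNN feature maps plus a style loss on their Gram matrices, minimized by gradient descent on the generated image. A minimal NumPy sketch of the two losses, with random arrays standing in for real CNN feature maps (shapes and weights are illustrative):

```python
import numpy as np

def gram_matrix(fmap):
    """Gram matrix of a (channels, height, width) feature map:
    channel-by-channel correlations, which encode style."""
    c, h, w = fmap.shape
    F = fmap.reshape(c, h * w)
    return F @ F.T / (h * w)

def content_loss(gen, content):
    """Squared error between feature maps preserves content."""
    return 0.5 * np.sum((gen - content) ** 2)

def style_loss(gen, style):
    """Squared error between Gram matrices preserves style."""
    G, A = gram_matrix(gen), gram_matrix(style)
    return np.sum((G - A) ** 2) / (4 * gen.shape[0] ** 2)

rng = np.random.default_rng(0)
content_f = rng.normal(size=(16, 8, 8))  # feature map of content image
style_f = rng.normal(size=(16, 8, 8))    # feature map of style image
gen_f = content_f.copy()                  # start generation from content

# Total loss that gradient descent on the generated image would minimize.
alpha, beta = 1.0, 1000.0
total = alpha * content_loss(gen_f, content_f) + beta * style_loss(gen_f, style_f)
print(total)  # content term is zero at initialization; only style remains
```

In the full algorithm these losses are computed at several layers of a pretrained VGG network and their gradients are backpropagated to the pixels of the generated image.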
27. 3.1) DeepDream
Source: http://deepdreamgenerator.com/
Source: https://en.wikipedia.org/wiki/DeepDream
• computer vision program created by Google
• given an input image, returns a version with a hallucinogenic appearance
• originates in a CNN codenamed Inception, after the film of the same name, developed for ILSVRC-2014
• CNN can also be run in reverse, to do synthesis
• enhances faces and certain animals -> pareidolia results
29. 3.2) Prisma App
Source: http://prisma-ai.com/
Source: https://en.wikipedia.org/wiki/Prisma_(app)
• photo-editing application that uses a neural network to transform a photo into an artistic rendering
• became popular in July 2016
• created by Alexey Moiseenkov
• reference: A Neural Algorithm of Artistic Style (2015)
32. 3.3) Artistic style transfer (video)
Source: https://arxiv.org/abs/1604.08610
Source: https://www.youtube.com/watch?v=Khuj4ASldmU
• Artistic style transfer for videos (Ruder et al.,2016)
33. 3.4) Supercharging Style Transfer for video
Source: https://arxiv.org/abs/1610.07629
Source: https://research.googleblog.com/2016/10/supercharging-style-transfer.html
• A Learned Representation For Artistic Style (Dumoulin et al., 2016)
• CNN that learns multiple styles at the same time
• method enables style interpolation
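In Dumoulin et al.'s model each style is reduced to its own scale-and-shift parameters of conditional instance normalization, so interpolating between two styles amounts to interpolating those parameters. A toy NumPy sketch of that idea (the shapes and parameter values are illustrative assumptions, not the paper's trained values):

```python
import numpy as np

def instance_norm(x, gamma, beta, eps=1e-5):
    """Normalize each channel of a (channels, H, W) tensor, then apply
    a per-style scale (gamma) and shift (beta) per channel."""
    mean = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    return gamma[:, None, None] * (x - mean) / np.sqrt(var + eps) + beta[:, None, None]

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8, 8))  # one feature map inside the network

# Each learned style is just a (gamma, beta) pair per channel.
style_a = (np.ones(4) * 2.0, np.zeros(4))
style_b = (np.ones(4) * 0.5, np.ones(4))

def blend(style1, style2, t):
    """Linear interpolation between two styles' parameters."""
    return tuple((1 - t) * p1 + t * p2 for p1, p2 in zip(style1, style2))

for t in (0.0, 0.5, 1.0):
    g, b = blend(style_a, style_b, t)
    y = instance_norm(x, g, b)
    print(f"t={t}: output mean {y.mean():+.3f}, std {y.std():.3f}")
```

Sweeping t from 0 to 1 moves the output statistics smoothly from one style's to the other's, which is what makes style interpolation possible with a single network.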