Computer Vision meets Fashion (12th STAIR Lab Artificial Intelligence Seminar)
2. Who am I
Kota Yamaguchi
Research scientist
CyberAgent, Inc.
vision.is.tohoku.ac.jp/~kyamagu
Computer vision and machine learning
2017 Assistant professor, Tohoku University
2014 PhD, CS, Stony Brook University
2008 MS, 2006 BE, University of Tokyo
twitter.com/kotymg
github.com/kyamagu
3. Research agenda
1. Learning visual perception on the Web
[ACCV16] [ECCV16]
2. Clothing and body recognition
[BMVC 15] [TPAMI14] [ICCV13] [CVPR12]
3. Understanding fashion and behavior
[WACV15] [ACMMM14] [ECCV14]
4. Language and vision
[PACLIC16] [IJCV15] [CVPR12] [NAACL12] [EACL12]
9. For social science
Social groups
[Murillo 12] [Kwak 13]
Occupation
[Song 11] [Shao 13]
[Figure: paper excerpts, including "Algorithm 2: 1-slack formulation for structure SVM" and "Figure 3. Illustrations of the collected occupation database. There are 14 occupations and over 7K images in total." Occupations shown: Soccer Player, Marathoner, Chef, Lawyer, Doctor, Firefighter, Policeman, Waiter, Soldier, Student, Clergy, Mailman, Construction Labor, Teacher.]
10. Fashion
Industry
“The global apparel market is valued at 3 trillion dollars,
3,000 billion, and accounts for 2 percent of the world's
Gross Domestic Product (GDP).” – FashionUnited.com
11. Amazon Echo Look:
Hands-Free Camera and Style Assistant
https://www.amazon.com/Echo-Hands-Free-Camera-Style-Assistant/dp/B0186JAEWK
20. Where to Buy It: Matching Street Clothing
Photos in Online Shops
Liu et al., Street-to-shop: Cross-scenario clothing
retrieval via parts alignment and auxiliary set.
CVPR 2012
Huang et al., Cross-Domain Image Retrieval With
a Dual Attribute-Aware Ranking Network, ICCV
2015
[Kiapour, ICCV 2015]
40. A Generative Model of People in Clothing
• Generating people from pose map and styling pipeline
[Lassner, ICCV 2017]
41. Recent topics: discussion
• Fashion tech is strongly application-oriented
• UX in e-commerce and social media
• Computer vision as a building block
• Deep learning almost solves recognition problems
• Contextual modeling is still under investigation
• Data issues: research towards unsupervised / weakly-annotated data
• Machine learning for creativity?
49. Refined Fashionista dataset
• 685 high-quality, manually annotated pictures
• Major improvement from v0.2
• CVPR 2012
• Used to learn FCN by fine-tuning from a pre-trained model [Long 2015]
51. Coarse-to-fine superpixels on the fly
• SLIC superpixels computed on the client-side
• Takes just a second in modern browsers
• Efficient annotation from large to small segments
52. Limitations
• CRF tends to trim small items (e.g., sunglasses, watch/bracelet)
• Dress vs. top+skirt distinction is still hard
[Figure: examples comparing input, ground truth, and prediction]
53. Pose estimation using FCN
• Human pose as heatmap of parts
• Predict heatmap by FCN
• Can pose help segmentation, or vice versa?
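To make the heatmap representation concrete, here is a minimal sketch of how per-part heatmaps could be turned into keypoint locations by taking the peak of each map. The part names, the toy 3x3 maps, and the function `heatmaps_to_keypoints` are illustrative assumptions, not taken from the slides; an actual FCN would output much larger maps.

```python
from typing import Dict, List, Tuple

Heatmap = List[List[float]]  # one 2-D score map per body part


def heatmaps_to_keypoints(
    heatmaps: Dict[str, Heatmap],
) -> Dict[str, Tuple[int, int, float]]:
    """Return (row, col, score) of the highest-scoring pixel for each part."""
    keypoints = {}
    for part, heatmap in heatmaps.items():
        best = (0, 0, heatmap[0][0])
        for r, row in enumerate(heatmap):
            for c, score in enumerate(row):
                if score > best[2]:
                    best = (r, c, score)
        keypoints[part] = best
    return keypoints


# Toy heatmaps (hypothetical values standing in for FCN outputs).
toy = {
    "head": [[0.1, 0.9, 0.1], [0.0, 0.2, 0.0], [0.0, 0.0, 0.0]],
    "left_foot": [[0.0, 0.0, 0.0], [0.0, 0.1, 0.0], [0.7, 0.1, 0.0]],
}
keypoints = heatmaps_to_keypoints(toy)
print(keypoints)
```

Taking the argmax is only the simplest decoding choice; it ignores relations between parts, which is where a joint model with segmentation might help.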
55. Discussion
• Insufficient data
• 685 pictures are not sufficient for a deep-learning approach
• Global information in segmentation
• Local appearance alone cannot resolve the confusion
• Need a global prediction of the clothing combination to avoid confusion between items
57. Visual attribute perception
• What does a _______ t-shirt look like?
• yellow
• large
• surfer
• comfy
• original
• popular
...
Image sources: onehourtees.com, www.justclick.ie, www.matsongraphics.com, polyvore.com
58. Another question
• How many words can we use to describe visual attributes of a t-shirt?
• My t-shirt looks __________.
59. Automatic attribute discovery
• Finding vocabulary of attributes
• Open-world recognition challenge
• Using pre-trained deep neural networks to identify visual words in the Web data
60. Our approach
1. Get Web data (image + text)
2. Analyze DNN's internal activity (pre-trained deep CNN)
Output attributes: white, red, striped, wooden, silky, ...
Example text from a product listing:
"beautiful soft blush handmade leather ballet flats. ***please, note, our new blush ballet flats are without the beige trim line (around the edges), still just as beautiful and perhaps even more*** SIZING ✍ how to take measurements ✍ there are a number of ways to measure your feet, however we find the quickest and most reliable practice is by tracing your feet. Here is how to do it: stand on a piece of paper that's bigger than your feet, circle your feet around with a straight standing pencil (without pressing the pencil too hard to the edges of your feet). Once you have the tracing, measure distance between longest and widest points. Compare the measurements to the list below."
61. Web data: unlimited vocabulary with images
• Etsy dataset (e-commerce): textual description, e.g. "Feel So Good ... Purple Halter Maxi Cotton dress 2 Sizes Available"
• Wear dataset (fashion blog): tags, e.g. "used, american casual, summer, shorts, t-shirt, surfer, printed, duffer"
63. Identifying difference at neurons
[Figure: positive and negative image sets are fed to a deep neural network (conv1, conv2, conv3, conv4, conv5, fc6, fc7); the activation histograms of each unit (unit #1, unit #2, ...) are compared by KL divergence to identify neurons.]
64. Why neural activation?
• Discriminability
• If the attribute is visual, the positive set should activate a different set of neurons
• Semantic depth
• Depth of the activating layer should encode semantic information
[Figure: deep neural network (conv1, conv2, conv3, conv4, conv5, fc6, fc7) with activation histograms]
65. Kullback-Leibler divergence
• Measure of difference between P+ and P-
• Used to identify highly-activating units
\[ D_{KL}(P^+ \,\|\, P^-) \equiv \sum_x P^+(x) \log \frac{P^+(x)}{P^-(x)} \]
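As a rough illustration of the formula, the sketch below bins the activations of one unit over the positive and negative image sets into histograms and computes the KL divergence between them. The bin count, the value range, the add-one smoothing (used so that P-(x) is never zero), and the toy activation values are all assumptions made for this example; the slides do not specify them.

```python
import math
from typing import List, Sequence


def histogram(values: Sequence[float], bins: int = 4,
              lo: float = 0.0, hi: float = 1.0) -> List[float]:
    """Normalized histogram with add-one smoothing (an assumption)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        idx = max(0, min(idx, bins - 1))  # clamp out-of-range values
        counts[idx] += 1
    total = len(values) + bins
    return [(c + 1) / total for c in counts]


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """D_KL(P || Q) = sum_x P(x) * log(P(x) / Q(x))."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical activations of two units on positive / negative images.
unit_a_pos, unit_a_neg = [0.8, 0.9, 0.7, 0.85], [0.1, 0.2, 0.15, 0.05]
unit_b_pos, unit_b_neg = [0.4, 0.5, 0.6, 0.45], [0.5, 0.4, 0.55, 0.6]

kl_a = kl_divergence(histogram(unit_a_pos), histogram(unit_a_neg))
kl_b = kl_divergence(histogram(unit_b_pos), histogram(unit_b_neg))
print(f"unit A: {kl_a:.3f}, unit B: {kl_b:.3f}")
```

In this toy data, unit A responds very differently to the two sets and therefore gets the larger divergence, which is the signal used to pick out units tied to a word.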
68. Is the attribute visual?
• Which attribute is visually perceptible?
• Measure the classification performance, and compare against human judgment
• Example words: yellow, large, surfer, comfy, original, popular
69. Visualness
• Visualness of word u given a classifier f and datasets D+, D-:
\[ V(u \mid f) \equiv \mathrm{accuracy}(f, D_u^+, D_u^-) \]
[Figure: positive set D+ and negative set D-]
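The definition above can be sketched as the accuracy of a word's classifier on its positive and negative sets. In this minimal example the classifier is a hypothetical threshold on a single made-up feature, and the two sets are truncated to equal size so that chance level is 0.5; both choices are assumptions for illustration, not details given in the slides.

```python
from typing import Callable, Sequence


def visualness(classifier: Callable[[float], bool],
               positives: Sequence[float],
               negatives: Sequence[float]) -> float:
    """Accuracy of `classifier` on equally sized positive/negative sets."""
    n = min(len(positives), len(negatives))  # balance the sets (assumption)
    correct = sum(1 for x in positives[:n] if classifier(x))
    correct += sum(1 for x in negatives[:n] if not classifier(x))
    return correct / (2 * n)


def is_positive(x: float) -> bool:
    """Hypothetical classifier: threshold on a single toy feature."""
    return x > 0.5


# Toy features for images tagged / not tagged with each word.
v_yellow = visualness(is_positive, [0.9, 0.8, 0.7, 0.6], [0.1, 0.2, 0.3, 0.65])
v_comfy = visualness(is_positive, [0.6, 0.4, 0.3, 0.7], [0.2, 0.8, 0.6, 0.4])
print(v_yellow, v_comfy)
```

A word whose classifier stays near 0.5 on balanced data, like "comfy" in this toy setup, would be treated as not visually perceptible.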
71. Discovery in noisy data
[Figure: images annotated "floral" vs. NOT annotated "floral", against images predicted MOST floral vs. predicted LEAST floral. The mismatched cells show false positives and false negatives.]
72. Most/least visual attributes
Method | Most visual | Least visual
Human | flip pink red floral blue sleeve purple little black yellow | url due last right additional sure free old possible cold
Pre-trained + Resample | flip pink red yellow green purple floral blue sexy elegant | big great due much own favorite new free different good
Attribute-tuned | flip sexy green floral yellow pink red purple lace loose | right same own light happy best small different favorite free
Language prior | top sleeve front matching waist bottom lace dry own right | organic lightweight classic gentle adjustable floral adorable url elastic super
74. Most salient words (Etsy)
norm1 norm2 conv3 conv4 pool5 fc6 fc7
orange green bright flattering lovely many sleeve
colorful red pink lovely elegant soft sole
vibrant yellow red vintage natural new acrylic
bright purple purple romantic beautiful upper cold
blue colorful green deep delicate sole flip
welcome blue lace waist recycled genuine newborn
exact vibrant yellow front chic friendly large
yellow ruffle sweet gentle formal sexy floral
red orange French formal decorative stretchy waist
specific only black delicate romantic great American
75. Most salient words (Wear)
norm1 norm2 conv3 conv4 pool5 fc6 fc7
blue denim-jacket border-striped-tops kids shorts white-skirt long-skirt
green pink stripes bucket-hat half-length flared-skirt suit-style
red-black red dark-style hat-n-glasses pants spring midi-skirt
red red-socks stripes black denim upper gaucho-pants
denim-on-denim red-black backpack sleeveless dotted beret handmade
denim-skirt champion red American-casual border-stripes shirt-dress straw-hat
pink blue dark-n-dark long-cardigan white-pants overalls white-n-white
denim white denim-shirt white-n-white border-striped-tops hair-band white
yellow shirt navy stole gingham-check loincloth-style white-coordinate
leopard i-am-clumsy outdoor-style mom-style sandals matched-pair white-pants
79. Attribute discovery
• Web data + deep network
• Highly-activating neurons identify visual stimuli associated with the given word
• Neural activations can further identify salient regions
87. TrueSkill game algorithm
• Algorithm to select which pair to play
• Idea:
• Represent each image by Gaussian over rating
• Update Gaussian parameters after each click
• Choose images that are expected to tie for the next play
[R Herbrich, 2007]
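The idea in the bullets above can be sketched with a simplified two-player TrueSkill-style update. This is a sketch under stated assumptions rather than the full algorithm: it omits the draw margin and the dynamics term, and the default values (mu = 25, sigma = 25/3, beta = 25/6) as well as the names `Rating`, `update`, `tie_likelihood`, and `pick_pair` are chosen for this example.

```python
import math
from dataclasses import dataclass
from typing import Dict, Tuple

BETA = 25.0 / 6.0  # performance noise (assumed default)


@dataclass
class Rating:
    mu: float = 25.0          # mean of the Gaussian over the image's rating
    sigma: float = 25.0 / 3.0  # standard deviation (uncertainty)


def _pdf(t: float) -> float:
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def _cdf(t: float) -> float:
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def update(winner: Rating, loser: Rating) -> Tuple[Rating, Rating]:
    """Update both Gaussians after a click that prefers `winner`."""
    c2 = 2.0 * BETA ** 2 + winner.sigma ** 2 + loser.sigma ** 2
    c = math.sqrt(c2)
    t = (winner.mu - loser.mu) / c
    v = _pdf(t) / _cdf(t)   # size of the mean shift
    w = v * (v + t)         # size of the variance shrinkage, in (0, 1)
    new_winner = Rating(
        winner.mu + winner.sigma ** 2 / c * v,
        winner.sigma * math.sqrt(1.0 - winner.sigma ** 2 / c2 * w),
    )
    new_loser = Rating(
        loser.mu - loser.sigma ** 2 / c * v,
        loser.sigma * math.sqrt(1.0 - loser.sigma ** 2 / c2 * w),
    )
    return new_winner, new_loser


def tie_likelihood(a: Rating, b: Rating) -> float:
    """Match quality: close to 1 when the two images are expected to tie."""
    c2 = 2.0 * BETA ** 2 + a.sigma ** 2 + b.sigma ** 2
    return math.sqrt(2.0 * BETA ** 2 / c2) * math.exp(-((a.mu - b.mu) ** 2) / (2.0 * c2))


def pick_pair(ratings: Dict[str, Rating]) -> Tuple[str, str]:
    """Choose the pair of images with the highest tie likelihood."""
    names = sorted(ratings)
    pairs = [(x, y) for i, x in enumerate(names) for y in names[i + 1:]]
    return max(pairs, key=lambda p: tie_likelihood(ratings[p[0]], ratings[p[1]]))


ratings = {"img_a": Rating(30.0, 2.0), "img_b": Rating(29.0, 2.0), "img_c": Rating(10.0, 2.0)}
pair = pick_pair(ratings)
new_winner, new_loser = update(Rating(), Rating())
print(pair, new_winner, new_loser)
```

Pairing images that are expected to tie makes each click as informative as possible, since the outcome of a lopsided comparison is already predictable.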
96. What do we mean by similar?
The real challenge is the definition of similarity.
97. Collecting human judgments to learn similarity
Task shown to annotators: select an image with the most similar outfit to the query image.
The query image is given in the left column, while five candidate images are shown in the right columns.
1. Select an image with the most similar outfit to the query.
2. If there is NO similar image, please select NONE.
[Figure: query image, candidate images, and a NONE option]
101. Visually analyzing floral trend
[Figure: runway image of a floral outfit; retrieved street images with timestamps; plot of % retrieved images over time]
• Peaks in spring!
102. Runway to realway analysis
• What is considered similar in fashion?
• Our approach: Learn human judgment
• Tracking similarity over time = trend analysis of a specific style
105. Like button in Chictopia
Long tail
Promotion effect?
~300K posts
106. Why do some pictures get popular?
Content factors
• Good fashion items
• Photo quality
Social factors
• Active posting
• Lots of friends
How much do they matter?
107. Regression analysis in 300K posts
Input: content factors
• Tag TF-IDF
• Image composition
• Color entropy
• Style descriptor
• Parse descriptor
Input: social factors
• User identity
• Previous posts
• Node degrees
Output: popularity
• Votes
108. Findings
• Your outfit doesn’t matter (!!!)
• Popularity is mostly the outcome of the network – social bias
• #votes ∝ #followers
• People just click on friends’ photos
• c.f., Rich-get-richer phenomenon
109. Regression performance
Factors | R2 | Spearman | Accuracy (top 25%) | Accuracy (top 75%)
Social | 0.491 | 0.682 | 0.847 | 0.779
Content | 0.248 | 0.488 | 0.778 | 0.737
Social + Content | 0.493 | 0.685 | 0.845 | 0.775
Social factors significantly boost the performance
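For readers unfamiliar with the first two metrics in the table, here is a minimal sketch of how R2 (coefficient of determination) and Spearman rank correlation can be computed from true and predicted vote counts. The toy numbers are made up for illustration and have no relation to the Chictopia data; how the slides compute the two accuracy columns is not stated, so they are not covered here.

```python
import math
from typing import List, Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Toy example (hypothetical vote counts and model predictions).
votes = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 1.0, 3.5, 4.5, 4.0]
r2 = r_squared(votes, predicted)
rho = spearman(votes, predicted)
print(f"R2 = {r2:.3f}, Spearman = {rho:.3f}")
```

R2 rewards getting the vote counts themselves right, while Spearman only checks that posts are ordered correctly, which is why the two columns can differ noticeably.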
110. What if there is no social network?
• Popularity = f ( content factors )?
111. Ask crowds!
• Collecting popularity votes in Amazon MTurk
• No network!
• 3000 pictures, 25 assignments
113. Task
• Predict crowd popularity using Content factors and/or Social factors in Chictopia
[Diagram: Social factors and Content factors (Chictopia) → ? → Voting data (MTurk)]
114. Predicting crowd votes
Factors | R2 | Spearman | Accuracy (top 25%) | Accuracy (top 75%)
Social | 0.423 | 0.634 | 0.845 | 0.787
Content | 0.428 | 0.647 | 0.888 | 0.862
Social + Content | 0.473 | 0.686 | 0.884 | 0.858
• Content factors matter
• Social factors from Chictopia predict crowd votes well
• User-content correlation: Top-bloggers consistently post good pictures
116. The data told us...
• Popularity is mostly the outcome of the social network
• People click on friends’ photos
• Content affects popularity, but we conjecture the existence of user-content correlation
117. Computer Vision meets Fashion
• Computer vision = machine perception to quantify visual content
• Tool to analyze semantics of fashion
• Research topics
• Recognition, street2shop, style understanding, social influence, fashion trend, creativity