Modeling Beauty


1. The document discusses creating a machine learning model to predict beauty by analyzing photos and their aesthetic ratings. It outlines collecting a dataset of labeled images, developing a convolutional neural network model, and potential applications of predicting beauty such as improving online reviews and product personalization.
2. The presenter acknowledges limitations of the initial model due to the small dataset and discusses challenges such as overfitting. Potential uses of CNN models beyond beauty prediction are also mentioned, such as image recognition, text translation and game playing.
3. Key takeaways are that high quality training data is needed to build accurate models and that machine learning requires creativity in its applications. The presentation encourages exploring model development and potential uses.


Modeling Beauty

1. Modeling Beauty. Rachael Ferguson. https://sg.com.mx/dataday #DataDayMx

2. HELLO! My name is Rachael Ferguson and I want you to be excited about creating models to predict weird things

3. GOALS: 1. Machine Learning (a. Basics, b. Really Dig in to CNNs) 2. Beauty Classification Model 3. Potential Applications

4. Machine Learning Overview: just the basics (1a)

5. GENERAL STEPS: get data, choose model, clean data, fit model, re-tune model, test model, predict

6. GENERAL STEPS: get data, choose model, clean data, fit model, re-tune model, test model, predict

7. GARBAGE IN -> GARBAGE OUT

8. Machine Learning Overview: convolutional neural network (1b)

9. NESTS OF MACHINE LEARNING: Deep Learning, Neural Networks, Convolutional Neural Networks, Machine Learning

10. FILTERING

11. FILTERING

12. FILTERING

13. FILTERING

14. FILTERING

15. FILTERING

16. CURRENT CNN USES: Facial recognition for security monitoring

17. CURRENT CNN USES: Translating handwritten text into strings and ints

18. CURRENT CNN USES: Playing checkers

19. Predicting Beauty: what makes you say “ooo”? (2)

20. DEFINED BEAUTY: What is beauty? Property of being visually pleasing.

21. COMPUTER DEFINED BEAUTY: Beautiful, Not

22. COMPUTER DEFINED BEAUTY: No.

23. “Beauty is in the eye of the beholder.”

24. COMPUTER DEFINED BEAUTY: Beautiful, Not

25. INDIVIDUAL VERSUS CONSENSUS: Personal Data, Model; Consensus Data, Model

26. HOW IT’S PREDICTED (diagram labels: Beautiful, Ugly, Model, Model)

27. OUR DATASET. AADB Dataset: - 10,000 photographic Flickr images with Creative Commons license - Each image has been labelled by 5 participants by aesthetic merit - These participants were Amazon employees paid for their time.

28. EXTROVERSION LEVEL IMPLICATIONS: Extrovert “5!”, Introvert “4”

29. OUR DATASET

30. GOOD TRAINING DATA IS RARE AND EXPENSIVE

31. OUR DATASET

32. OUR DATASET (diagram labels: Data, Label, Data, Label; example ratings 5, 1, 3, 4)

33. OUR DATASET (diagram labels: Data, Data)

34. ASSUMPTIONS: 1. The model will attempt to predict the perceived level of beauty of an image as the participants themselves would rate it. 2. Due to the low number of images, the model will likely overfit the images that are used for training and isn’t expected to abstract well. 3. We will be grouping together images similarly rated with each other under the assumption that they represent about the same level of aesthetics. 4. Images rated a 3 likely do not have a strong influence on the participant and thus are not considered to have important features.

35. HOW TO IMPLEMENT: CNN Knowledge Required, Method, Parameter Control, Ability to Brag at Parties, Chance of Overwriting Hard Drive. HIGH MEDIUM LOW; LOW MEDIUM HIGH; MEDIUM HIGH VERY HIGH; VERY LOW LOW HIGH

36. WHY SHOW A BAD MODEL? Why 👏 are 👏 you 👏 wasting 👏 our 👏 time 👏 ?

37. GARBAGE IN -> GARBAGE OUT

38. Digging in to the Potential: why should we bother? (2)

39. APPLICATIONS: 2015 Yelp

40. APPLICATIONS: 2016 Yelp

41. POTENTIAL BEAUTY APPLICATIONS: Insane

42. POTENTIAL BEAUTY APPLICATIONS

43. POTENTIAL BEAUTY APPLICATIONS

44. POTENTIAL GENERAL CNN USES: Crazy

45. POTENTIAL GENERAL CNN USES

46. POTENTIAL GENERAL CNN USES

47. POTENTIAL GENERAL MODEL USES: Absolutely Bonkers

48. POTENTIAL GENERAL MODEL USES

49. POTENTIAL GENERAL MODEL USES

50. POTENTIAL GENERAL MODEL USES

51. Key Takeaways: jeez that was a lot at once (3)

52. GARBAGE IN -> GARBAGE OUT

53. ANYONE CAN MAKE A MODEL

54. MACHINE LEARNING NEEDS CREATIVITY

55. THANKS! You can find me at github.com/fergusonrae, linkedin.com/in/fergusonrae, fergusonrae@outlook.com

56. REFERENCED ● AADB Dataset ○ Information ○ Data ● Extroversion Study ● Awesome Rating Model Video ● Yelp and Machine Learning ● Importance of Aesthetics in: ○ Car Company Loyalty ○ Phone Sales ○ Socially Sharing Products Bought ○ Increase in Sales After Personalization

57. VISUAL CITATIONS ● Presentation template by SlidesCarnival ● Machine Learning Images ○ CNN Filtering - all ○ MNIST dataset ○ Checkers ● Predicting Beauty ○ Hummingbird ○ Boats ○ Crowd of Stick Figures ○ Dabbing Figure ○ Happy Man ○ Happy Woman ○ Bubble Freezing ○ Black and White House ○ Wedding ○ Man and Son

58. VISUAL CITATIONS ● Predicting Beauty ○ Surfing Business Man ● Potential ○ Road with Hills ○ City Center ○ Selfie 1 ○ Selfie 2 ○ Selfie 3 ○ Bag 1 ○ Bag 2 ○ Xray ○ Recipes ○ Portrait of Edmond Belamy ○ Parameter Tuning

59. RESOURCES ● TensorFlow Transfer Learning ● Andrew Ng’s Coursera Video Series ● Chris Olah’s Blog ● Freely Available Datasets for Exploration

Editor's Notes

The goal of this talk was to apply ML to a creative topic, dive in to how to do it, put the tools in your hands to explore more and push the boundaries of what machine learning is and what it applies to, and open up some really creative, useful things that you can do.
Brief overview of the different pieces we’ll explore. Starts with what steps are required when setting up a machine learning project. This part has probably been repeated a few times so far, so I’ll be fast. Then, going to dig in to what makes a convolutional neural network different from other models. Next, we’ll begin setting up our beauty classification problem. This is the part I have made some tweaks to since writing the talk abstract due to time constraints and an increased focus on the important aspects of modeling in the future. So, this will be a bit different. Finally, we’ll walk through some interesting applications that can be done using modeling and beauty algorithms in general.
Just the basics. We’ll add more info and background as we move along. Also, I noticed there were quite a few ML-geared talks today, so I’m sure y’all are filled up.
This flow is probably recognizable to most of you. Walk through the steps.
Most important step is understanding your data going in. We are going to discuss these again and again and again. Because these are currently the most critical parts. For the most part, every other step has a defined list of things you do. Follow the steps, and your model will turn out the best it can. Which means it can be automated. However, data selection is considered an art form and requires special human care. And it is the easiest thing to mess up.
Most important part of machine learning and the central theme of this talk: MAKE SURE YOU HAVE CLEAN DATA. If your model is going wrong, your first thought should always be about your data. Garbage in, garbage out. Your model means nothing if you’re feeding it biased, inaccurate, or too little good data. It sounds like a simple thing, but clean data is actually really hard to come by. You can pretty much guarantee that any dataset that is touched or filled in by a human will need to be examined closely, because there will be errors. We’re not machines. I know this talk is about modeling, but always remember that the most important part of this is cleaning your data: removing anomalies if needed, normalizing, the whole shebang. That is where most of your time will be spent.
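To make the cleaning steps above concrete, here is a minimal sketch in plain Python. It is not the code used for this talk: the function name `clean_and_normalize`, the use of `None` for a missing entry, and the z-score cutoff of 2.0 are all assumptions picked for illustration.

```python
from statistics import mean, stdev


def clean_and_normalize(values, z_cutoff=2.0):
    """Drop missing entries and outliers, then rescale what is left to 0-1.

    z_cutoff is an arbitrary illustrative threshold, not a recommended value.
    """
    # 1. Remove missing entries (None stands in for a blank cell).
    present = [v for v in values if v is not None]

    # 2. Remove anomalies: keep values within z_cutoff sample standard
    #    deviations of the mean.
    mu, sigma = mean(present), stdev(present)
    kept = [v for v in present if sigma == 0 or abs(v - mu) / sigma <= z_cutoff]

    # 3. Normalize the remaining values with min-max scaling.
    lo, hi = min(kept), max(kept)
    if hi == lo:
        return [0.0 for _ in kept]
    return [(v - lo) / (hi - lo) for v in kept]


# Hypothetical raw column: one blank entry and one obvious typo (400.0).
raw = [2.0, 4.0, None, 3.0, 5.0, 400.0, 1.0, 3.0]
print(clean_and_normalize(raw))  # [0.25, 0.75, 0.5, 1.0, 0.0, 0.5]
```

A fixed z-score rule is only one way to flag anomalies. On a real dataset you would still look at what got dropped and ask why it was there in the first place.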
For image recognition, we’re going to focus on convolutional neural networks. These are currently the best option when implementing a model that takes images as inputs.
So CNNs are a type of neural network, like their name suggests. What makes CNNs distinct as a particular kind of NN is that they explicitly take images as inputs and, with the introduction of filters, are better optimized for handling images than plain neural networks are.
So what makes a CNN optimized for handling images? It has preprocessing in the form of filters! In this case, preprocessing the input image data means taking the images and turning them into data that the computer can recognize. So taking an image and running a filter. So let’s start with a black and white image so we don’t have to deal with three dimensions. Guess who that is?
Example of a filter. This is how we are going to pull out and create data from the image. The initial filters for an image are usually very simple. Lines, slight curve, etc.
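As a rough sketch of what running a simple filter over a black and white image means, the plain-Python example below slides a 3x3 vertical-edge filter over a tiny grid of pixel values. The image, the filter values, and the function name `apply_filter` are assumptions made up for illustration. In a real CNN the filter values are learned during training rather than written by hand.

```python
def apply_filter(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    Returns the feature map: one number per position, equal to the sum of
    the element-wise products of the kernel and the patch underneath it.
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    feature_map = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k_rows)
                for j in range(k_cols)
            )
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 4x6 black and white image: dark (0) on the left, light (1) on the right.
image = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]

# A hand-written filter that responds to a dark-to-light vertical edge.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

print(apply_filter(image, vertical_edge))  # [[0, 3, 3, 0], [0, 3, 3, 0]]
```

The output is zero where the image is flat and large where the filter sits on the edge, which is the sense in which a filter pulls data out of an image.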
Now we’re using a mouse because it had more readily available images.
What it gets to eventually. Can classify based on whether certain features are contained within an image.
This is a common one that you would likely have come across if you did any research into training your own CNN. There are countless tutorials because it’s such an interesting dataset.
Wonderful image, right? CNNs have been trained to learn what a checkerboard looks like and what it needs to look like for a winning strategy. Your first thought might be: cool, but doesn’t this seem more complicated than just creating a virtual checkerboard with code and programming in the logic? Valid point. Yes and no. Yes, in that it might be more efficient and require less computing power. But no, in that we don’t need to represent the board in a way the computer can understand. It learns how to understand it itself. There are no extra processing steps. Also, a human can play a physical checkers game with a machine. You don’t have to play it online, where the computer is comfortable. No input is needed except the input a human also needs. Which brings me to my next point: who cares if it’s easier because you don’t have to change inputs, if you still have to know all this stuff about the model, code it into existence, and train, tune, and test it?
In this section, we’re going to start introducing what are considered abstract concepts for humans. We’ll start the process of creating a model that takes images as data and outputs a classification of beautiful or ugly, and work through how we can do that.
We’re going to start off in the most exciting way possible, with a definition! It’s important to know exactly what we want to predict so we know if we hit it or not. Here, we’re going to extend that definition to also include aesthetically pleasing. So this is how a human can define beauty. But if you send this definition into the computer, it’s not going to know what in the world you are talking about. What does that even mean? Heck, even us humans aren’t quite sure exactly what beauty represents. There isn’t a quantifiable thing that you can reference. How could an individual define beauty without abstract words and lots of hand waving? Well, the same way you describe concepts to a child: you point out examples. This is a dog, this is not. This might be a dog? And computers like examples.
So, the goal is to find a dataset that has images classified into beautiful and not beautiful. So we want labeled data. Cool, straightforward right?
This is where we start to slip down that abstract slide. Because while there tends to be a general consensus as to beauty, people are not the exact same when it comes to it.
Those photos aren’t the total definition of beauty. They are my definition of beauty. For instance, does anyone actually think that this frog is beautiful? And this specification of “my” is how we can attempt to get around the complexity of everyone having a different notion of beauty.
Now, a large amount of data is needed to try to encompass all aspects of an abstract concept like beauty, and I don’t want to personally label a million images. So for this model, we’re going to instead find a good dataset that can represent consensus data.
NIMA model does this
Now, even though we’re just beginning, there are some red flags here. And it’s important to be critical at every step, most importantly when selecting data. First, 10,000 images were used. At a glance, that seems like a lot, but remember that we are trying to quantify beauty. Could 10,000 labeled stills represent the entirety of human beauty? Questionable. The worst offender here is that it is based on 5 people’s opinions. That is too small a sample to be statistically significant for really any group of people outside of these 5. So off the bat, we need to change the objective of this model. Also, we don’t really know anything about these 5 people. We don’t know their background. We don’t know their state of mind when answering the questions. We don’t know their personalities.
Extroverted people rate things higher and have been shown to mentally react to stimuli more intensely. So, a 4 to an extrovert is likely not the same thing as a 4 to an introvert. With this in mind, let's take a closer look at how these individuals rated their images.
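One common way to compare scores from raters who use the scale differently is to standardize each rater’s scores against their own average. This was not done for the model in this talk. It is only a sketch of the idea, and the rater names, image names, scores, and the function `standardize_by_rater` are all made up for illustration.

```python
from statistics import mean, pstdev


def standardize_by_rater(ratings):
    """Convert raw scores to z-scores relative to each rater's own habits."""
    standardized = {}
    for rater, scores in ratings.items():
        values = list(scores.values())
        mu = mean(values)
        sigma = pstdev(values)
        standardized[rater] = {
            image: 0.0 if sigma == 0 else (score - mu) / sigma
            for image, score in scores.items()
        }
    return standardized


# Hypothetical raters: the "extrovert" scores everything one point higher.
ratings = {
    "extrovert": {"boats": 5, "frog": 4, "house": 3},
    "introvert": {"boats": 4, "frog": 3, "house": 2},
}

z = standardize_by_rater(ratings)
# The extrovert's 4 and the introvert's 3 both land on the rater's own average.
print(z["extrovert"]["frog"], z["introvert"]["frog"])  # 0.0 0.0
```

This only works when each rater has scored enough images to estimate their personal average, which is one more reason the rater pool matters.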
Example of the screen the participants saw. It’s hard to see, but at the top they are rating the level of aesthetic, and along the sides they are indicating whether a photo contains any of the listed “negative” and “positive” effects. So, more red flags. Second, there is bias built in. Attributes are labeled negative and positive. This assumes that all people view the rule of thirds as aesthetically pleasing and the lack of an object emphasis as not aesthetically pleasing. Also, how are the 5 participants checking for these attributes? What does it mean for an image to have “color harmony” versus not? We’re adding subjective features to our subjective label. So, we have all these issues right now. Okay, first instinct: we probably shouldn’t use this dataset due to the amount of bias and subjectivity baked in. But I did use it. Which brings me to:
I went with this dataset because it’s free to use/legal to use, has documentation about where the data came from, and I had difficulty finding any other good datasets. So, with that in mind, I continued with the data and tried to make it better where I could.
For example, I did not pull any of these negative/positive attribute features into my model, to eliminate those biases. Rather, I used the straight aesthetic score as a measure of the level of “beauty”, which from now on should actually be called “aesthetic”, because that is what it is called in the dataset and we can’t say that there is a one-to-one correlation there. Also, this model is assumed to only predict what level of aesthetic these 5 individuals would rate the image. And really, if you had a couple of months to focus on this, there are other things you could do: instead of averaging the responses to an image, take into account whether the image is polarizing, etc. But at the end of the day, it’s still just 5 people. Not really a consensus model. So now we have the data clean and we’ve covered our assumptions.
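As a small illustration of why averaging alone can hide a polarizing image, the sketch below reports the spread of the five ratings alongside their mean. The function name `summarize_ratings`, the example scores, and the spread threshold of 1.5 are assumptions for illustration and are not taken from the dataset or the talk’s code.

```python
from statistics import mean, pstdev


def summarize_ratings(scores, polarizing_spread=1.5):
    """Return the mean rating plus a flag for strong disagreement.

    polarizing_spread is an arbitrary illustrative threshold on the
    population standard deviation of the ratings.
    """
    spread = pstdev(scores)
    return {
        "mean": mean(scores),
        "spread": spread,
        "polarizing": spread >= polarizing_spread,
    }


# Two hypothetical images with the same average rating of 3.
agreed = summarize_ratings([3, 3, 3, 3, 3])  # everyone shrugs
split = summarize_ratings([1, 5, 1, 5, 3])   # love it or hate it

print(agreed["mean"], agreed["polarizing"])  # 3 False
print(split["mean"], split["polarizing"])    # 3 True
```

Both images average out to a 3, but only the first is one that nobody cares about.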
So a CNN, as we’re using it here, is a classification algorithm: it puts things in classes. But our data is set up to have discrete ordinal values. That is, they have a defined ranking, so a 3 is more like a 4 than a 1, and each value is not really its own class. So another adjustment is to combine values 4 and 5 under the class beautiful, and values 1 and 2 under the class ugly. We will strip photos with a value of 3 from the dataset, under the idea that they don’t provide useful data to the model. That is another assumption, but it also addresses the possibility of people rating things slightly differently. This turns the problem into a binary classification. An issue with this is that it whittles down our dataset to about 8,000 images.
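The relabeling described above can be sketched in a few lines of plain Python. This assumes each image carries a single whole-number rating from 1 to 5. The file names and the helper names `to_binary_label` and `build_binary_dataset` are hypothetical.

```python
def to_binary_label(rating):
    """Map a 1-5 rating to a class name, or None if the image is dropped."""
    if rating >= 4:
        return "beautiful"
    if rating <= 2:
        return "ugly"
    return None  # a rating of 3 is treated as carrying no useful signal


def build_binary_dataset(rated_images):
    """Keep only images that map to a class and attach the class label."""
    dataset = []
    for filename, rating in rated_images:
        label = to_binary_label(rating)
        if label is not None:
            dataset.append((filename, label))
    return dataset


# Hypothetical (filename, rating) pairs.
rated_images = [
    ("hummingbird.jpg", 5),
    ("boats.jpg", 4),
    ("parking_lot.jpg", 3),
    ("blurry_thumb.jpg", 1),
]

print(build_binary_dataset(rated_images))
# [('hummingbird.jpg', 'beautiful'), ('boats.jpg', 'beautiful'), ('blurry_thumb.jpg', 'ugly')]
```

Dropping the 3s is what shrinks the dataset, so it is worth counting how many images survive before training anything.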
So here are some of the ways we can now create a model based on our dataset. In particular, I want you to take notice of this “CNN knowledge required” section, and of the fact that there is a low option. Over the past year alone, modeling has become far more accessible. Services like AutoML and TensorFlow 2.0 can allow you to train, test, and put a model into production without really needing to know anything about tuning or testing. And this isn’t counting the tons of startups. Because of this increasing accessibility for non-devs, and the abstraction away of things like testing and tuning, the most important part of your model is the data cleaning, because this is the part that will be hardest to automate in the future. After going through the previous steps, it should be apparent that this is not a solid model. But I do have a working classification example of the medium option (TensorFlow) on my GitHub, linked in the references. Feel free to try it out if you feel like seeing what 5 random strangers would likely think about your selfie. I gotta say from experience, though, they are rather harsh. I would now go on to testing and tuning, but we’ve already established this is not a good model and needs a lot of work. And if testing came back as great, that should only increase suspicion. So really, this model is a dud, shouldn’t be used to make any real-life decisions, and needs a lot of work. Entertainment value only. But no worries, we’ve already worked through the most important aspects.
So, why walk through this setup if the model is bad and falls apart? One, because I want to normalize this experience as a part of machine learning. It’s easy to see all these great, fancy models in production and being talked about, but you often don’t get to see the background. A lot of the time, your model is not going to work. There won’t be enough data. The data that exists is incomplete or biased. Or you have fantastic, perfect data, and the model just doesn’t find anything interesting. It doesn’t latch on to anything. Two, I want to show how critical you need to be about the data going into your model, and how sensitive you need to be to recognizing bias. You gotta read the docs on where the data came from and ask yourself if what is happening behind the scenes is okay. Because once you put biased data in, it doesn’t matter what fancy algorithm you choose or how much time you spend tuning and adjusting, it’s going to be a biased model. Models are only as smart as what you put in. And three, to show how much time is spent on this aspect. Actually training, testing, tuning the model? Small potatoes. One reason I switched this part of my talk out is that I have worked on several data science projects with incredibly smart math PhDs, super intelligent people with extremely sophisticated models that, sadly, fell apart in production because they didn’t focus on this step. They took data accumulated by data engineers and immediately created a model around it without questioning anything about it. This is the most important part.
Back at this again. If you come away from this talk with anything, please make sure it is this. I think it is the most important/fundamental thing. That being said, there are currently existing models that are based on good data. Things like Google’s NIMA, currently being used to predict aesthetic quality of photographs.
Now that we’ve seen two poor examples of models, how about an example of a great one? And what’s on the horizon for future applications?
Well, here we have 2015 Yelp. Like a lot of search sites, when you search for a restaurant, images taken by members pop up and are one of the first things a user’s eyes are drawn to. The images shown are ranked, usually by quantifiable statistics like the number of views of a picture. So the more times people click on it, the faster it becomes the most readily viewable image. So, with these top pictures in mind, how do you feel about dining at the Country Way restaurant? Looks appetizing, right? They had two issues. 1. Users would click more often on fuzzy or blurry images to try to see them in a higher resolution. These clicks were falsely assumed to mean they liked the image. 2. Reinforcement bias. The pictures most clicked on were, surprise, the pictures that were already at the top, because they were viewable directly on the page.
Real life application of a CNN trained on not only the images themselves, but also additional data gleaned from them like aspect ratios. Much tastier right? Now the photos are ranked on aesthetic merit versus clicks. So with this example of a good model, what can we hope to look to in the future when it comes to aesthetic predictions?
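To see how the two rankings can disagree, here is a toy comparison in plain Python. The photo names, click counts, and aesthetic scores are invented for illustration, and `aesthetic_score` simply stands in for whatever a trained model would output. This is not Yelp’s actual pipeline.

```python
def rank_by(photos, key):
    """Return photo names sorted from highest to lowest value of `key`."""
    ordered = sorted(photos, key=lambda photo: photo[key], reverse=True)
    return [photo["name"] for photo in ordered]


# Hypothetical photos for one restaurant. The blurry photo has the most
# clicks because people open it to try to see it properly.
photos = [
    {"name": "blurry_plate.jpg", "clicks": 950, "aesthetic_score": 0.21},
    {"name": "menu_closeup.jpg", "clicks": 400, "aesthetic_score": 0.35},
    {"name": "plated_dish.jpg", "clicks": 120, "aesthetic_score": 0.88},
]

print(rank_by(photos, "clicks"))
# ['blurry_plate.jpg', 'menu_closeup.jpg', 'plated_dish.jpg']
print(rank_by(photos, "aesthetic_score"))
# ['plated_dish.jpg', 'menu_closeup.jpg', 'blurry_plate.jpg']
```

Ranking by a score that does not depend on clicks also breaks the reinforcement loop, since a photo no longer gains rank just by already being at the top.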
Most scenic route to work. The model has to know what you find scenic and be able to classify points of your route. Maybe you like the look of tight, bustling cities. Or maybe you like rolling hills. Maybe you just hate the color orange and you don’t want to see it anywhere on your trip. Best selfie.
Found that people are more likely to be loyal to a company if they value their aesthetics than if they like the actual attributes of the cars. And then they found this again when it came to phone sales. With this model, as consumers interact with our site, we can get valuable information about what type of images they prefer to look at.
When it comes to personalized CNNs, it’s pretty exciting stuff. If you can collect enough data on your opinions or thoughts on images, you can pretty much automate a lot of your life. Or, increase the visual appeal of it.
Airport security measures. I went through the airport recently and saw that there’s an individual having to look at each one. They suffer from fatigue.
Classify types of skin rashes, xrays. This is starting to be implemented, but hasn’t totally caught on. Let’s get even more general. How about machine learning in general. Including not just visual CNNs, but NNs, regressions, clustering?
Absolutely bonkers. Machine learning really is revolutionary. Really, think of any decision you make. It can probably be turned into a model that will make the decision for you and, given enough information, will likely make it better.
Recipes, making the perfect recipe.
Portrait of Edmond Belamy. Sold in October 2018. First AI art auctioned at an auction house. Fetched $432,500. It uses a generative adversarial network, which is a model that learns what something is like a CNN would, but then tries to create its own images and trick itself into thinking they’re real. Now we’re getting into questions like: can machines be artists? Creativity is considered a human pursuit, what makes us us. But if a machine can learn what that subjective concept is, what is stopping a machine from having creativity?
Tune and test itself. Clean and find its own data. That whole thing about you making the decision as to what is good and bad data? People are starting to look into how to automate that as well.
Classifying aesthetics. One of those terms that means something different to everyone. Define that it’s the artistic movement, not the meme. It is also currently used as a synonym for fashion style, but it’s not strictly that either. Also not really an art style? It can combine a bunch of senses at once and isn’t necessarily human made.
Most important part of machine learning: MAKE SURE YOU HAVE CLEAN DATA. If your model is going wrong, your first thought should always be about your data. Garbage in, garbage out. Your model means nothing if you’re feeding it biased, inaccurate, or too little good data. It sounds like a simple thing, but clean data is actually really hard to come by. You can pretty much guarantee that any dataset that is touched or filled in by a human will need to be examined closely, because there will be errors. We’re not machines. I know this talk is about modeling, but always remember that the most important part of this is cleaning your data: removing anomalies if needed, normalizing, the whole shebang. That is where most of your time will be spent.
This includes non-devs. We’re right at the point where you don’t really need to be able to code to be able to model. And it’s only going to get easier from here on out. As long as you can be skeptical with data going in and results coming out, you’re golden.
There’s so much to discover. The goal of this talk was to apply ML to a creative topic, dive in to how to do it, and put the tools in your hands to explore more. Maybe you don’t want to predict beauty, but you want to create a model that ______… For fun, think about decisions you make throughout the day. Do it. Make silly models. Try your hand at automating possible decisions in your life.
