Source: Statista.com
binge watch·ing
noun
the practice of watching multiple episodes of
a television program in rapid succession,
typically by means of digital streaming.
Netflix Report Reveals Pivotal Episodes
that “Hooked”* Viewers on Popular
Series
*“Hooked” indicates that 70% or more of viewers went on to complete the full first season after this episode
Source: Netflix
"Arrow" - Episode 8
"Bates Motel" - Episode 2
"Better Call Saul" - Episode 4
"Bloodline" - Episode 4
"BoJack Horseman" - Episode 5
"Breaking Bad" - Episode 2
"Dexter" - Episode 3
"Gossip Girl" - Episode 3
"Grace & Frankie" - Episode 4
"House of Cards" - Episode 3
"How I Met Your Mother" - Episode 8
"Mad Men" - Episode 6
"Marco Polo" - Episode 3
"Marvel’s Daredevil" - Episode 5
"Once Upon A Time" - Episode 6
"Orange is the New Black" - Episode 3
"Pretty Little Liars" - Episode 4
"Scandal" - Episode 2
"Sense8" - Episode 3
"Sons of Anarchy" - Episode 2
"Suits" - Episode 2
"The Blacklist" - Episode 6
"The Killing" - Episode 2
"The Walking Dead" - Episode 2
"Unbreakable Kimmy Schmidt" - Episode 4
Project Goal
Use Natural Language Processing tools and feature engineering
to identify elements of TV show scripts that “hook”
viewers and induce binge-watching.
Methods
• Use web scraping to collect TV show scripts
• Train word2vec model to create word embeddings for script text
goodbye pilot happy birthday look at that that is veggie
bacon believe it or not zero cholesterol you wont even
taste the difference what time do you think youll be home
same time i dont want him dicking you around tonight
you get paid till you work till no later hey happy birthday
well thank you youre late again there was no hot water
again i have an easy fix for that you wake up early and
then you get to be the first person in the shower i have an
idea how about buy a new hot water heater hows that
idea for the millionth and billionth time did you take your
echinacea yeah i think its getting better not me i want
real bacon not this fake crap too bad eat it this smells like
bandaids eat it so hows it feel to be old how does it feel
to be a smart ass good eat your veggie bacon you all set
yeah im fine all right see you at home okay see you
chemistry it is the study of what anyone ben chemicals
chemicals no chemistry is well technically chemistry is the
study of matter but i prefer to see it as the study of
change now just just think about this electrons they
change their energy levels molecules molecules change
their bonds elements they combine and change into
compounds well thats thats all of life right i mean its just
its the constant its the cycle its solution dissolution just
over and over and over it is growth then decay then
transformation it is fascinating really chad is there
something wrong with your table okay ionic bonds
• Isolate sections and calculate the average vector cosine distance from the remainder of the script
• Use tf-idf weighting and NMF to focus on key topics
• Search text for sentiment to find hypothesized “hook” moments
• Create features for various script elements based on genre (e.g., punctuation: ! ? …)
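A minimal sketch of the section-divergence step, under stated assumptions: the slide does not say how the section and remainder vectors were built, so this version averages the word vectors in each part and compares the two averages with cosine distance. The function names and the tiny 2-D `toy_embeddings` table are hypothetical stand-ins; in practice the vectors would come from the trained word2vec model.

```python
import math


def average_vector(words, embeddings):
    """Mean of the embedding vectors for the words that have one."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_distance(a, b):
    """1 minus the cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def most_divergent_section(words, embeddings, n_sections):
    """Split words into roughly equal sections and return (index, distance)
    for the section whose average vector is farthest from the rest."""
    if not words:
        raise ValueError("script is empty")
    size = math.ceil(len(words) / n_sections)
    sections = [words[i:i + size] for i in range(0, len(words), size)]
    best_index, best_distance = None, -1.0
    for i, section in enumerate(sections):
        rest = [w for j, s in enumerate(sections) if j != i for w in s]
        section_vec = average_vector(section, embeddings)
        rest_vec = average_vector(rest, embeddings)
        if section_vec is None or rest_vec is None:
            continue  # no known words on one side, nothing to compare
        distance = cosine_distance(section_vec, rest_vec)
        if distance > best_distance:
            best_index, best_distance = i, distance
    return best_index, best_distance


# Hypothetical 2-D embeddings: breakfast words vs. chemistry words.
toy_embeddings = {
    "bacon": [1.0, 0.1], "birthday": [0.9, 0.2], "shower": [0.8, 0.1],
    "electrons": [0.1, 1.0], "molecules": [0.2, 0.9], "bonds": [0.1, 0.8],
}
toy_words = ["bacon", "birthday", "shower",
             "bacon", "birthday", "shower",
             "electrons", "molecules", "bonds"]

index, distance = most_divergent_section(toy_words, toy_embeddings, 3)
print(index, round(distance, 2))
```

With this toy input the last section (the chemistry words) is flagged, since its average vector points away from the two breakfast sections. Averaging word vectors discards word order, which is a simplification rather than a claim about how the project handled it.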
Topics
• Supporting Character Development
• Joining Forces
• Demonstration of Leadership
Topic: Supporting Character Development
Ex) Mad Men Hook Episode (Season 1, Episode 6)
FRED
Your girl. Full of surprises. Oh
Pretty Peggy Sue.
DON
Peggy? If you say so. I avoid eye
contact to avoid being blinded by
the earnestness.
FRED
Actually, she really stood out—
brainstorming wise…While the rest
of the hens were busy tearing out
each other’s feathers, that one
saw the benefit, not the feature.
She said she didn’t want to be one
of a hundred colors in a box.
That’s interesting, right?
[t-SNE plot: 2D representation of word clusters, “Hook” scene vs. remainder of episode]
Topic: Joining Forces
Ex) Breaking Bad Hook Episode (Season 1, Episode 2)
JESSE
You're not You're not serious? You're
serious? Who's gonna do that? And
don't look at me!
WALT
I guess we'll both do it together.
JESSE
No, Mr. White, okay, I'm not good with
dead bodies. We're in this 50/50,
okay?
WALT
I guess the only other fair way to go
about this would be that one of us
deals with the body situation, while
the other one of us deals with the
Krazy-8 situation. In a scenario like
this, I don't suppose it is bad form
to just flip a coin.
[t-SNE plot: 2D representation of word clusters, “Hook” scene vs. remainder of episode]
Topic: Demonstration of Leadership
Ex) House of Cards Hook Episode (Season 1, Episode 3)
FRANK
You know what no one wants to
talk about.
Hate.
I know all about hate.
It starts in your gut, deep down
here, where it stirs and churns.
And then it rises.
Hate rises fast and volcanic.
It erupts hot on the breath.
Your eyes go wide with fire.
You clench your teeth so hard
you think they'll shatter.
I hate you, God.
I hate you! Oh, don't tell me
you haven't said those words
before.
I know you have.
We all have, if you've ever
felt so crushing a loss.
[t-SNE plot: 2D representation of word clusters, “Hook” scene vs. remainder of episode]
Conclusion
• Analysis led to 3 central themes in shows that “hook” viewers:
• Supporting character development
• Characters joining forces
• Demonstration of leadership
• word2vec proved to be a useful tool for measuring how far a section of text diverges from the rest of the script
• Incorporating sentiment analysis, tf-idf, and other script features helps target pivotal scenes
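To make the tf-idf part of the last point concrete, here is a small sketch that weights terms in tokenized scenes. It uses one common variant (term frequency times log of inverse document frequency); the slides do not state which variant or library the project used, and the `tfidf` and `top_terms` helpers and the three toy scenes are hypothetical.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return one {term: weight} dict per tokenized document.

    tf = count / document length, idf = log(N / document frequency).
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # count each term once per document
    weights = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def top_terms(weight_dict, k):
    """The k highest-weighted terms, ties broken alphabetically."""
    ranked = sorted(weight_dict.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Toy scenes for illustration only, not real script text.
scenes = [
    "we flip a coin we are in this together".split(),
    "we all know hate hate rises".split(),
    "we saw the benefit not the feature".split(),
]

weights = tfidf(scenes)
print(top_terms(weights[1], 1))
```

A word that appears in every scene ("we" here) gets a weight of zero, while a word repeated within a single scene ("hate") rises to the top. For topic modeling, weights like these would be arranged as a scene-by-term matrix and passed to an NMF implementation, which is not shown here.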
jmfradkin@gmail.com
linkedin.com/in/jamiefradkin
github.com/jmfradkin
jamiefradkin.wordpress.com
Thank you!
Jamie Fradkin

Metis Project 5: The (Data) Science of Binge Watching on Netflix


Editor's Notes

  • #2 Hi everyone, my name is Jamie, and today I’d like to share some insights from my project, which looked into the components of popular Netflix TV shows that might lead you to start binge-watching.
  • #3 It’s no big secret that the popularity of Netflix and other online streaming models has been rapidly increasing in recent years. But with key competition from HBO, Amazon, Hulu, and more, quality content is becoming increasingly important to retain and increase subscribers. One of the keys to Netflix’s popularity is
  • #5 the idea of binge-watching, which, for those of you who don’t know, is the indulgence in back-to-back episodes of a TV series. This has become an important feature in Netflix analytics.
  • #6 The inspiration for my project came from a recent Netflix release that identified, for many popular shows, the episode that “hooked” viewers, meaning that 70% or more continued on to complete the full first season after this episode. One interesting observation about this report is that the ‘hooked points’ varied regardless of the type of show (comedy vs. drama) or the number of episodes. Netflix representatives confirmed that, based on their analysis, the ‘hooked point’ was determined by the content itself.
  • #9 Armed with Natural Language Processing techniques and feature engineering methods, I set out to find elements in these TV episode scripts that “hook” viewers and contribute to the binge-worthiness of a series.
  • #10 I first collected scripts using Beautiful Soup and used word2vec to create embeddings of the text in each episode’s script. With this, I broke the script up into equal-sized parts and found the section with the largest cosine distance from the rest, indicating that the content had shifted in theme or tone, since I suspected that this would point to the climax of the episode that hooked viewers. I then incorporated an 8-parameter sentiment analysis tool and engineered features, such as length of dialogue, punctuation used, number of characters, and scene cuts, to find important sections within this subset. Finally, I used tf-idf weighting and topic modeling to compare themes between shows.
  • #17 Interestingly, my analysis led me to 3 central themes in these ‘hook’ episodes that engage viewers early on in a show’s first season.
  • #18 Supporting character development is one of the themes in “hook” episodes that recurred in my analysis across multiple shows. For example, in this episode of Mad Men, Peggy gains respect from the men in her office by impressing them with her marketing savvy during a product trial. The t-SNE plot shows a 2-D representation of how the word embeddings in this section of the script differ from those in the remainder of the episode.
  • #19 Another recurring theme that hooks viewers early in the season proved to be the joining together of two characters previously on separate paths. For example, in this “hook” episode of Breaking Bad, Walt and Jesse bond over the predicament of who will deal with the competing drug dealer they’ve kidnapped in Jesse’s house.
  • #20 Another important theme I found in this analysis is a demonstration of leadership through inspirational monologues. In this scene from House of Cards, Frank gives a sermon at a church in his hometown and delivers the famous line “I hate you, God”, referring to questioning your faith in times of despair.
  • #24 Though each Netflix show differs in content and format, the analysis I performed led to 3 themes in popular shows that could really pull viewers in and start a binge-watching phase. The word embeddings from word2vec proved a useful tool for finding portions of text that diverge from the rest of the script. Finally, working with sentiment analysis tools, tf-idf and topic modeling, and engineering new features for the scripts allowed me to find important scenes and compare these topics across shows.
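
The script features named in the methods note (length of dialogue, punctuation used, number of characters, and scene cuts) could be extracted along these lines. This is only a sketch under assumptions the notes do not confirm: speaker cues are taken to be lines written entirely in capitals, and scene cuts are taken to be lines starting with `INT.`, `EXT.`, or `CUT TO:`. The `script_features` function and its feature names are hypothetical.

```python
import re

# Assumed conventions, not confirmed by the source scripts.
SCENE_HEADING = re.compile(r"^(INT\.|EXT\.|CUT TO:)")
SPEAKER_CUE = re.compile(r"^[A-Z][A-Z .'\-]*$")


def script_features(script_text):
    """Count simple structural features in a plain-text script section."""
    speakers, dialogue, scene_cuts = [], [], 0
    for raw_line in script_text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if SCENE_HEADING.match(line):
            scene_cuts += 1
        elif SPEAKER_CUE.match(line):
            speakers.append(line)
        else:
            dialogue.append(line)
    text = " ".join(dialogue)
    return {
        "n_speakers": len(set(speakers)),   # distinct characters speaking
        "n_speeches": len(speakers),        # number of speaker turns
        "dialogue_words": len(text.split()),
        "exclamations": text.count("!"),
        "questions": text.count("?"),
        "ellipses": text.count("\u2026") + text.count("..."),
        "scene_cuts": scene_cuts,
    }


# Abridged from the Mad Men excerpt on the slide.
excerpt = """FRED
Your girl. Full of surprises. Oh
Pretty Peggy Sue.
DON
Peggy? If you say so. I avoid eye
contact to avoid being blinded by
the earnestness.
FRED
That's interesting, right?
"""

features = script_features(excerpt)
print(features)
```

The all-capitals rule is fragile: a one-word line of dialogue such as "I" would be misread as a speaker cue, so real scripts would need a stricter check against a known cast list.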