Movie datastructure2

Optimizing Profit
for the Movie Theatre
김가영
박수현
박현도
성훈
이재완
Keynote by : 진겸

> Which variables are needed (and can be used)
> Where to find databases for the variables we can actually use
> How to make use of the obtained variables
> What kind of analysis techniques we should use
> How to partition the screens in the cinema
> What kind of algorithm to use for movie allocation
> What kind of data structure to use for storing movies
> What kind of data structure to use for scheduling movies

1 Setting Goals
2 Fixing (reinitializing) variables
3 Analyzing and predicting the box office
4 Scheduling the movies

Profit to variety
Discard immeasurable states

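Variables
Director
Weather
Age rating
Public holidays
Political climate
Region
Number of seats
Review score
Actors
Time
Search volume
Distributor
Marketing and promotion
Genre
Country of origin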

Variables
Usable variables

Movie's Inherent Specification
Quantitative Value
Lead actor recognition
Director recognition
Distributor
Supporting actor recognition
Social Buzz
Audience rating
Franchise
Pre-release rating
Need to modify variables for actual analysis!
* variables will be continuously modified and changed
First-week audience after release
(revenue share)

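Actor recognition
- Assign a discrete score based on the audience counts (2 million) of the films the actor appeared in over the last 5 years.
- Assign a score by search-term ranking on portals, search engines, and social media.
- Assign a score based on the significant awards among the various film-related awards.
Director recognition
- Assign a discrete score based on the audience counts of the director's films over the last 5 years.
- Assign a score based on the significant awards among the various film-related awards.
Distributor
- Assign a discrete score based on the audience counts of the films it has released.
- More search and discussion needed
Franchise
- More search and discussion needed
Social Buzz
- Google/Naver Search Results
- Twitter API/Exclusive sites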
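The discrete scoring rule for recognition can be sketched as follows. This is only an illustration: it assumes the "2 million" figure is a per-film audience threshold, and the `Film` class, the `recognition_score` function, the score cap, and the sample films are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Film:
    title: str
    year: int
    audience: int  # total admissions (hypothetical sample values below)


def recognition_score(films, current_year, window=5,
                      threshold=2_000_000, max_score=5):
    """Count films from the last `window` years whose audience reached
    `threshold`, capped at `max_score` (all parameters are assumptions)."""
    recent = [f for f in films
              if current_year - window < f.year <= current_year]
    hits = sum(1 for f in recent if f.audience >= threshold)
    return min(hits, max_score)


films = [
    Film("A", 2016, 3_500_000),
    Film("B", 2015, 1_200_000),   # below the threshold
    Film("C", 2013, 2_000_000),
    Film("D", 2009, 9_000_000),   # outside the 5-year window
]
score = recognition_score(films, current_year=2016)
print(score)
```

The same function could score directors or distributors by passing their filmographies, although the slides leave the exact thresholds open.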

Variables need a well-defined scope, standard, and interval

Audience share
http://www.kofic.or.kr/kofic/business/infm/introBoxOffice.do
Rating
http://movie.naver.com/
Director recognition
http://www.kobis.or.kr/kobis/business/mast/peop/searchPeopleList.do
Actor recognition
http://www.kobis.or.kr/kobis/business/mast/peop/searchPeopleList.do
http://www.kobis.or.kr/kobis/business/mast/mvie/searchMovieList.do
Distributor-related
http://www.kobis.or.kr/kobis/business/mast/mvie/searchMovieList.do
Buzz volume
http://snsbuzz.com/m_index.php
https://www.tibuzz.co.kr/
Cinema admission ticket computer network (KOBIS)

How can we get meaningful results?

Regression Analysis

Linear regression
Logistic regression
Bass diffusion
Multiple regression

2-dimensional data
Linear, with a positive correlation

Linear Regression
Dependent variable (output)
Predicted value
Parameter vector (what we want to find)
Independent variable (data input)
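In standard notation, the labeled quantities fit together as shown here. The symbols $\hat{y}$ (predicted value), $\mathbf{x}$ (independent variable with a leading 1 for the bias), and $w_0$, $w_1$ (entries of the parameter vector) are assumed notation, not taken from the slides.

```latex
\hat{y} = \mathbf{w}^{\top}\mathbf{x} = w_0 + w_1 x_1,
\qquad \mathbf{x} = (1, x_1)^{\top},\quad \mathbf{w} = (w_0, w_1)^{\top}
```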

Multiple Linear Regression
When there are many variables…
Variable 1, Variable p
Coefficient 1, Coefficient 2
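A minimal sketch of multiple linear regression with p = 3 variables, using NumPy's least-squares solver. The data matrix, the "true" coefficients, and the intercept are made-up values chosen only so the fit can be checked; they are not from the slides.

```python
import numpy as np

# Hypothetical data: each row is a movie, each column one of p = 3 variables.
X = np.array([
    [3.0, 2.0, 1.0],
    [1.0, 4.0, 2.0],
    [2.0, 1.0, 5.0],
    [5.0, 3.0, 2.0],
    [4.0, 5.0, 3.0],
])
true_w = np.array([0.5, 0.2, 0.1])   # assumed coefficients
y = X @ true_w + 0.3                 # noiseless target with intercept 0.3

# Prepend a column of ones so the intercept is estimated as well.
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
intercept, weights = coef[0], coef[1:]
print(intercept, weights)
```

Because the sample target is noiseless, the solver recovers the assumed coefficients; real box-office data would leave residual error.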

Mean Squared Error
Optimization by differentiating with respect to w
Prediction, real value
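The step of differentiating the mean squared error with respect to w can be sketched as follows. Setting the gradient to zero gives the normal equations, which the code solves directly. The function names and the four data points are hypothetical.

```python
import numpy as np


def mse(w, X, y):
    """Mean squared error between predictions X @ w and real values y."""
    residual = X @ w - y
    return float(np.mean(residual ** 2))


def fit_normal_equation(X, y):
    # d/dw mean((Xw - y)^2) = (2/n) X^T (Xw - y) = 0  =>  X^T X w = X^T y
    return np.linalg.solve(X.T @ X, X.T @ y)


# Hypothetical data; the first column of ones carries the bias term.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([2.1, 3.9, 6.2, 7.8])

w = fit_normal_equation(X, y)
gradient = (2 / len(y)) * X.T @ (X @ w - y)   # should vanish at the optimum
print(w, mse(w, X, y), gradient)
```

For larger or ill-conditioned problems, a least-squares solver or gradient descent would usually be preferred over solving the normal equations directly.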

What if the data is not linear?
Nonlinear data can be linearized by raising the variables to powers.
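A small sketch of this idea: powers of a variable become extra columns, so the model stays linear in its coefficients and ordinary least squares still applies. The quadratic sample data is made up for illustration.

```python
import numpy as np

# Hypothetical nonlinear data: y is a quadratic function of x.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x - 0.5 * x ** 2

# Columns 1, x, x^2: nonlinear in x, but linear in the coefficients.
A = np.column_stack([x ** 0, x ** 1, x ** 2])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
print(coef)
```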

Regression Results
Movie 4 : 1.56
Movie 1 : 1.17
Movie 2 : 1.03
…
Movie 3 : 0.91

The results need to be sorted in order of score

Tournament Tree
Choosing the winner
Total sort time of O(n log n)
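A minimal sketch of sorting with a tournament (winner) tree: the root always holds the current winner, and after each extraction only the path from that leaf to the root is replayed, giving O(log n) per extraction and O(n log n) in total. The function name and the array layout are illustrative choices, not taken from the slides.

```python
import math


def tournament_sort(scores):
    """Return (name, score) pairs, highest score first, via a winner tree."""
    items = list(scores.items())
    n = len(items)
    if n == 0:
        return []
    size = 1
    while size < n:          # round the leaf count up to a power of two
        size *= 2
    empty = (-math.inf, -1)  # sentinel that loses every match
    tree = [empty] * (2 * size)
    for i, (_, s) in enumerate(items):
        tree[size + i] = (s, i)          # leaves store (score, item index)
    for node in range(size - 1, 0, -1):  # play the initial tournament
        tree[node] = max(tree[2 * node], tree[2 * node + 1])

    result = []
    for _ in range(n):
        _, idx = tree[1]                 # the root is the current winner
        result.append(items[idx])
        node = size + idx
        tree[node] = empty               # remove the winner's leaf
        node //= 2
        while node >= 1:                 # replay only the path to the root
            tree[node] = max(tree[2 * node], tree[2 * node + 1])
            node //= 2
    return result


scores = {"Movie 1": 1.17, "Movie 2": 1.03, "Movie 3": 0.91, "Movie 4": 1.56}
ranked = tournament_sort(scores)
print(ranked)
```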

Movie 1 : 1.56
Movie 2 : 1.17
Movie 3 : 1.03
…
Movie 1, Movie 2, Movie 3, Movie 4, ……, Movie n
31% 24% 20% 15%
90% cutting point
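One possible reading of the 90% cutting point is sketched below: convert the sorted scores into shares and keep movies from the top until the cumulative share reaches the cutoff. How the slide's percentages were actually derived is not stated, so the normalization, the function name, and the scores for "Movie 5" and "Movie 6" are assumptions.

```python
def cut_by_cumulative_share(ranked, cutoff=0.90):
    """ranked: (name, score) pairs sorted from highest to lowest score.
    Keep movies from the top until their cumulative share reaches `cutoff`."""
    total = sum(score for _, score in ranked)
    kept, cumulative = [], 0.0
    for name, score in ranked:
        if cumulative >= cutoff:
            break
        share = score / total        # assumed: share is the normalized score
        kept.append((name, share))
        cumulative += share
    return kept


# The first four scores mirror the slide; the last two are hypothetical.
ranked = [("Movie 1", 1.56), ("Movie 2", 1.17), ("Movie 3", 1.03),
          ("Movie 4", 0.91), ("Movie 5", 0.30), ("Movie 6", 0.20)]
kept = cut_by_cumulative_share(ranked)
print([name for name, _ in kept])
```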

Problems we might face
Modifying the data for analysis is too hard
R^2 is not high enough (lower than 0.65)
We simply cannot implement it
-> Mechanical Turk
Configure parameters on our own
Not enough training data
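The R^2 check mentioned above can be sketched with the usual definition of the coefficient of determination. The sample values and the helper name `r_squared` are hypothetical; the 0.65 threshold is the one from the slide.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Hypothetical real values and model predictions.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]

r2 = r_squared(y_true, y_pred)
good_enough = r2 >= 0.65   # threshold taken from the slide
print(r2, good_enough)
```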

Requisite Skillsets
TensorFlow
SciPy

Movie 1 Movie 2 Movie 3 Movie 4
31% 24% 20% 15%
Screen1 Screen2 Screen3 Screen4 Screen5 Screen6

Hypothesis
> People are typical and like to follow trends
(people in Gangnam)
> No special cases
> All days are the same.

Division by Time Zone
Time zones: peak time, afternoon, morning, late night, forenoon
A different weight is put on each time zone,
and a shelf algorithm is used to fit movies into each time zone.
More popular movies are also scheduled more often at peak times.
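Fitting movies into time zones can be sketched as first-fit packing, where each time zone acts like a shelf of fixed length. This is a one-dimensional simplification in the spirit of shelf packing, not the presenters' actual algorithm. The zone lengths, runtimes, and function name are hypothetical, and the time-zone weights are not modeled here.

```python
def pack_into_time_zones(showings, zones):
    """showings: (movie, runtime_minutes) pairs in priority order.
    zones: (zone_name, length_minutes) pairs; each zone is one 'shelf'.
    Each showing goes into the first zone that still has room."""
    schedule = {name: [] for name, _ in zones}
    remaining = {name: length for name, length in zones}
    unplaced = []
    for movie, runtime in showings:
        for name, _ in zones:
            if runtime <= remaining[name]:
                schedule[name].append(movie)
                remaining[name] -= runtime
                break
        else:
            unplaced.append((movie, runtime))   # no zone had enough room
    return schedule, unplaced


# Hypothetical zone lengths and runtimes, in minutes.
zones = [("morning", 180), ("afternoon", 240)]
showings = [("Movie 1", 120), ("Movie 2", 100),
            ("Movie 3", 90), ("Movie 1", 120)]

schedule, unplaced = pack_into_time_zones(showings, zones)
print(schedule, unplaced)
```

Listing the more popular movies first, and more than once, is one simple way to give them more showings; the leftover minutes in each zone correspond to the leftover time the slides discuss later.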

Division by Time Zone
Too complex
Need to vary the ratio of each movie per time zone
(Diagram: Movie1 to Movie4 placed across the time zones, with Movie1 in the peak time.)

Division by Screens (with variety of seats)
Seats per screen: 600 400 300 300 200 150
A different weight is put on each screen by actually changing its number of seats.
Then the movie with the most weight is placed in the screen with the most seats.
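A greedy sketch of this assignment follows: screens are visited from largest to smallest, and each goes to the movie with the largest unmet seat target. The seat counts are from the slide. Mapping the shares 31%, 24%, 20%, 15% to Movie 1 through Movie 4, turning shares into seat targets, and the function name are all assumptions.

```python
def assign_screens(shares, seats):
    """shares: movie -> weight; seats: seat count per screen (screen 1 first).
    Returns (screen_number, seat_count, movie) tuples, largest screen first."""
    total_seats = sum(seats)
    total_share = sum(shares.values())
    # Assumed rule: each movie's seat target is proportional to its share.
    remaining = {m: total_seats * s / total_share for m, s in shares.items()}
    assignment = []
    by_size = sorted(enumerate(seats, start=1), key=lambda p: -p[1])
    for screen, count in by_size:
        movie = max(remaining, key=remaining.get)   # largest unmet target
        assignment.append((screen, count, movie))
        remaining[movie] -= count
    return assignment


shares = {"Movie 1": 31, "Movie 2": 24, "Movie 3": 20, "Movie 4": 15}
seats = [600, 400, 300, 300, 200, 150]

assignment = assign_screens(shares, seats)
for screen, count, movie in assignment:
    print(f"Screen{screen} ({count} seats): {movie}")
```

With these inputs the movie with the most weight gets the 600-seat screen, as the slide describes, and the two smallest screens go to the movies that are still furthest below their targets.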

Division by Screens (with variety of seats)
(Diagram: Movie1 to Movie4 placed across the six screens of 600, 400, 300, 300, 200, and 150 seats.)
The problem with this is that "time" is not considered at all.

Combining time zone with screen division
On top of the partition by screens, we can create another layer of time zones.
According to the time zone, we can switch movies based on fixed algorithms.
(Diagram: Movie1 to Movie4 placed across the six screens of 600, 400, 300, 300, 200, and 150 seats, switching by time zone.)
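The layered structure can be sketched as a mapping from time zone to a per-screen list of movies. The switching rule used here (at peak time, the largest screens show the top movie) is a hypothetical stand-in for the "fixed algorithms" the slides leave open, and the base assignment is sample data.

```python
SEATS = [600, 400, 300, 300, 200, 150]          # seats per screen (from the slide)
TIME_ZONES = ["morning", "forenoon", "afternoon", "peak time", "late night"]


def build_schedule(base, peak_movie, peak_screens=2):
    """base: one movie per screen (index 0 is Screen1).
    Returns a dict mapping each time zone to its per-screen movie list."""
    largest = sorted(range(len(SEATS)), key=lambda i: -SEATS[i])[:peak_screens]
    schedule = {}
    for zone in TIME_ZONES:
        row = list(base)                 # start from the screen partition
        if zone == "peak time":
            # Hypothetical rule: the largest screens switch to the top movie.
            for i in largest:
                row[i] = peak_movie
        schedule[zone] = row
    return schedule


# Hypothetical base assignment by screen.
base = ["Movie 1", "Movie 2", "Movie 3", "Movie 4", "Movie 3", "Movie 2"]
schedule = build_schedule(base, peak_movie="Movie 1")
print(schedule["peak time"])
```

A dict of lists keeps the lookup by time zone and screen simple; a 2D array indexed by (time zone, screen) would serve equally well.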

More discussion is needed on
> How to split the partitions of screens
> What variables should be considered
> Which algorithm should be used on the structure
> What to do with leftover time
(how to use it effectively)
