The document discusses optimizing profit for a movie theater by scheduling movies across screens and time zones. It identifies key variables like actor popularity, director popularity, and social buzz that can be used to predict box office performance. Regression analysis techniques are proposed to analyze the data and predict revenues. Different algorithms for partitioning screens and scheduling movies across time zones are proposed, with the goal of maximizing profits through variety and popularity. More discussion is needed on implementation details.
2. > Which variables are needed (or can be used)
> Where to find a usable database of those variables
> How to make use of the obtained variables
> What kind of analysis techniques we should use
> How to partition the screens in the cinema
> What kind of algorithm to use for movie allocation
> What kind of data structure to use for storing movies
> What kind of data structure to use for scheduling movies
3. 1 Setting Goals
2 Fixing (reinitializing) variables
3 Analyzing and predicting the box office
4 Scheduling the movies
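11. Actor popularity
- Assign a discrete score based on the audience numbers (2 million) of the films they appeared in over the last 5 years.
- Assign a score by search-term ranking on portals, search engines, and social media.
- Assign a score based on significant awards among the various film-related awards.
Director popularity
- Assign a discrete score based on the audience numbers of their films over the last 5 years.
- Assign a score based on significant awards among the various film-related awards.
Distributor
- Assign a discrete score based on the audience numbers of the films it has released.
- More search and discussion needed
Franchise
- More search and discussion needed
Social Buzz
- Google/Naver Search Results
- Twitter API/Exclusive sites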
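One way the discrete audience-based score could be computed is sketched below. The slide does not say how the 2 million figure is used, so treating it as a per-film threshold is an assumption, and the function name and sample audience numbers are hypothetical.

```python
def discrete_audience_score(audiences, threshold=2_000_000):
    """Count the films whose audience reached the threshold.

    Assumption: the 2 million figure on the slide is a per-film cutoff.
    `audiences` holds the audience numbers of films from the last 5 years.
    """
    return sum(1 for audience in audiences if audience >= threshold)

# Hypothetical audience numbers for one actor's recent films
recent_films = [3_500_000, 1_200_000, 2_000_000]
actor_score = discrete_audience_score(recent_films)
```

The same function could be reused for directors and distributors, since their scores are described in the same way.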
25. What if data is not linear?
Nonlinear data can be linearized by raising variables to powers
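As a minimal sketch of this idea, assuming a single predictor and a known power (both are assumptions, as are the helper names `fit_line` and `fit_power` and the sample data), the variable can be raised to the power first and an ordinary least-squares line fitted to the transformed values:

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_power(xs, ys, power):
    """Linearize by replacing x with x**power, then fit a straight line."""
    return fit_line([x ** power for x in xs], ys)

# Hypothetical data that follows y = 3 * x**2 + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3 * x ** 2 + 1 for x in xs]
slope, intercept = fit_power(xs, ys, power=2)
```

In practice the right power is not known in advance, so several candidate powers would have to be tried and compared.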
26. Regression Results >
Movie 4 : 1.56
Movie 1 : 1.17
Movie 2 : 1.03
…
Movie 3 : 0.91
27. Regression Results >
The results need to be sorted in order
28. Regression Results >
The results need to be sorted in order
Tournament Tree
Choosing the winner
Total sort time of O(n log n)
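A tournament (winner) tree sort of the regression results might look like the sketch below. It is an illustration under assumptions, not the deck's implementation: the function name is hypothetical, and the scores are the ones listed on the earlier slide. Building the tree takes O(n) comparisons, and each of the n winners is extracted by replaying one leaf-to-root path of length O(log n), which gives the O(n log n) total.

```python
def tournament_sort(scores):
    """Sort (name, score) pairs in descending score order with a winner tree."""
    n = len(scores)
    if n == 0:
        return []
    size = 1
    while size < n:          # number of leaves, rounded up to a power of two
        size *= 2
    # tree[size + i] is leaf i; tree[1] holds the index of the overall winner.
    tree = [None] * (2 * size)
    for i in range(n):
        tree[size + i] = i

    def winner(a, b):
        """Return the index with the higher score; None marks an empty slot."""
        if a is None:
            return b
        if b is None:
            return a
        return a if scores[a][1] >= scores[b][1] else b

    for node in range(size - 1, 0, -1):      # play all matches bottom-up
        tree[node] = winner(tree[2 * node], tree[2 * node + 1])

    result = []
    for _ in range(n):
        best = tree[1]
        result.append(scores[best])
        node = size + best
        tree[node] = None                    # remove the winner's leaf
        node //= 2
        while node >= 1:                     # replay only the affected path
            tree[node] = winner(tree[2 * node], tree[2 * node + 1])
            node //= 2
    return result

results = [("Movie 1", 1.17), ("Movie 2", 1.03), ("Movie 3", 0.91), ("Movie 4", 1.56)]
ranked = tournament_sort(results)
```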
29. Regression Results >
Movie 1 : 1.56
Movie 2 : 1.17
Movie 3 : 1.03
…
Movie 1 : 31%
Movie 2 : 24%
Movie 3 : 20%
Movie 4 : 15%
90% Cutting point
……
Movie n
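The 90% cutting point can be sketched as a cumulative-share cutoff over the sorted results. Converting scores into shares by dividing by their sum is an assumption (the slides do not say how the percentages are derived), and the function name and the two smallest sample values are hypothetical.

```python
def select_by_cumulative_share(scores, cutoff=0.90):
    """Keep the top-ranked movies until their combined share reaches the cutoff.

    Assumption: a movie's share is its score divided by the sum of all scores.
    """
    ranked = sorted(scores, key=lambda pair: pair[1], reverse=True)
    total = sum(score for _, score in ranked)
    selected = []
    cumulative = 0.0
    for name, score in ranked:
        if cumulative >= cutoff - 1e-9:   # small tolerance for float rounding
            break
        share = score / total
        selected.append((name, share))
        cumulative += share
    return selected

# Hypothetical scores chosen so that the shares match the slide: 31%, 24%, 20%, 15%
scores = [("Movie 1", 31), ("Movie 2", 24), ("Movie 3", 20),
          ("Movie 4", 15), ("Movie 5", 6), ("Movie 6", 4)]
kept = select_by_cumulative_share(scores)
```

With these numbers the first four movies add up to exactly 90%, so the remaining movies fall below the cutting point and are not scheduled.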
30. Problems we might face
Modifying the data for analysis is too hard
R^2 is too low (below 0.65)
We simply cannot implement it (T_T)
-> Mechanical Turk
Configure parameters on our own
Not enough training data
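The R^2 check against the 0.65 threshold can be sketched as follows. The threshold comes from the slide; the function name and the sample numbers are made up for illustration.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

R2_THRESHOLD = 0.65  # value taken from the slide

# Hypothetical observed and predicted box-office scores
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
r2 = r_squared(actual, predicted)
model_is_acceptable = r2 >= R2_THRESHOLD
```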
33. Movie 1 : 31%
Movie 2 : 24%
Movie 3 : 20%
Movie 4 : 15%
Screen1 Screen2 Screen3 Screen4 Screen5 Screen6
34. Hypothesis
> People are typical and like to follow trends
(People in Gangnam)
> No special cases
> All days are the same.
35. Division by Timezone
Screen1 Screen2 Screen3 Screen4 Screen5 Screen6
Time zones: peak time, afternoon, morning, late night, forenoon
A different weight is put on each time zone, and a shelf algorithm is used to fit movies into each time zone.
More popular movies are also scheduled more often at peak times.
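The slides do not say which shelf variant is meant, so the sketch below is only a one-dimensional first-fit analogue: each screen acts as a shelf whose length is the duration of the time zone, and each showing goes onto the first screen that still has room. The function name, the running times, and the 240-minute time zone are all hypothetical.

```python
def shelf_pack(showings, zone_minutes, num_screens):
    """First-fit packing of showings into screens within one time zone.

    showings: list of (movie, runtime_minutes), most popular first.
    Returns (per-screen lists of movies, list of showings that did not fit).
    """
    shelves = [[] for _ in range(num_screens)]
    used = [0] * num_screens            # minutes already filled on each screen
    unplaced = []
    for movie, runtime in showings:
        for screen in range(num_screens):
            if used[screen] + runtime <= zone_minutes:
                shelves[screen].append(movie)
                used[screen] += runtime
                break
        else:                           # no screen had enough time left
            unplaced.append(movie)
    return shelves, unplaced

# Hypothetical peak-time zone of 240 minutes on two screens
showings = [("Movie1", 120), ("Movie2", 100), ("Movie1", 120), ("Movie3", 90)]
shelves, unplaced = shelf_pack(showings, zone_minutes=240, num_screens=2)
```

Listing a popular movie more than once in `showings` for the peak time zone is one simple way to express the extra weight it should receive there.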
36. Division by Timezone
[Figure: grid of Screen1 to Screen6 at peak time; most slots show Movie1, several show Movie2, and a few show Movie3 and Movie4]
Too complex
Need to vary the ratio of each movie per time zone
37. Division by Screens (with variety of seats)
Seats per screen: 600 400 300 300 200 150
A different weight is put on each screen by actually varying the number of seats.
We then sort the movies so that the movie with the most weight goes into the screen with the most seats.
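A minimal sketch of this matching, assuming one movie per screen and using the percentages from the earlier slide as weights (the function name is hypothetical): both lists are sorted in descending order and paired up.

```python
def assign_by_seats(movie_weights, screen_seats):
    """Pair the heaviest movie with the largest screen, and so on down.

    movie_weights: list of (movie, weight); screen_seats: dict of screen -> seats.
    Screens left over once the movies run out stay unassigned in this sketch.
    """
    movies = sorted(movie_weights, key=lambda m: m[1], reverse=True)
    screens = sorted(screen_seats.items(), key=lambda s: s[1], reverse=True)
    return {screen: movie for (screen, _), (movie, _) in zip(screens, movies)}

seats = {"Screen1": 600, "Screen2": 400, "Screen3": 300,
         "Screen4": 300, "Screen5": 200, "Screen6": 150}
weights = [("Movie3", 0.20), ("Movie1", 0.31), ("Movie4", 0.15), ("Movie2", 0.24)]
assignment = assign_by_seats(weights, seats)
```

How the two remaining screens should be filled is left open here, as it is in the slides.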
38. Division by Screens (with variety of seats)
Seats per screen: 600 400 300 300 200 150
[Figure: the six screens filled with Movie1 (3 slots), Movie2 (3 slots), Movie3 (2 slots) and Movie4 (1 slot)]
The problem with this approach is that "time" is not considered at all.
39. Combining time zone with screen division
On top of the partition by screens, we can create another layer of time zones.
According to the time zone, we can switch movies based on fixed algorithms.
Seats per screen: 600 400 300 300 200 150
[Figure: the six screens, each split into time zones and filled with Movie1 to Movie4]
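One possible way to layer time zones on top of the screen partition is sketched below. None of this is specified in the slides: ranking each (screen, time zone) slot by seats times zone weight, and giving each movie a number of slots proportional to its share, are both assumptions, and all names and numbers are hypothetical.

```python
def build_schedule(screen_seats, zone_weights, movie_shares):
    """Fill (screen, time zone) slots, best slots first, in proportion to share."""
    # Assumed slot value: number of seats multiplied by the time-zone weight.
    slots = sorted(
        ((screen, zone) for screen in screen_seats for zone in zone_weights),
        key=lambda slot: screen_seats[slot[0]] * zone_weights[slot[1]],
        reverse=True,
    )
    ranked = sorted(movie_shares, key=movie_shares.get, reverse=True)
    total = sum(movie_shares.values())
    # Largest-remainder rounding so the quotas add up to the number of slots.
    exact = {m: len(slots) * movie_shares[m] / total for m in ranked}
    quota = {m: int(exact[m]) for m in ranked}
    leftover = len(slots) - sum(quota.values())
    by_remainder = sorted(ranked, key=lambda m: exact[m] - quota[m], reverse=True)
    for m in by_remainder[:leftover]:
        quota[m] += 1
    schedule = {}
    remaining = iter(slots)
    for movie in ranked:                 # most popular movie takes the best slots
        for _ in range(quota[movie]):
            schedule[next(remaining)] = movie
    return schedule

schedule = build_schedule(
    screen_seats={"Screen1": 600, "Screen2": 400},
    zone_weights={"peak": 1.0, "morning": 0.5},
    movie_shares={"Movie1": 0.5, "Movie2": 0.25, "Movie3": 0.25},
)
```

Keying the schedule by (screen, time zone) also suggests one candidate data structure for storing the result, with popular movies naturally landing in large screens at peak time.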
40. More discussion is needed on
> How to partition the screens
> What variables should be considered
> Which algorithm should be used on the structure
> What to do with leftover time
(how to use it effectively)