Film Big Data Visualization
Based on D3.js
By
ABDUL VAHED SHAIK
016452540
SAN JOSE STATE UNIVERSITY
Abstract
• D3.js is used to visualize a large body of film data so that it is
easier to understand and mine. After a crawler collects information
about the films released in China in 2019, D3.js presents the data as a
histogram, doughnut chart, force-directed graph, map, and word cloud.
Rich interactive functions help people find the data they need quickly.
The final analysis can guide viewers in choosing films and offer
decision support for the Chinese film industry. The study suggests that
D3.js is flexible and reliable for visualization tasks in big data
processing, with high speed and low cost.
Introduction
• As living standards and cultural levels have risen, people's interest in film has grown,
which reflects an innate desire to document daily life. Beyond rest and entertainment,
watching films offers people a subjective experience. The National Film Administration
reported that the total box office in 2019 was 64.266 billion yuan, an increase of 5.4%
over the previous year; the total box office for domestic films was 41.175 billion yuan,
an increase of 8.65%. China's film industry is clearly thriving.
• The volume of resources and information in the big data era is growing at an astounding
rate, and visualization remains among the most efficient ways for researchers to take in
data from the outside world. Analysts can often grasp the underlying information quickly
when data is presented in visual form [1,2]. Data visualization uses computer graphics,
image processing, and other technologies to transform data into images that can be
interacted with, making information and trends easier to explain [3]. To support more
in-depth observation and analysis, this work uses film-related data from Douban Movie
and Endata to visualize film data along several dimensions with D3.js.
Introduction to D3.js
• D3 stands for Data-Driven Documents. D3.js is a JavaScript library for
displaying data dynamically. It can load data into the browser's memory
and bind it to the DOM (Document Object Model). By manipulating the
document through HTML, SVG, and CSS (the elements rendered in the
browser), it applies data-driven transformations to the document to
produce visualization effects.
• D3.js complies with Web standards and has excellent browser
compatibility. Users only need to include the D3.js source file in the HTML
<head> tag and are not tied to a proprietary framework. They can combine
data-driven DOM manipulation with robust visualization components [5].
D3.js makes drawing simpler than working with raw SVG or Canvas directly,
and it is more flexible and extensible than ECharts and other open-source
visualization tools.
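
The link between data and DOM elements described above is what the D3 documentation calls a data join. The sketch below illustrates the idea in plain JavaScript, without D3 or a browser, under the assumption that data items are matched to existing elements by index. The function `joinByIndex` and the sample arrays are hypothetical names introduced only for illustration; this is not D3's own implementation.

```javascript
// Minimal sketch of an index-based data join (not D3's actual implementation).
// `elements` stands in for existing DOM nodes, `data` for the loaded dataset.
function joinByIndex(elements, data) {
  const shared = Math.min(elements.length, data.length);
  return {
    // Existing elements that receive a datum: their attributes get updated.
    update: elements.slice(0, shared).map((el, i) => ({ el, datum: data[i] })),
    // Data values without an element: new elements must be created for them.
    enter: data.slice(shared),
    // Elements without a datum: candidates for removal.
    exit: elements.slice(shared),
  };
}

// Hypothetical example: two elements already on the page, three data values.
const joined = joinByIndex(["rect#a", "rect#b"], [10, 20, 30]);
console.log(joined.enter); // data values that still need an element
```

Splitting the result into update, enter, and exit groups is what lets a chart add, restyle, or remove elements so that the page always matches the length of the bound data.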
Overall design
• Visualization: After extracting and cleaning the necessary data from
the MySQL database, use D3.js to display it as a histogram, doughnut
chart, map, word cloud, or force-directed graph.
• Graphical interaction: Visual presentation and interactivity are the
two main components of data visualization. Interaction does more than
ease the conflict between data overload and limited screen space: it
lets users take an active part in building mental models, which helps
them understand the data and spot trends. This work implements
interactivity through hover tooltips, search, button toggles, link
jumps, and other techniques.
Implementation
• Python web crawlers gather information from Douban Movie and Endata
about films released between January 1 and December 31, 2019, as
well as the top 100 films by box office in the Chinese mainland, and
store it all in a MySQL database. Figure 1 shows the raw data. The
visualization system draws charts with D3.js, using the Bootstrap
framework on the front end and the lightweight Python web framework
Flask on the back end.
Annual box office review
The annual box office overview has two parts, shown as a histogram and a doughnut chart.
The first part covers the top 10 films at the box office in 2019; the second covers the
box office of theater chains for the entire year.
The histogram of the top 10 box office hits of 2019 is created with D3.js in the
following steps:
1. Load and define the dataset: Extract the names of the qualifying films and their box
office figures from the database and send them to the front end. Store the dataset in the
browser's workspace under the name "dataset1".
2. Bind data: Join "dataset1" to the designated document elements. Match the number of
elements to the length of the bound data, adding new elements where necessary. Create a
<rect> tag with the class "cube" for each piece of data so that each value corresponds to
a rectangle on the page.
3. Transform attributes: Control how each element is drawn by setting and changing its
attributes. Set the width and height of "rect.cube" dynamically from the data, and set
the fill color and other properties, such as the rectangle's x and y coordinates, at the
same time.
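
The attribute step above can be sketched as a small layout calculation. The plain JavaScript below, which uses no D3 and no DOM, computes the x, y, width, and height that each "rect.cube" would receive. The helper names `linearScale` and `layoutBars`, the chart dimensions, and the film values are all hypothetical placeholders, not the 2019 figures.

```javascript
// Illustrative placeholder values only; not the actual 2019 box office figures.
const dataset1 = [
  { name: "Film A", boxOffice: 50 },
  { name: "Film B", boxOffice: 30 },
  { name: "Film C", boxOffice: 20 },
];

// Linear scale: maps values in [0, domainMax] onto [0, rangeMax].
function linearScale(domainMax, rangeMax) {
  return (value) => (value / domainMax) * rangeMax;
}

// Compute the rectangle attributes for one bar per datum.
function layoutBars(data, chartWidth, chartHeight, gap) {
  const maxValue = Math.max(...data.map((d) => d.boxOffice));
  const toHeight = linearScale(maxValue, chartHeight);
  const step = chartWidth / data.length; // horizontal slot per bar
  return data.map((d, i) => {
    const height = toHeight(d.boxOffice);
    return {
      name: d.name,
      x: i * step,
      y: chartHeight - height, // SVG y grows downward, so taller bars start higher
      width: step - gap,
      height,
    };
  });
}

const bars = layoutBars(dataset1, 300, 200, 10);
console.log(bars);
```

In a D3 chart these computed values would typically be passed to the attribute setters of the bound rectangles; here they are simply returned as plain objects so the arithmetic can be inspected.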
The doughnut chart is built in a similar way to the
histogram, but it focuses on the layout and on converting
values into angles and arc paths. It is used here to show the
box office of theater chains. Figure 2 shows the finished
doughnut chart. In 2019, cinemas earned 64.123 billion yuan
in total. Wanda Cinemas accounted for the largest share,
followed closely by Shanghai United, Southern Shinkansen,
Jinyi Zhujiang, Hengdian, Omnijoi, Huaxia United, Guangdong
Dadi, China Film Group Corporation, and China Film Stellar.
Together, these ten well-known theater chains made up 67.68%
of the total.
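
The conversion from values to angles mentioned above can be illustrated with a short sketch in plain JavaScript, without D3's own layout helpers. The function `computeAngles` is a hypothetical name; the two slices use the 67.68% share stated in the text and treat everything else as a single remainder slice, which is a simplification made for this example.

```javascript
// Convert slice values into start/end angles (radians) around a full circle.
function computeAngles(slices) {
  const total = slices.reduce((sum, s) => sum + s.value, 0);
  let start = 0;
  return slices.map((s) => {
    const end = start + (s.value / total) * 2 * Math.PI;
    const arc = { label: s.label, startAngle: start, endAngle: end };
    start = end; // the next slice begins where this one ends
    return arc;
  });
}

// Shares from the text: the ten largest chains hold 67.68% of the total.
// Grouping the rest into one slice is an assumption made for illustration.
const arcs = computeAngles([
  { label: "Top ten chains", value: 67.68 },
  { label: "All other chains", value: 32.32 },
]);
console.log(arcs);
```

Each start/end pair describes one ring segment; an arc path generator would then turn those angles, together with an inner and outer radius, into the SVG path for the segment.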
Box office in China's provinces:
First, define the map projection using the
Mercator projection algorithm. Next, define the
path generator. Using the preset projection, it
converts the GeoJSON map data described below
into sequences of pixel coordinates that form
closed map regions on the web page.
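
As a rough illustration of what the projection and path generator do, the sketch below implements the spherical Mercator formula and builds a closed SVG path string in plain JavaScript. It does not use D3's geographic module. The function names, the `scale`, `centerLon`, `tx`, and `ty` parameters, and the square ring are hypothetical and do not represent real province geometry.

```javascript
// Spherical Mercator: longitude/latitude in degrees -> screen coordinates.
// `scale`, `centerLon`, `tx`, and `ty` are hypothetical tuning parameters.
function mercator(lonDeg, latDeg, { scale = 100, centerLon = 0, tx = 0, ty = 0 } = {}) {
  const toRad = Math.PI / 180;
  const lambda = (lonDeg - centerLon) * toRad;
  const phi = latDeg * toRad;
  const x = lambda;
  const y = Math.log(Math.tan(Math.PI / 4 + phi / 2));
  // Screen y grows downward, so the projected y is subtracted.
  return [tx + scale * x, ty - scale * y];
}

// Project one ring of [lon, lat] pairs and build a closed SVG path string.
function ringToPath(ring, options) {
  const points = ring.map(([lon, lat]) => mercator(lon, lat, options));
  return "M" + points.map(([x, y]) => `${x.toFixed(1)},${y.toFixed(1)}`).join("L") + "Z";
}

// Hypothetical square ring, not real province geometry.
const pathData = ringToPath(
  [[100, 30], [110, 30], [110, 40], [100, 40]],
  { scale: 500, centerLon: 105, tx: 400, ty: 300 }
);
console.log(pathData);
```

The trailing "Z" closes the outline, which is what makes each region a fillable area rather than an open line.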
To obtain the map objects, use the 'd3.json'
method to load the GeoJSON data, which contains
outline information for all of China's provinces,
municipalities, and counties. So that the map of
China's territory is complete, the 'd3.xml' method
requests an SVG file of the South China Sea map,
which is added to the drawing area. A path element
is created in the SVG for each province to describe
its outline, and each element is bound to its data,
so every province on the map corresponds to one
path object. When filling the regions, use the
'linear()' method so that the color is linearly
related to the corresponding box office. Set the
font size, center, and other properties, then use
CSS to style the map's borders.
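
The linear relationship between box office and fill color can be sketched as a simple interpolation between two colors. This plain JavaScript example does not use D3's scale module. The function `linearColorScale`, the domain of 0 to 100, and the white-to-red endpoints are assumptions chosen for illustration, since the slides do not state the actual colors or value range.

```javascript
// Linear interpolation between two RGB colours, driven by a numeric value.
function linearColorScale([min, max], [from, to]) {
  return (value) => {
    // Clamp to the domain so out-of-range values do not overshoot the colours.
    const t = Math.min(1, Math.max(0, (value - min) / (max - min)));
    const channel = (i) => Math.round(from[i] + t * (to[i] - from[i]));
    return `rgb(${channel(0)}, ${channel(1)}, ${channel(2)})`;
  };
}

// Hypothetical domain and colour endpoints (white for low, red for high).
const fill = linearColorScale([0, 100], [[255, 255, 255], [255, 0, 0]]);
console.log(fill(0), fill(50), fill(100));
```

A province with a higher box office therefore receives a more saturated fill, so regional differences are visible at a glance.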
To see how the box office grew in 2019
compared to 2018, users can switch the map
view with a button. Figure 4 shows that all
provinces except the three northeastern
provinces ("Hei Ji Liao") are growing,
although the size of the increase varies. In
Heilongjiang, Jilin, and Liaoning, slow
economic growth, which is closely linked to
the severe loss of the young population, is
the main cause of the negative box office
growth.
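
The year-over-year comparison behind this view is a simple growth-rate calculation. The sketch below shows it in plain JavaScript. The function `growthRate` and the provincial figures are hypothetical placeholders in arbitrary units and are not taken from the dataset.

```javascript
// Year-over-year growth rate as a fraction: (current - previous) / previous.
function growthRate(previous, current) {
  return (current - previous) / previous;
}

// Hypothetical provincial figures in arbitrary units, for illustration only.
const provinces = [
  { name: "Province A", y2018: 100, y2019: 108 },
  { name: "Province B", y2018: 50, y2019: 48 },
];

const growth = provinces.map((p) => ({
  name: p.name,
  rate: growthRate(p.y2018, p.y2019),
  declining: p.y2019 < p.y2018, // flags provinces with negative growth
}));
console.log(growth);
```

The resulting rate, rather than the raw box office, would then drive the map coloring when the growth view is selected.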
The growth of the Chinese film
industry has been highly uneven, with large
regional differences. All of the top ten
provinces have large populations and
economies, which suggests that the growth
of the film market is influenced by the state
of the economy. If the economy lags, the
film market may not be able to grow
steadily.
Box Office Ranking in the Chinese mainland
Figure 6 shows the word cloud for the top
100 films by box office in mainland China.
The text size is directly proportional to the
box office. The films in Figure 6 are colored
blue, orange, green, red, purple, and brown
to indicate a box office of 5 billion, 4 billion,
3 billion, 2 billion, 1 billion, and less than
1 billion yuan. Clicking a film opens a linked
histogram of its weekly box office revenue,
where users can see its official release
window and the box office for each week.
Figure 7 shows the weekly box office of
"Wolf Warrior 2".
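
The sizing and coloring rules for the word cloud can be sketched as two small functions in plain JavaScript. The pairing of thresholds with colors follows the order in which they are listed above, which is an assumption, as is treating each threshold as a lower bound. The names `tierColor` and `fontSize`, the `pxPerBillion` factor, and the sample films are hypothetical.

```javascript
// Box office thresholds in billions of yuan, highest first, paired with colours.
// The threshold-to-colour pairing follows the listing order in the text (assumed).
const tiers = [
  { min: 5, color: "blue" },
  { min: 4, color: "orange" },
  { min: 3, color: "green" },
  { min: 2, color: "red" },
  { min: 1, color: "purple" },
  { min: 0, color: "brown" },
];

// Return the colour of the first tier whose lower bound the value reaches.
function tierColor(boxOfficeBillion) {
  return tiers.find((t) => boxOfficeBillion >= t.min).color;
}

// Font size directly proportional to box office; pxPerBillion is hypothetical.
function fontSize(boxOfficeBillion, pxPerBillion = 12) {
  return boxOfficeBillion * pxPerBillion;
}

// Placeholder films, not real box office figures.
const words = [
  { text: "Film A", boxOffice: 5.2 },
  { text: "Film B", boxOffice: 0.8 },
].map((w) => ({ ...w, size: fontSize(w.boxOffice), color: tierColor(w.boxOffice) }));
console.log(words);
```

A word cloud layout would then position these sized and colored words so that they do not overlap; that placement step is outside the scope of this sketch.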
An analysis of the weekly box office for
these 100 films shows that the peak
frequently occurs in the first or second week
after a film's official release. Besides the
film's promotion, this pattern is closely tied
to its release schedule and its actual
content. In general, the highest box office
openings tend to belong to franchise films
and films based on well-known IP, such as
"Fast & Furious" and the Marvel series, and
to films released for special occasions, such
as "My People, My Country," "The Captain,"
and other films made to commemorate the
70th anniversary of the founding of the
People's Republic of China.
Figure 8 shows a word cloud of the top 100
films by box office in mainland China,
colored by Douban rating. Blue, orange,
green, red, purple, and brown distinguish
ratings above 9.0, 8.0, 7.0, 6.0, and 5.0
points and ratings below 5.0 points. Films
with successful box office performances
typically also have a strong reputation.
These films are not only commercially
successful; they also appeal to a broad
audience because of their compelling plots
and artistic value.
"Film-Director-Actor" relationship
Figure 9 shows the finished
force-directed graph of the "Film-Director-
Actor" relationship, which makes it quick to
associate films, directors, and actors. Node
colors such as green, blue, and red
distinguish actors, directors, and films. The
graph has three display modes: node, text,
and hidden. The first mode shows the
connected entities as nodes, and the second
shows them as text. The third mode shows
only the data related to the node currently
under the pointer and hides the rest.
Figure 10 shows the node
information related to "Huang Bo". It shows
that in 2019 Huang Bo worked as both a
director and an actor on the two films "My
People, My Country" and "Gone With the
Light".
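
The data preparation behind such a graph, and the filtering used by the hidden mode, can be sketched in plain JavaScript without D3's force simulation. The record shape, the function names `buildGraph` and `neighbourhood`, and the film and person names are hypothetical placeholders and are not taken from the dataset.

```javascript
// Placeholder records; titles and names are hypothetical.
const films = [
  { title: "Film A", directors: ["Person X"], actors: ["Person X", "Person Y"] },
  { title: "Film B", directors: ["Person Z"], actors: ["Person X"] },
];

// Build unique nodes and film-person links for a force-directed layout.
function buildGraph(records) {
  const nodes = new Map();
  const links = [];
  const addNode = (id, type) => {
    if (!nodes.has(id)) nodes.set(id, { id, types: new Set() });
    nodes.get(id).types.add(type); // one person may be both director and actor
  };
  for (const film of records) {
    addNode(film.title, "film");
    for (const name of film.directors) {
      addNode(name, "director");
      links.push({ source: film.title, target: name, role: "director" });
    }
    for (const name of film.actors) {
      addNode(name, "actor");
      links.push({ source: film.title, target: name, role: "actor" });
    }
  }
  return { nodes: [...nodes.values()], links };
}

// "Hidden" mode: keep only the focused node and its direct neighbours.
function neighbourhood(graph, focusId) {
  const links = graph.links.filter((l) => l.source === focusId || l.target === focusId);
  const ids = new Set([focusId, ...links.flatMap((l) => [l.source, l.target])]);
  return { nodes: graph.nodes.filter((n) => ids.has(n.id)), links };
}

const graph = buildGraph(films);
console.log(neighbourhood(graph, "Person X"));
```

Storing the roles as a set on each node is one way to represent a person who appears as both director and actor; the node color could then be chosen from that set.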
CONCLUSION
• This document describes the use of D3.js in data visualization and
analysis. By displaying and evaluating the visualization results for the
2019 film data and the top 100 films in the history of the Chinese
mainland box office, it can help film companies judge the direction of
film production and provide data references for film producers.
REFERENCES
• 1. Ren L. Research on interaction techniques in information visualization [Ph.D.
Thesis]. Beijing: The Chinese Academy of Sciences, 2009. (In Chinese)
• 2. Card SK, Mackinlay JD, Shneiderman B. Readings in Information Visualization:
Using Vision To Think. San Francisco: Morgan Kaufmann Publishers, 1999. 1-712.
• 3. Keim D, Andrienko G, Fekete J, Görg C, Kohlhammer J, Melancon G. Visual
analytics: Definition, process, and challenges. In: Kerren A, ed. Proc. of the
Information Visualization. LNCS 4950, Berlin: Springer-Verlag, 2008. 154-175.
[doi: 10.1007/978-3-540-70956-5_7]
• 4. Zhao Cong. Application research of visualization library D3.js[J]. Information
Technology and Informatization, 2015(02):107-109. (In Chinese)
• 5. Krill P. D3.js JavaScript data visualization package goes modular[J].
InfoWorld.com, 2016.
THANK YOU
