Tiktok and Education: Discovering
Knowledge through Learning Videos.
Carlos Fiallos
Angel Fiallos
Stalin Figueroa
July 2021
Agenda
• Introduction
• Objective
• Methodology
• Data Collection
• Video Indexer Process.
• Exploratory and Demographic Analysis
• Multi-Label Categorization
• Results
• Conclusions
Introduction
TikTok is considered a social media
platform like Twitter and Instagram.
However, TikTok offers users a
unique way to share creative videos of
themselves, their surroundings, or a
collection of external audiovisual content.
Introduction
TikTok launched the LearnOnTikTok
program, which consists of
educational videos to facilitate
learning during COVID-19
lockdowns. These videos are
authored by professionals, students,
and other users, who have shared
their knowledge with this social
network's audiences.
Objective
This study aims to discover the types of knowledge and learnings
shared in the #learnontiktok campaign, using a framework that
integrates computer vision, natural language processing, and
machine learning techniques.
Methodology
The pipeline starts with data
collection from the TikTok
platform. It continues
with computer vision
processes and finishes with
a multi-label text classification
model that predicts
the knowledge areas of the
educational videos.
Data Collection
We collected the data with scraping
algorithms developed for this purpose.
We then selected a sample of 1,495
TikTok posts tagged #learnontiktok
and obtained their metadata:
video file, post description,
like count, date, view count,
author information,
and profile picture.
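
As a rough illustration of the sample selection described above, the sketch below models one post's metadata and keeps posts that carry the hashtag and fall inside the collection period given in the speaker notes (June 2020 to January 2021). The class name, field names, helper functions, and the exact first and last days of the window are assumptions made for this example; they are not the authors' scraper.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class TikTokPost:
    # Hypothetical field names that mirror the metadata listed on the slide.
    video_file: str
    description: str
    likes: int
    views: int
    posted_on: date
    author: str
    profile_picture: str


# The notes give only months; the exact boundary days are an assumption.
WINDOW_START = date(2020, 6, 1)
WINDOW_END = date(2021, 1, 31)


def in_window(post):
    """Return True when the post date falls inside the collection window."""
    return WINDOW_START <= post.posted_on <= WINDOW_END


def has_hashtag(post, tag="#learnontiktok"):
    """Case-insensitive check for the campaign hashtag in the description."""
    return tag in post.description.lower()


def select_sample(posts):
    """Keep posts that carry the hashtag and were published in the window."""
    return [p for p in posts if has_hashtag(p) and in_window(p)]
```

Keeping the record as a small dataclass makes it easy to pass the same object to the later video, face, and text steps.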
Video Indexer Process.
Azure Video Analyzer for Media is a cloud
application that is part of Azure Applied AI
Services. Its API extracts insights
from videos using the Video Analyzer for
Media video and audio models.
We also used an Azure Optical Character
Recognition (OCR) service to extract
text from specific images.
Exploratory and Demographic
Analysis
We used the images
from the original videos
that showed the users'
faces and processed them
with the Azure Face API,
which allowed us to infer
gender and age.
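
The speaker notes add that only photos with an exposure value above 0.5 and with detectable gender and age were kept. A minimal sketch of that filter follows. The JSON layout (`faceAttributes`, `exposure.value`, `gender`, `age`) is an assumption about the shape of the face-detection response, the function name is hypothetical, and the sample values are invented for illustration.

```python
import json

EXPOSURE_THRESHOLD = 0.5  # threshold stated in the speaker notes


def usable_faces(response_json):
    """Keep faces whose exposure exceeds the threshold and whose
    gender and age attributes are both present."""
    kept = []
    for face in json.loads(response_json):
        attrs = face.get("faceAttributes", {})
        exposure = attrs.get("exposure", {}).get("value")
        gender = attrs.get("gender")
        age = attrs.get("age")
        if exposure is None or gender is None or age is None:
            continue  # attribute could not be detected
        if exposure > EXPOSURE_THRESHOLD:
            kept.append({"gender": gender, "age": age})
    return kept


# Invented sample response, for illustration only.
sample = json.dumps([
    {"faceAttributes": {"gender": "female", "age": 27.0,
                        "exposure": {"value": 0.71}}},
    {"faceAttributes": {"gender": "male", "age": 31.0,
                        "exposure": {"value": 0.32}}},
    {"faceAttributes": {"exposure": {"value": 0.90}}},
])
```

With this sample, only the first face passes: the second is below the exposure threshold and the third has no gender or age.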
Multi-Label Categorization
• We designed a simple CNN composed of an input layer and a convolution
layer on top of word vectors obtained from a Word2Vec model. The training dataset
consisted of text sentences from Wikipedia tagged in 20 specific areas.
Figure: Excerpt from the Wikipedia sentences dataset
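
According to the speaker notes, the network predicts a 0 or 1 value for every label and uses the confidence values to produce a ranking. The sketch below shows only that final scoring step, assuming one independent sigmoid per label and a 0.5 decision threshold. Both are assumptions, as are the function names and the example scores; the convolutional layers themselves are not reproduced here.

```python
import math


def sigmoid(x):
    """Map a raw score to a confidence between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def rank_labels(raw_scores, threshold=0.5):
    """Return (ranking, predicted) where ranking is a list of
    (label, confidence) pairs sorted from most to least confident and
    predicted holds the labels whose confidence reaches the threshold."""
    ranking = sorted(
        ((label, sigmoid(score)) for label, score in raw_scores.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
    predicted = [label for label, conf in ranking if conf >= threshold]
    return ranking, predicted


# Invented raw scores for three of the knowledge areas named in the notes.
example_scores = {"medicine": 2.0, "chemistry": 0.3, "legal": -1.5}
ranking, predicted = rank_labels(example_scores)
```

Because each label gets its own sigmoid, a video can be assigned several knowledge areas at once, which is what distinguishes multi-label from single-label classification.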
Results
The Face API process
was applied to the
TikTok profile photos
to recognize
facial properties. The
following results show
the percentages of users
by gender
and by age
range.
Percentages of detected gender.
Percentages of detected age ranges.
Results
The first figure shows a word cloud with the
most relevant terms from the video
descriptions written by the authors.
The second figure shows the tags for the
elements that the video
indexer process identified in
each of the videos.
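
The speaker notes state that only terms tagged as nouns by part-of-speech tagging libraries were used for the word cloud. The sketch below counts noun frequencies from descriptions that have already been tagged, assuming Penn Treebank tags. The tagger itself is left out, and the function name and sample tokens are invented for illustration.

```python
from collections import Counter

# Penn Treebank noun tags: singular, plural, proper singular, proper plural.
NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}


def noun_frequencies(tagged_descriptions):
    """Count lower-cased nouns across descriptions.

    Each description is a list of (token, tag) pairs produced by some
    part-of-speech tagger (not shown here)."""
    counts = Counter()
    for tagged in tagged_descriptions:
        for token, tag in tagged:
            if tag in NOUN_TAGS:
                counts[token.lower()] += 1
    return counts


# Invented, pre-tagged sample descriptions.
sample_descriptions = [
    [("Psychology", "NN"), ("is", "VBZ"), ("fun", "NN")],
    [("Cook", "VB"), ("food", "NN"), ("fast", "RB")],
    [("Food", "NN"), ("facts", "NNS")],
]
```

The resulting counts can be passed to any word-cloud renderer as term weights.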
Results
Histogram with the most relevant labels
from videos
Percentages of knowledge areas with the
highest engagement
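
The speaker notes explain that engagement was measured by associating each video's knowledge areas with its number of likes. One way to turn that into percentages is sketched below. It assumes that a video with several labels contributes its full like count to each of them, which the source does not specify. The function name, dictionary keys, and sample numbers are invented for illustration.

```python
from collections import defaultdict


def engagement_share(videos):
    """Return the percentage of total likes attributed to each area.

    Assumption: a multi-label video adds its likes to every area it has."""
    likes_by_area = defaultdict(int)
    for video in videos:
        for area in video["areas"]:
            likes_by_area[area] += video["likes"]
    total = sum(likes_by_area.values())
    if total == 0:
        return {}
    return {area: 100.0 * likes / total
            for area, likes in likes_by_area.items()}


# Invented sample data.
sample_videos = [
    {"areas": ["medicine", "health"], "likes": 300},
    {"areas": ["chemistry"], "likes": 100},
    {"areas": ["medicine"], "likes": 100},
]
shares = engagement_share(sample_videos)
```

For the sample above, medicine receives 50.0 percent, health 37.5 percent, and chemistry 12.5 percent of the attributed likes.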
Conclusions
The proposed framework allows us to identify the main areas of
knowledge associated with educational videos on the TikTok
platform. This information could help direct efforts toward
knowledge areas that are important but not widely accepted or
that have few content creators.
We found more videos in Health Sciences and
STEM-related areas than in social sciences such as
law and education.
Thanks
cafiallos@espol.edu.ec
angel.fiallos@ieee.org
sgfigueroa@espe.edu.ec


Tiktok and Education-ICEDEG 2021.pptx


Editor's Notes

  1. TikTok is considered a social media platform like Twitter and Instagram. It has more than 800 million monthly active users, and each user has a social group of followers and of other users they follow. However, TikTok offers users a unique way to share creative videos of themselves (for example, dancing or lip-syncing), their surroundings, or a collection of external audiovisual content. The simplest videos consist only of text superimposed on a colored background. More complex videos include images, video clips, and sounds.
  2. TikTok launched the LearnOnTikTok program, which consists of educational videos to facilitate learning during COVID-19 lockdowns. These videos are authored by professionals from different areas, students, and other users, who have shared their knowledge with this social network's audiences. The videos linked to the hashtag #learnontiktok cover varied topics, from chemistry experiments, cooking recipes, health tips, and language learning to creating origami figures, all created by its users.
  3. This study aims to discover the types of knowledge and learnings shared in the #learnontiktok campaign, using a framework that integrates computer vision and audio recognition approaches to process the text and metadata that are part of the videos. It also includes processes for classifying the collected information into science categories using natural language processing and ML techniques.
  4. The pipeline starts with data collection from the TikTok platform. It continues with computer vision and audio recognition models that obtain text metadata from the video files. Finally, a trained multi-label text classification model uses the text metadata to predict the knowledge areas of the educational videos. I will now explain each step.
  5. We used scraping algorithms developed for this purpose. First, we searched with the hashtag #learnontiktok, which yielded a limited number of videos. From that result, we selected a sample of 1,495 TikTok posts. Next, we collected the posts' metadata, such as video file, post description, like count, date, view count, author information, and profile picture. The posts were published between June 2020 and January 2021.
  6. Azure Video Analyzer for Media is a cloud application that is part of Azure Applied AI Services. Its API extracts insights from videos using computer vision models and audio models for transcription. We also used an Azure Optical Character Recognition (OCR) service to extract text from specific images. Azure uses an internal algorithm to infer the correct string by correcting the mistakes introduced by individual OCR detections. The audio transcriptions and the text from video snippets were acquired for further processing.
  7. We used the images from the original videos that showed the users' faces and processed them via Microsoft's Azure Face API, which allowed us to infer gender and age and returns its results in JSON format. Once the process finished, we selected the photos in which the exposure value was greater than 0.5 and the gender and age properties could be detected.
  8. Following Kim's approach [14], we designed a simple CNN composed of an input layer with five different n-gram window sizes and one convolution layer on top of word vectors obtained from the Word2Vec unsupervised neural language model [15]. The training dataset consisted of text sentences from Wikipedia tagged in 20 specific areas, such as medicine, food and drink, legal, physics, and chemistry, among others. The network tries to predict a 0 or 1 value for every label, and the model uses the confidence values to produce a ranking.
  9. The Face API process was applied to the TikTok profile photos to recognize facial properties. The following results show the percentages of users by gender and by age range: the male gender and the 18-34 age group had the highest percentages. For the rest of the user profile photos, gender and age could not be identified because, among other reasons, they did not show the user's face or belonged to business profiles.
  10. The first figure shows a word cloud with the most relevant terms from the video descriptions written by the authors. Only terms identified as nouns by part-of-speech tagging libraries were selected for analysis. Some words, such as "psychology", "food", "life", "amazon", and "fun", could be identified, but they give only a weak idea of the topics of the educational videos. Then, the tags for the elements identified by the video indexer process were selected for each of the videos. Terms such as "person", "text", "indoor", "clothing", and "hair" can be identified. They relate to characteristics of the authors and the video background, but they also give only a weak idea of the topics and knowledge areas covered in the video.
  11. Next, we applied the multi-label classification model to the keywords obtained from the text snippets and from the audio transcriptions to assign knowledge areas to each video. The model returns a set of label probabilities, and the following knowledge areas were identified as having the highest counts. The histogram shows that Medicine, Food and Drink, Health, Cooking, Biology, and Chemistry, among others, are the most relevant categories. Finally, we associated the knowledge areas with the number of likes for each video in order to establish the categories that had the highest user engagement. Medicine, Food and Drink, Health, Chemistry, and Technology are the areas with the best engagement from the TikTok audience.
  12. The proposed framework allows us to identify automatically the main areas of knowledge associated with educational videos on the TikTok platform and which areas users prefer most. This information could help direct efforts toward knowledge areas that are important but not widely accepted or that have few content creators. In our sample, we found more videos in Health Sciences and STEM-related areas than in social sciences such as law and education, which gives an impression of the potential of this type of video for science learning. The study also confirmed that most authors are people under the age of 34, who also represent the largest audience on the social network. This study also supports the idea that the audio and text metadata available in short TikTok videos contain concepts that give a better understanding of the video learning topics than the descriptions written by the authors.