Talking blogs – an attempt to give weblogs a voice
Adding TTS functionality to Wordpress
by Arne Hellmich
Bielefeld University
Motivation
• connect two existing applications: a weblog publishing system and a
speech synthesizer
• add TTS functionality to weblogs to give multimodal access to
normally only written information
→ e.g. web accessibility for visually impaired or blind people
• in the age of MP3 players and podcasts, there is a high demand for
automatically generated audio content from written texts
→ people can listen to blog entries on their MP3 players
• "on the fly" generation of audio content is necessary because of the
rapidly changing content
→ prerecording is almost impossible
What is a blog?
• a blog (or weblog) is an online journal where an individual, group or
corporation presents a record of activities, thoughts or beliefs
• normally, blogs are frequently updated and contain diary-type
commentary and links to other blogs or websites
• blogs often reflect personal ideas and play an increasing role in the
spreading of news (cf. reports by bloggers on the hurricane that hit
New Orleans for example)
• some well-known weblog publishing systems are: Wordpress,
Serendipity, …
• the PHP-based Wordpress allows plugin development in PHP,
which makes the system easily adaptable
The talking blogs plugin
• "Plugins are cool bits of programming scripts that add additional
functionality to your blog."
• the talking blogs plugin adds speech functionality to the weblog
publishing system Wordpress
• it takes the RSS feed generated by Wordpress and uses its content
as input for the speech synthesizer SWIFT, developed by Cepstral,
LLC
• audio content is generated "on the fly" and an audio file is stored on
the webserver
• the generated audio file can then be played on the website
→ Wordpress got a voice!
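The first step, taking the entries from the RSS feed, can be sketched as follows. This is a minimal illustration in Python, not the plugin's PHP code; the sample feed, the function name `read_entries`, and the choice of the `<description>` element as the entry text are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real Wordpress feed carries many more elements.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <item>
      <title>First entry</title>
      <description>Hello world.</description>
    </item>
    <item>
      <title>Second entry</title>
      <description>More text.</description>
    </item>
  </channel>
</rss>"""


def read_entries(feed_xml):
    """Return a list of (title, text) pairs, one per <item> of an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    entries = []
    for item in root.iter("item"):
        # findtext returns the default when the child element is missing
        title = item.findtext("title", default="")
        text = item.findtext("description", default="")
        entries.append((title, text))
    return entries


for title, text in read_entries(SAMPLE_FEED):
    print(title, "|", text)
```

Each pair would then be handed to the synthesis step, with the title and the body available separately.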
Installation of the plugin
• prerequisites: a working Swift TTS engine and a directory with write
permissions for storage of audio files
• there is a special directory for plugins in Wordpress
• the files of the talking blogs plugin have to be put into a subdirectory
of the plugins directory of the Wordpress installation
• the plugin has to be enabled in the admin menu of Wordpress and
some manual modifications to the Wordpress template are needed
to put a link for the audio playback below each blog entry
The plugin's architecture
• the plugin consists of three files:
– talkingblogs.php: needed to enable the plugin in Wordpress
– config.php: configuration file
– tts.php: main file doing all the parsing and the synthesis
• the file config.php contains the default values for some important
variables used in the parsing process of the RSS feed:
$URL: defines the URL directing to the Wordpress installation
$SPEECHDIR: defines the directory where the audio files are stored
$DEFAULTVOICE: defines the default voice for the synthesis
$PAUSEAFTERTITLE: defines the length of the pause after the title
$PROSODYRATETITLE: defines the speaking rate of the title
$PROSODYPITCHTITLE: defines the pitch of the title
$PROSODYVOLUMETITLE: defines the volume of the title
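To make the role of these settings concrete, here is a small sketch of the same configuration as a Python data structure. It is an illustration only, not the contents of config.php: the field names mirror the variables above, the three prosody defaults are taken from the SSML example in this talk, and every other default value (URL, directory, voice name, pause length) is a hypothetical placeholder.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TalkingBlogsConfig:
    url: str = "http://blog.example.org/"    # $URL (placeholder value)
    speech_dir: str = "speech/"              # $SPEECHDIR (placeholder value)
    default_voice: str = "example_voice"     # $DEFAULTVOICE (placeholder value)
    pause_after_title: str = "500ms"         # $PAUSEAFTERTITLE (placeholder value)
    prosody_rate_title: str = "+50%"         # $PROSODYRATETITLE
    prosody_pitch_title: str = "medium"      # $PROSODYPITCHTITLE
    prosody_volume_title: str = "+100%"      # $PROSODYVOLUMETITLE


defaults = TalkingBlogsConfig()
# Override a single setting while keeping the other defaults
slower = replace(defaults, prosody_rate_title="-10%")
print(defaults.prosody_rate_title, slower.prosody_rate_title)
```

Keeping the values in one immutable object means the parsing code only reads them and never changes them by accident.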
The plugin's architecture II
• the main file tts.php can be divided into three main parts:
– first: the text of the blog entry is read from the RSS feed
– second: preprocessing of the text is done using regular expressions to
improve synthesis results:
e.g. 25$ is converted to 25 dollars, "emoticons" are deleted, …
– third: generation of the actual audio file using the TTS engine SWIFT;
the generated audio file is stored in a special directory so that the text
has to be synthesized only once
• tts.php is called each time a user clicks on a "speak entry" link
below one of the blog entries; if the audio file for the entry exists it is
just played back, otherwise it is generated "on the fly"
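The preprocessing and the "synthesize only once" behaviour described above can be sketched as follows. This is a Python illustration under stated assumptions, not the code of tts.php: the two regular expressions, the file naming scheme `entry-<id>.wav`, and the function names are hypothetical, and the actual call to the SWIFT engine is left out and replaced by a `synthesize` callable that the caller supplies.

```python
import re
from pathlib import Path

# Assumed patterns; the plugin's real expressions are not shown in the talk.
DOLLAR = re.compile(r"(\d+(?:\.\d+)?)\s*\$")
EMOTICON = re.compile(r"(?<!\S)[:;]-?[()DPp](?!\S)")


def preprocess(text):
    """Rewrite text so that it reads better when synthesized."""
    text = DOLLAR.sub(r"\1 dollars", text)   # "25$" becomes "25 dollars"
    text = EMOTICON.sub("", text)            # drop simple emoticons such as ":-)"
    return re.sub(r"\s+", " ", text).strip() # collapse leftover whitespace


def audio_path(speech_dir, entry_id):
    """Hypothetical naming scheme: one WAV file per blog entry."""
    return Path(speech_dir) / f"entry-{entry_id}.wav"


def speak_entry(entry_id, text, speech_dir, synthesize):
    """Return the audio file for an entry, synthesizing it only if it is missing.

    `synthesize(text, target_path)` stands in for the TTS engine call.
    """
    target = audio_path(speech_dir, entry_id)
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        synthesize(preprocess(text), target)
    return target


print(preprocess("It costs 25$ :-) today"))
```

A second request for the same entry finds the file already in place and skips the synthesis step, which is the caching effect the slide describes.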
Using SSML to add prosodic information to the blog entry's title
• SWIFT has an SSML mode in which it can parse SSML tags
• SSML tags can be used to add prosodic information as well as for
changing the default voice used by SWIFT
• at the current stage of development it is only possible to add
prosodic information like pitch and volume to the title by specifying
the necessary information in the config file of the talkingblogs plugin
• valid SSML tags containing the prosodic information are added to
the title of each blog entry when the RSS feed is parsed, e.g.:
<prosody rate="+50%" volume="+100%" pitch="medium">Title</prosody>
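Wrapping a title in such a tag can be sketched in a few lines. The Python function below is an illustration, not the plugin's PHP code; the name `wrap_title` is hypothetical, and its default values are simply the ones from the example above. Escaping the title is an added precaution so that characters like `&` do not break the SSML.

```python
from xml.sax.saxutils import escape, quoteattr


def wrap_title(title, rate="+50%", volume="+100%", pitch="medium"):
    """Wrap a blog entry title in an SSML <prosody> element."""
    return (
        f"<prosody rate={quoteattr(rate)} volume={quoteattr(volume)} "
        f"pitch={quoteattr(pitch)}>{escape(title)}</prosody>"
    )


print(wrap_title("Title"))
```

In the plugin, the three values would come from the configuration file rather than from function defaults.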
Using SSML to switch voices within one blog entry
• it is possible to switch voices in one blog entry by adding an SSML
tag named <voice>
→ challenge: How to add SSML tags to the HTML source code
produced by Wordpress without producing invalid
HTML code?
→ solution: use of the "invisible" HTML tag <span>
• the HTML tag <span> can have different attributes such as id and lang
• the following piece of HTML code switches voices:
Text. <span id="new_voice">Some more text.</span>
• the RSS feed contains the invisible <span> tag and when it is
parsed by the plugin, regular expressions convert it to the following
correct text string containing the SSML tag:
Text. <voice name="new_voice">Some more text.</voice>
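The conversion from <span> to <voice> can be sketched with a single regular expression. This Python version is an illustration only; the plugin's actual PHP expression is not shown in the talk, and the pattern and the function name `spans_to_voice_tags` are assumptions. The sketch handles a <span> whose only attribute is id and does not deal with nested spans.

```python
import re

# Matches <span id="...">...</span>, capturing the id and the enclosed text
SPAN_VOICE = re.compile(r'<span\s+id="([^"]+)"\s*>(.*?)</span>', re.DOTALL)


def spans_to_voice_tags(html):
    """Replace each <span id="X">...</span> with <voice name="X">...</voice>."""
    return SPAN_VOICE.sub(r'<voice name="\1">\2</voice>', html)


print(spans_to_voice_tags('Text. <span id="new_voice">Some more text.</span>'))
```

Text outside a matching <span> passes through unchanged, so entries without voice switches are not affected.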
Future work
• changing "on demand" audio generation to "on publish" audio
generation, meaning that an audio file is generated whenever a new
blog entry is written
→ advantages: no delay when playing back the audio file on the
website; the generation process is less error-prone because it is
guaranteed to complete before the entry is played back
• adding new means of audio distribution
→ including a link to the audio file in the RSS feed, which enables
users to download the audio file directly with their feed readers
• adding some kind of compression mechanism to get smaller audio
files
→ MP3 instead of uncompressed WAV
For additional information visit:
http://www.talkingblogs.de
Thank you for your attention!