Modelling social Web applications via tinydb
Faculty of Computer Science,
“Al.I. Cuza” University of Iași,
16, G-ral Berthelot Street,
700483 Iași, Romania
Abstract. This paper reports on the possibilities of data modelling for social Web applications
and of user interaction with them by using URL-shortening Web services such as tinydb. Because
such a service transforms a long URL into a very short one and can send structured data
along with the URL, it can radically change the appearance and composition of current messages.
However, some essential problems with these services have been identified, and they need to be
addressed before short URLs can be used safely.
Key words: tinydb, social Web, semantic Web, Twitter, model, REST, interaction
Microblogging is a relatively new phenomenon defined as “a form of multimedia blogging that allows users
to send brief text updates or micromedia such as photos or audio clips and publish them, either to be
viewed by anyone or by a restricted group which can be chosen by the user.” Microblogging tools provide
a light-weight, easy form of communication that enables users to broadcast and share information about
their activities, opinions and status. However, the existing microblogging services are still centralised and
confined, and efforts are being made to make microblogging an integral part of the Social Semantic
Web.
Modelling the necessary data for this type of Web application whilst maintaining a high standard for
a rich user experience is not an easy task. We hereby describe some steps that have been taken
towards the integration of microblogging into the semantic Web and towards access to data in a RESTful
manner by using URL-shortening Web services.
The report is structured as follows: in section two, the tinydb Web service is presented, whilst in the
third section we describe a case study of using this Web service for one social Web application. Finally,
the issues arising from the use of URL-shortening services are discussed.
Tinydb is a Web service that offers the possibility of transmitting structured data along with a short URL. The
data can be associated with a URL address through either GET or POST HTTP request parameters. For example,
the request in Fig. 1 would return a short URL, http://tinydb.org/11mK, which, when accessed, would
make it possible to retrieve the original data.
Fig. 1: Writing the data as parameters of an HTTP request
There is one restriction regarding the names of the parameters: they are not allowed to start with an
underscore ('_'), because the Web service itself uses a number of parameters whose names start in
this manner. These parameters are _url, _f, _c and _tinydb_id. If present, the _url parameter causes
a redirection to its value instead of displaying the actual data stored in the tinydb URL.
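As a minimal sketch of the request described above, the following Python snippet assembles a GET request URL that carries user data as parameters and rejects names that start with an underscore. The endpoint `http://tinydb.org/`, the helper name `build_store_url` and the spelling `_url` for the reserved redirect parameter are assumptions made for illustration; the request in Fig. 1 is not reproduced here, so the real endpoint may differ.

```python
from urllib.parse import urlencode

# Assumed endpoint; the request shown in Fig. 1 is not available in this text.
TINYDB_ENDPOINT = "http://tinydb.org/"


def build_store_url(data, redirect_url=None):
    """Build a GET URL that submits `data` (a dict of str -> str) to the service."""
    for name in data:
        # User-chosen names may not start with an underscore, because that
        # prefix is reserved for the service's own parameters.
        if name.startswith("_"):
            raise ValueError(f"parameter name {name!r} must not start with '_'")
    params = dict(data)
    if redirect_url is not None:
        # Reserved parameter (assumed spelling `_url`): the redirect target.
        params["_url"] = redirect_url
    return TINYDB_ENDPOINT + "?" + urlencode(params)


request_url = build_store_url(
    {"title": "Hello", "body": "A longer message"},
    redirect_url="http://example.org/",
)
print(request_url)
```

The same parameters could equally be sent in the body of a POST request; only the URL construction is shown because it needs no network access.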
The data associated with the parameters can be retrieved by accessing the URL address obtained in
response to the request. There are multiple representations of the created information resource that can
be retrieved in a RESTful manner by the user or another application:
– _f=json - returns a JSON object containing the data.
– [...] tinydbCallback, to be called once the script is loaded.
– [...] callback) to be called once the script is loaded.
– _f=xml - returns the data in XML.
The thus retrieved information may be used by other applications as it is, or for the creation of
For example, Fig. 2 includes the data requested via the http://tinydb.org/11mK?_f=xml URL. If no
explicit format had been requested, accessing the tinydb URL would have resulted in a
redirection to the URL specified by the value of the _url parameter.
Fig. 2: Retrieving the data as XML
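The retrieval step can be sketched in the same way. The snippet below only builds the URL for a requested representation and decodes a JSON payload; the helper name `representation_url`, the parameter spelling `_f` and the sample payload are assumptions, since the exact response produced by the service is not shown in the text.

```python
import json
from urllib.parse import urlencode

# The two format values that the text names explicitly.
KNOWN_FORMATS = {"json", "xml"}


def representation_url(short_url, fmt):
    """Return the URL that asks for `short_url` in the given representation."""
    if fmt not in KNOWN_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    return short_url + "?" + urlencode({"_f": fmt})


xml_url = representation_url("http://tinydb.org/11mK", "xml")
json_url = representation_url("http://tinydb.org/11mK", "json")
print(xml_url)
print(json_url)

# Hypothetical JSON payload, used only to show how a client would decode it.
sample_response = '{"title": "Hello", "body": "A longer message"}'
stored = json.loads(sample_response)
print(stored["title"])
```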
The simplicity and ease of use of this Web service in storing and retrieving structured data make it
a viable option for its inclusion in large projects, especially in which data storage capabilities represent
In the next section, we analyse the modelling of the data and the interaction of the user with
one of the most intensively used social Web applications today, Twitter.
Twitter is a very popular social Web application which allows users to send and read short pieces
of text, known as tweets. The application has grown significantly and very quickly since its launch in
2006, with an estimated 20 million users in 2009.
The very short length of the messages was initially established to make it possible to send and receive
tweets via the Short Message Service (SMS). As a result, the limit of 140 characters has introduced SMS-specific
slang and shorthand notation into the Web. Web services such as tinydb can therefore greatly
influence the content of the messages sent: by using such a service, Twitter users can surpass
the 140-character limit and share practically unlimited amounts of data.
Searches on this system make use of hashtags, which are words or phrases prefixed with a #. A search
for “Web” would find all messages that include #Web. Similarly, the @ sign followed by a username allows
users to send messages directly to each other, although the message is still readable by anyone.
One effective use of the tinydb Web service in the Twitter application is to share long and cumbersome
URLs, which can take more than 140 characters. For example, the Google link in Fig. 3 comprises 214
characters, so the whole link would not fit in a single tweet. Using tinydb, however, would
consume only 22 characters to obtain the same result, which leaves sufficient space for some
other short message, be it text or another URL. Therefore, the association of the desired link with one
from tinydb is beneficial to the user, since more than 13% of the existing tweets contain some URL.
Fig. 3: Example of a 214-character long Google link
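The character budget discussed above can be checked with a few lines of Python. This is only an illustrative sketch: the helper `remaining_characters` is a hypothetical name, and `long_url` is a made-up 214-character stand-in for the Google link in Fig. 3, which is not reproduced in the text.

```python
TWEET_LIMIT = 140  # maximum tweet length stated in the text


def remaining_characters(*parts, limit=TWEET_LIMIT):
    """Characters left in a tweet after joining `parts` with single spaces."""
    return limit - len(" ".join(parts))


short_url = "http://tinydb.org/11mK"
# Hypothetical stand-in with the same length as the link in Fig. 3.
long_url = "http://www.google.com/search?" + "q" * 185

print(len(short_url), remaining_characters(short_url))
print(len(long_url), remaining_characters(long_url))
```

A negative result means the text does not fit: the 214-character link overshoots the limit by 74 characters, whereas the short URL leaves 118 characters for the rest of the message.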
Furthermore, the fact that tinydb is able to store many long texts as parameters gives users the
possibility of publishing extensive messages. Since the HTTP protocol does not place any a priori limit on
the length of a request parameter, users have no imposed length limit other than the maximum admitted
by the server. An advantage of having more space for writing messages is that the concepts used can be
properly tagged. Instead of the plain hashtags Twitter offers, users can write richer annotations from which
a more powerful processor can extract and define URIs. For instance, instead of writing “Climbing
the #Statue_of_Liberty in #New_York”, someone could microblog “Climbing the #dbp:Statue_of_Liberty
in #geo:New_York”. The processor would then be able to extract the hashtags and send
queries to DBpedia and GeoNames to retrieve the URIs of the related concepts. Thus, the tweets would
be automatically connected to existing URIs rather than to meaningless text strings.
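The extraction step performed by such a processor might look like the sketch below. It assumes that multi-word labels are joined with underscores (for example `#dbp:Statue_of_Liberty`) and that the prefixes `dbp` and `geo` select DBpedia and GeoNames respectively; the names `PREFIX_SERVICES` and `extract_tags` are hypothetical, and the actual queries to the two services are left out.

```python
import re

# Assumed mapping from tag prefix to the lookup service named in the text.
PREFIX_SERVICES = {"dbp": "DBpedia", "geo": "GeoNames"}

# A hashtag with an optional "prefix:" part, e.g. #geo:New_York or #Web.
TAG_PATTERN = re.compile(r"#(?:(\w+):)?(\w+)")


def extract_tags(message):
    """Return (service, label) pairs; service is None for plain hashtags."""
    tags = []
    for prefix, label in TAG_PATTERN.findall(message):
        service = PREFIX_SERVICES.get(prefix)
        tags.append((service, label.replace("_", " ")))
    return tags


tags = extract_tags("Climbing the #dbp:Statue_of_Liberty in #geo:New_York")
print(tags)
print(extract_tags("Reading about the #Web"))
```

Each pair tells the processor which service to query and with which label; a plain hashtag carries no service and stays an uninterpreted string.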
Moreover, the associated data may be described by using metadata, such as vocabularies from the
social semantic Web, e.g. FOAF (Friend of a Friend) and SIOC (Semantically-Interlinked Online
Communities). The former is used to model the microbloggers and their properties (e.g., name and e-mail) and
to reuse their URIs from existing Web 2.0 services instead of creating new ones every time. The latter is used
to define related user contexts, providing a way to identify a user account on a given microblogging
service. Given the strong connection between FOAF and SIOC, SPARQL queries can give access to
information that was previously unavailable.
4 Current issues
Even though the tinydb Web service provides short URLs as aliases for longer ones or for large amounts of
text, it should be noted that the more the service is used, the longer the tinydb URLs will get. With
n the size of the alphabet and k the maximum length of the desired code, the total number of short pages
with codes of length at most k that can be created is given by equation (1), according to the geometric
series formula:
\[ \sum_{i=1}^{k} n^i = \frac{n(n^k - 1)}{n - 1} \tag{1} \]
The currently used alphabet of 10 digits and 52 letters (both upper and lowercase English letters)
creates a base-62 numeral system. With at most four characters per key, the service can encode roughly
15 million URLs. Extending the key to five characters increases the possibilities to almost
one billion. Although these are large numbers, they will probably prove insufficient.
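The figures above follow directly from the geometric-series sum in equation (1). As a quick check, the sketch below evaluates both the explicit sum and the closed form for n = 62; the function names are illustrative only, and the closed form assumes n > 1.

```python
def capacity_by_sum(n, k):
    """Number of codes of length 1..k over an alphabet of n symbols."""
    return sum(n ** i for i in range(1, k + 1))


def capacity_closed_form(n, k):
    """Closed form n(n^k - 1)/(n - 1) of the same sum, valid for n > 1."""
    return n * (n ** k - 1) // (n - 1)


ALPHABET_SIZE = 62  # 10 digits + 26 lowercase + 26 uppercase letters

for max_length in (4, 5):
    total = capacity_by_sum(ALPHABET_SIZE, max_length)
    # Both formulations must agree for every key length.
    assert total == capacity_closed_form(ALPHABET_SIZE, max_length)
    print(max_length, total)
```

For keys of at most four characters this gives 15,018,570 codes, and for at most five characters 931,151,402.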
One drawback of the tinydb Web service is that it introduces no privacy or security features, so
the submitted data is entirely accessible to the general public. However, this issue can be easily overcome
by using an encryption algorithm and submitting only the encrypted text. The ciphertext would
then be available to everyone, but readable only by those who possess the correct decryption key.
Furthermore, a user cannot know in advance towards which URL a short link will redirect. Many phishing,
spamming, shock and affiliate URLs are recorded on URL-shortening sites, which can compromise the users'
safety. Nonetheless, this type of problematic content can be countered by filtering or by presenting the
link instead of redirecting automatically. However, the case of hacking, when the URL is changed intentionally
by someone else, can vex and expose the service's users.
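The two countermeasures mentioned above, filtering and showing the link instead of redirecting, can be sketched as follows. The blocklist entries, the function names and the example URLs are all hypothetical, and the target URL is assumed to have been looked up already.

```python
from urllib.parse import urlparse

# Hypothetical blocklist; a real deployment would use a maintained list.
BLOCKED_HOSTS = {"phishing.example", "spam.example"}


def is_blocked(target_url, blocked_hosts=BLOCKED_HOSTS):
    """True if the target's host is a blocked host or one of its subdomains."""
    host = (urlparse(target_url).hostname or "").lower()
    return any(host == b or host.endswith("." + b) for b in blocked_hosts)


def describe_link(short_url, target_url):
    """Show where a short URL leads instead of redirecting automatically."""
    if is_blocked(target_url):
        return f"{short_url} -> blocked target"
    return f"{short_url} -> {target_url}"


print(describe_link("http://tinydb.org/11mK", "http://example.org/page"))
print(describe_link("http://tinydb.org/11mK", "http://login.phishing.example/reset"))
```

A check of this kind does not help when the stored target is changed after the link has been shared, which is the hacking case noted above.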
Moreover, the URLs that this type of service stores are prone to link rot. Gomes and Silva
concluded in one study that the lifetime of contents follows a logarithmic distribution with an estimated
half-life of only two days, which is continuously decreasing.
Another issue is that using such a service adds one level of indirection to each access. Every
access requires at least one additional DNS lookup and one additional HTTP request, which increases the
number of necessary requests and lengthens the waiting time.
In this report we have presented the possibility of modelling the data in social Web applications via
URL-shortening services and how this affects the user's interaction with those applications.
Having a shortened URL with associated structured data brings many advantages when it comes
to the information content that can be sent. On the one hand, users are able to communicate in longer
texts and thus express their emotional state or activities more fully. On the other hand, it
is now possible to include appropriate metadata in the messages. Including widely used ontologies and
databases enables the creation of an enriched semantic Web.
However, the large number of examples of inappropriately used tinydb short URLs calls the
utility of the service into question. The fact that short URLs are not entirely safe at the moment may delay their
adoption on a large scale. More security measures need to be developed before this type of
data modelling can become a successful step towards a social semantic Web.
1. Breslin, J.G., Decker, S.: Semantic web 2.0: Creating social semantic information spaces. In: Tutorial in the
15th International World Wide Web Conference (WWW 2006). (May 2006)
2. Pontin, J.: From many tweets, one loud voice on the internet. The New York Times (22 April 2007)
3. Kazeniac, A.: Social networks: Facebook takes over top spot, Twitter climbs.
http://blog.compete.com/2009/02/09/facebook-myspace-twitter-social-network/ (9 February 2009)
4. Java, A., Song, X., Finin, T., Tseng, B.: Why we twitter: understanding microblogging usage and communities.
In: WebKDD/SNA-KDD ’07: Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 workshop on Web
mining and social network analysis, New York, NY, USA, ACM Press (2007) pp. 56–65
5. Passant, A., Hastrup, T., Bojars, U., Breslin, J.: Microblogging: A semantic web and distributed approach.
In: Proceedings of the 4th Workshop on Scripting for the Semantic Web. (2008)
6. Gomes, D., Silva, M.J.: Modelling information persistence on the web. In: Proceedings of the 6th International
Conference on Web Engineering, New York, NY, USA, ACM Press (11-16 July 2006) pp. 193–200