Modelling social Web applications via tinydb


Published in: Technology, Design
Claudiu Mihăilă
Faculty of Computer Science, "Al.I. Cuza" University of Iași, 16, G-ral Berthelot Street, 700483 Iași, Romania

Abstract. This paper reports on the possibilities of data modelling for social Web applications, and of user interaction with them, by using URL-shortening Web services such as tinydb. Because such a service transforms a long URL into a very short one and can send structured data along with the URL, it can radically change the appearance and composition of current messages. However, some essential problems regarding these services have been identified, which need to be addressed before short URLs can be used safely.

Key words: tinydb, social Web, semantic Web, Twitter, model, REST, interaction

1 Introduction

Microblogging is a relatively new phenomenon defined as "a form of multimedia blogging that allows users to send brief text updates or micromedia such as photos or audio clips and publish them, either to be viewed by anyone or by a restricted group which can be chosen by the user." Microblogging tools provide a lightweight, easy form of communication that enables users to broadcast and share information about their activities, opinions and status.

However, the existing microblogging services are still centralised and confined, and efforts are being made to make microblogging an integral part of the Social Semantic Web [1]. Modelling the necessary data for this type of Web application whilst maintaining a high standard for a rich user experience is not an easy task. We describe here some steps that have been taken towards the integration of microblogging into the semantic Web and towards RESTful access to data by using URL-shortening Web services.
The report is structured as follows: section two presents the tinydb Web service, whilst section three describes a case study of using this Web service for one social Web application. Finally, the issues arising from the use of URL-shortening services are discussed.

2 Tinydb

Tinydb is a Web service that offers the possibility of transmitting structured data along with a short URL. The data can be associated with a URL by means of either GET or POST HTTP request parameters. For example, the request in Fig. 1 would return a short URL which, when accessed, would make possible the retrieval of the original data.

    diu.mihaila&name=Claudiu Mihaila&course=WADe

Fig. 1: Writing the data as parameters of an HTTP request

There is one restriction regarding the names of the parameters: they are not allowed to start with an underscore ('_'), because the Web service reserves a number of parameters whose names start in this manner. These parameters are _url, _f, _c and _tinydb_id. If present, the _url parameter causes a redirection to its value instead of displaying the actual data stored in the tinydb URL.

The data associated with the parameters can be retrieved by accessing the URL obtained in response to the request. There are multiple representations of the created information resource that can be retrieved in a RESTful manner by the user or by another application:

– _f=json: returns a JSON object containing the data.
– _f=jsonp: returns the JSON object wrapped in JavaScript and passed directly to a callback function, tinydbCallback, which is called once the script is loaded.
– _f=js: returns the JSON object wrapped in JavaScript, with an optional callback function (&_c=callback) to be called once the script is loaded.
– _f=xml: returns the data in XML.

The information retrieved in this way may be used by other applications as it is, or for the creation of mash-ups. For example, Fig. 2 shows the data requested via the _f=xml URL. Had no explicit format been requested, accessing the tinydb URL would have resulted in a redirection to the URL specified by the value of the _url parameter.

    <xml_data>
      <url></url>
      <name>Claudiu Mihaila</name>
      <course>WADe</course>
      <tinydb_id>11mK</tinydb_id>
      <created>2009-10-03 15:08:05.225510</created>
    </xml_data>

Fig. 2: Retrieving the data as XML

The simplicity and ease of use of this Web service in storing and retrieving structured data make it a viable option for inclusion in large projects, especially those in which data storage capabilities are an issue. In the next section, we analyse the modelling of the data and the interaction of the user with one of the most intensively used social Web applications today, Twitter.

3 Twitter

Twitter is a very popular social Web application which allows users to send and read short pieces of text, known as tweets [2].
This application has grown significantly and very quickly since its launch in 2006, with an estimated 20 million users in 2009 [3]. The very short length of the messages was initially established to make it possible to send and receive tweets via the Short Message Service (SMS). Thus, the limit of 140 characters has introduced SMS-specific slang and shorthand notation into the Web. The creation of Web services such as tinydb can therefore greatly influence the content of the sent messages. By using this service, Twitter users can now surpass the 140-character limit and share practically unlimited amounts of data.

Searches on this system make use of hashtags, which are words or phrases prefixed with a #. A search for "Web" would find all messages that include #Web. Similarly, the @ sign followed by a username allows users to send messages directly to each other, although the message is still readable by anyone.

One effective use of the tinydb Web service in the Twitter application is to share long and cumbersome URLs, which can take more than 140 characters. For example, the Google link in Fig. 3 comprises 214 characters, and the whole link would not be allowed in a single tweet. However, using tinydb would consume only 22 characters to obtain the same result, which leaves sufficient space for some other short message, be it text or another URL. Replacing the desired link with a short one from tinydb is therefore beneficial to the user, since more than 13% of existing tweets contain a URL [4].

    umihaila%26target%3Dphoto%26id%3D5386208676112752274&service=lh2
    &ltmpl=gp&passive=true

Fig. 3: Example of a 214-character long Google link

Furthermore, the fact that tinydb is able to store many long texts as parameters gives users the possibility of publishing extensive messages. Since HTTP does not place any a priori limit on the length of a request parameter, users have no imposed length limit other than the maximum admitted by the server.

An advantage of having more space for writing messages is that the concepts used can be properly tagged. Instead of the plain hashtags Twitter offers, users can write prefixed tags from which a more powerful processor can extract and define URIs. For instance, instead of writing "Climbing the #Statue_of_Liberty in #New_York", someone could microblog "Climbing the #dbp:Statue_of_Liberty in #geo:New_York". The processor would then be able to extract the hashtags and send queries to DBpedia and GeoNames to retrieve the URIs of the related concepts. Thus, the tweets would be automatically connected to existing URIs rather than to meaningless text strings [5].

Moreover, the associated data may be described by using metadata, such as vocabularies from the social semantic Web, e.g. FOAF (Friend of a Friend) and SIOC (Semantically-Interlinked Online Communities). The former is used to model the microbloggers and their properties (e.g., name and e-mail) and to reuse their URIs from existing Web 2.0 services instead of creating new ones every time. The latter is used to define related user contexts, providing a way to identify a user account on a given microblogging service. Given the strong connection between FOAF and SIOC, SPARQL queries give people access to information that was previously unavailable.
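The tag-extraction step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: multi-word tags are taken to be joined with underscores, the prefixes dbp and geo are the two mentioned in the text, and the names PREFIX_SOURCES and extract_tags are hypothetical. The sketch stops at identifying which knowledge base each tag refers to; the actual lookup of URIs in DBpedia or GeoNames is not shown.

```python
import re

# Assumed mapping from tag prefix to knowledge base; only the prefixes
# "dbp" and "geo" appear in the text, the table itself is illustrative.
PREFIX_SOURCES = {"dbp": "DBpedia", "geo": "GeoNames"}

# A hashtag is '#', an optional "prefix:" part, then the label.
TAG_PATTERN = re.compile(r"#(?:(\w+):)?(\w+)")


def extract_tags(message):
    """Return a list of (source, label) pairs for every hashtag in message.

    source is None for plain hashtags or unknown prefixes.
    Underscores in the label are turned back into spaces.
    """
    tags = []
    for prefix, label in TAG_PATTERN.findall(message):
        source = PREFIX_SOURCES.get(prefix)
        tags.append((source, label.replace("_", " ")))
    return tags


if __name__ == "__main__":
    print(extract_tags("Climbing the #dbp:Statue_of_Liberty in #geo:New_York"))
```

A plain hashtag such as #Web yields a pair whose source is None, so a processor built on this sketch could fall back to ordinary keyword search for unprefixed tags.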
4 Current issues

Even though the tinydb Web service provides short URLs as aliases for longer ones or for large amounts of text, it should be noted that the more the service is used, the longer the tinydb URLs will get. Taking n as the size of the alphabet and k as the maximum length of a code, the total number of short pages with codes of length at most k is given by equation (1), the sum of a geometric progression.

\[
\sum_{i=1}^{k} n^{i} = \frac{n\,(n^{k} - 1)}{n - 1} \qquad (1)
\]

The currently used alphabet of 10 digits and 52 letters (both uppercase and lowercase English letters) forms a base-62 numeral system. With at most four characters per key, the service can encode about 15 million URLs. Extending the key to five characters increases the possibilities to almost one billion. Although these are large numbers, they will probably prove insufficient.

One drawback of the tinydb Web service is that it offers no privacy or security features, so the submitted data is entirely accessible to the general public. However, this issue can be overcome by encrypting the text and submitting only the ciphertext. The ciphertext would then be available to everyone, but readable only by those who possess the correct decryption key.

Furthermore, it is impossible to know in advance to which URL the user will be redirected. Many phishing, spam, shock and affiliate URLs are recorded on URL-shortening sites, which can compromise the users' security. This type of problematic content can be mitigated by filtering or by presenting the link instead of redirecting automatically. However, in the case of hacking, when the URL is changed intentionally by someone else, the service's users can still be harassed and exposed.
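Returning to the key-space estimate given with equation (1) earlier in this section, the figures for four and five characters can be checked with a short Python sketch. It assumes the 62-symbol alphabet described above; the function names are illustrative and not part of tinydb.

```python
def keyspace_size(alphabet_size, max_length):
    """Closed form of equation (1): n * (n**k - 1) / (n - 1), for n > 1."""
    n, k = alphabet_size, max_length
    # The division is exact because n**k - 1 is a multiple of n - 1.
    return n * (n**k - 1) // (n - 1)


def keyspace_size_by_sum(alphabet_size, max_length):
    """Left-hand side of equation (1): the explicit sum of n**i for i = 1..k."""
    return sum(alphabet_size**i for i in range(1, max_length + 1))


if __name__ == "__main__":
    for k in (4, 5):
        # Both forms must agree for every k.
        assert keyspace_size(62, k) == keyspace_size_by_sum(62, k)
        print(k, keyspace_size(62, k))
    # k = 4 gives 15018570 and k = 5 gives 931151402
```

With four characters the total is 15,018,570 codes, and with five it is 931,151,402, which matches the orders of magnitude quoted above.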
Moreover, the URLs that this type of service stores are prone to link rot. Gomes and Silva [6] concluded in one study that the lifetime of contents follows a logarithmic distribution with an estimated half-life of only two days, which is continuously decreasing.

Another issue is that using such a service adds one level of indirection to each access. Every access requires at least one additional DNS lookup and one additional HTTP request, which increases the number of necessary requests and lengthens the waiting time.

5 Conclusions

In this report we have presented the possibility of modelling the data in social Web applications via URL-shortening services, and how this affects the user's interaction with those applications. Having a shortened URL with associated structured data brings many advantages when it comes to the information content that can be sent. On the one hand, users are able to communicate in longer texts, gaining greater expressivity regarding their emotional state or activities. On the other hand, it is now possible to attach appropriate metadata to the messages. Including widely used ontologies and databases enables the creation of an enriched semantic Web.

However, the large number of examples of inappropriately used tinydb short URLs calls the utility of the service into question. The fact that short URLs are not entirely safe at the moment may delay their adoption on a large scale. More security measures need to be developed before this type of data modelling can become a successful step towards a social semantic Web.

References

1. Breslin, J.G., Decker, S.: Semantic web 2.0: Creating social semantic information spaces. In: Tutorial in the 15th International World Wide Web Conference (WWW 2006). (May 2006)
2. Pontin, J.: From many tweets, one loud voice on the internet. The New York Times (22 April 2007)
3. Kazeniac, A.: Social networks: Facebook takes over top spot, Twitter climbs. (9 February 2009)
4. Java, A., Song, X., Finin, T., Tseng, B.: Why we twitter: understanding microblogging usage and communities. In: WebKDD/SNA-KDD '07: Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 workshop on Web mining and social network analysis, New York, NY, USA, ACM Press (2007) pp. 56–65
5. Passant, A., Hastrup, T., Bojars, U., Breslin, J.: Microblogging: A semantic web and distributed approach. In: Proceedings of the 4th Workshop on Scripting for the Semantic Web. (2008)
6. Gomes, D., Silva, M.J.: Modelling information persistence on the web. In: Proceedings of the 6th International Conference on Web Engineering, New York, NY, USA, ACM Press (11-16 July 2006) pp. 193–200