Semantic Monitoring of Personal Web Activity to Support the Management of Trust and Privacy
Presentation at the SPOT 2010 workshop on Privacy and Trust on the Social and Semantic Web.

Published in Technology, Education
Transcript

  • 1. Semantic Monitoring of Personal Web Activity to Support the Management of Trust and Privacy
    Mathieu d'Aquin, Salman Elahi, Enrico Motta
    Knowledge Media Institute, The Open University, UK
  • 2. Stating the obvious
    Personal information exchange on the Web is
    Big
    Heterogeneous
    Distributed
    Fragmented
    Sometimes implicit
  • 3. Challenges to individuals
    Lack of control over personal information
    In sum, we don’t know the most important things about our personal data
    What are all the websites that know my e-mail address?
    What does amazon.co.uk or the website of my favorite airline know about me?
  • 4. Why this is important
    Because these things are useful to know in general
    Because these things can tell us a lot about our own behavior, our attitudes towards information sharing and exchange
    Because this behavior has strong implications in terms of privacy and defines our trust relationships with websites online
  • 5. So, what do we do?
    Unrestricted monitoring of an individual user’s information exchange on the Web
    Building semantically represented, processable datasets of what was shared and with whom
    Analyzing these datasets to build models of the user’s behavior related to privacy:
    levels of trust given to websites
    levels of criticality associated with different pieces of data
  • 6. Local Logging Proxy
    [Diagram: Local Web Agents (e.g., browser) and External Web Sites exchange HTTP Requests and HTTP Responses through the Local Logging Proxy. Other labels: Web Exchange, RDF Logs, Interaction Patterns, Personal Information, HTTP Ontology.]
  • 7. <Request rdf:about="#request-1257949232709-1257949233757">
    <startedAt>1257949232709</startedAt>
    <endedAt>1257949233757</endedAt>
    <origin rdf:resource="127.0.0.1" />
    <onPort>80</onPort>
    <toHost rdf:resource="api.facebook.com" />
    <method rdf:resource="POST"/>
    <toURL rdf:resource="http://api.facebook.com/restserver.php" />
    <HTTPVersion rdf:resource="HTTP-1.1" />
    <Host rdf:resource="api.facebook.com" />
    <Content-Type rdf:resource="application--x-www-form-urlencoded" />
    <User-Agent rdf:resource="Mozilla--5.0_(Macintosh;_U;_Intel_Mac_OS_X;_en)_AppleWebKit--526.9+_(KHTML._like_Gecko)_AdobeAIR--1.5.2" />
    <Referer rdf:resource="app:--TweetDeck.swf" />
    <X-Flash-Version rdf:resource="10.0.32.18" />
    <Accept rdf:resource="*--*" />
    <Accept-Language rdf:resource="en-us" />
    <Accept-Encoding rdf:resource="gzip._deflate" />
    <Cookie rdf:resource="__qca=1239783354-42963995-12118014;___utma=87286159.357565716.1239892196.1252686326.1257582307.16;___utmz=87286159.1257582307.16.16.utmccn=(referral)|utmcsr=facebook.com|utmcct=--tos.php|utmcmd=referral;_c_user=605559235;_cur_max_lag=2;_datr=1239398136-0711bf1215821a9c58848bf0ffd0020ec8450cfa7154b9e228c29;_lsd=P3Zpn;_lxe=metm.daquin%40virgin.net;_lxs=3;_s_vsn_facebookpoc_1=9874874320812" />
    <Content-Length rdf:resource="984" />
    <Connection rdf:resource="keep-alive" />
    <Proxy-Connection rdf:resource="keep-alive" />
    <data rdf:resource="data_c22b691f691dabd5ae893b9cb2f8add7" />
    <response>
    <Response rdf:about="#response-1257949232709--1257949233757">
    <HTTPVersion rdf:resource="HTTP--1.0" />
    <responseCode rdf:resource="200_OK" />
    <Cache-Control rdf:resource="private._no-store._no-cache._must-revalidate._post-check=0._pre-check=0" />
    <Content-Type rdf:resource="application--json" />
    <Expires rdf:resource="Mon._26_Jul_1997_05:00:00_GMT" />
    <Pragma rdf:resource="no-cache" />
    <Content-Encoding rdf:resource="gzip" />
    <Content-Length rdf:resource="5943" />
    <X-Cache rdf:resource="MISS_from_roeburn.open.ac.uk" />
    <Proxy-Connection rdf:resource="keep-alive" />
    <data rdf:resource="data_5ccf6054fd0fba3ee7eb444e178eaf19" />
    </Response></response>
    </Request>
    Running the logging proxy over a period of 2.5 months yielded around 100 million triples, representing about 3 million HTTP requests.
    The log encodes all the information related to HTTP requests and responses.
    The data sent and received is stored separately.
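
The value encoding visible in the sample log can be sketched as follows. This is a minimal illustration, not the authors' tool: the substitution rules ('/' becomes '--', ',' becomes '.', space becomes '_') are inferred from the sample above, and the function names `encode_value`, `request_id`, and `header_to_element` are hypothetical.

```python
from xml.sax.saxutils import quoteattr


def encode_value(value: str) -> str:
    """Encode a header value as the sample log appears to.

    Assumption inferred from the slide, not a documented rule:
    '/' -> '--', ',' -> '.', ' ' -> '_'.
    """
    return value.replace("/", "--").replace(",", ".").replace(" ", "_")


def request_id(started_at_ms: int, ended_at_ms: int) -> str:
    """Build an identifier shaped like '#request-<startedAt>-<endedAt>'."""
    return f"#request-{started_at_ms}-{ended_at_ms}"


def header_to_element(name: str, value: str) -> str:
    """Render one HTTP header as a property element with an rdf:resource value."""
    return f"<{name} rdf:resource={quoteattr(encode_value(value))} />"


if __name__ == "__main__":
    print(request_id(1257949232709, 1257949233757))
    print(header_to_element("Accept-Encoding", "gzip, deflate"))
```

Using the start and end timestamps in the identifier mirrors the sample, where the `rdf:about` value of the request repeats its `startedAt` and `endedAt` values.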
  • 8. Basic analytics
  • 9. Focusing on personal data exchange
    Extract information sent through parameters of HTTP Requests
    http://uk.search.yahoo.com/beacon/module?p=idiocracy&url=http%3A%2F%2Fwww.imdb.com%2Ftitle%2Ftt0387808%2F
    format=JSON&method=fql%2Emultiquery&api%5Fkey=51d350e8d92da1f5623512a9e801da2b&v
    =1%2E0&queries=%7B%22query2%22%3A%22SELECT%20app%5Fid%2C%20display%5Fname%20FROM
    %20application%20WHERE%20app%5Fid%20IN%20%28SELECT%20app%5Fid%20FROM%20%23query1
    %29%22%2C%22query1%22%3A%22SELECT%20post%5Fid%2C%20source%5Fid%2C%20created%5Fti
    me%2C%20updated%5Ftime%2C%20actor%5Fid%2C%20target%5Fid%2C%20app%5Fid%2C%20messa
    ge%2C%20attachment%2C%20comments%2C%20likes%2C%20permalink%2C%20attribution%2C%2
    0type%20FROM%20stream%20WHERE%20filter%5Fkey%20IN%20%28SELECT%20filter%5Fkey%20F
    ROM%20stream%5Ffilter%20WHERE%20uid%20%3D%20605559235%20AND%20type%20%3D%20%27ne
    wsfeed%27%29%20AND%20%28created%5Ftime%20%3E%3D%201257443596%29%20AND%20%28%28cr
    eated%5Ftime%20%3E%201257945423%29%20OR%20%28updated%5Ftime%20%21%3D%20created%5
    Ftime%29%29%20ORDER%20BY%20created%5Ftime%20DESC%20LIMIT%20200%22%7D&call%5Fid=1
    2565739074246102&sig=01a13a72825ed83ed6d23bdf2791ad1a&session%5Fkey=be312ffdf9b9
    e1a5ec6c5768%2D605559235
    Map this data onto a representation of a user profile (set of attributes of personal data)
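
Extracting the parameters of a request can be sketched with the Python standard library. This is an illustrative sketch, assuming the parameters arrive either in the URL query string or in a form-encoded body; the function name `extract_parameters` is hypothetical, and the short body string is an abbreviated fragment of the example above.

```python
from urllib.parse import parse_qsl, urlsplit


def extract_parameters(url: str, body: str = "") -> dict[str, list[str]]:
    """Collect decoded parameters from a URL query string and a form-encoded body."""
    params: dict[str, list[str]] = {}
    for source in (urlsplit(url).query, body):
        # parse_qsl percent-decodes both names and values.
        for key, value in parse_qsl(source, keep_blank_values=True):
            params.setdefault(key, []).append(value)
    return params


if __name__ == "__main__":
    url = ("http://uk.search.yahoo.com/beacon/module"
           "?p=idiocracy&url=http%3A%2F%2Fwww.imdb.com%2Ftitle%2Ftt0387808%2F")
    for name, values in extract_parameters(url).items():
        print(name, values)
```

Each decoded name and value pair is then a candidate for mapping onto an attribute of the user profile.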
  • 10. A tool is used to create mappings between the data sent to websites (from the logs, on the right) and the user profile (on the left), effectively reconstructing the profile from the data.
  • 11. What this tells us about Trust and Criticality of data
    36 attributes and 1,080 values, sent to 123 domains
    A model of which piece of personal information was sent where (it can answer the questions raised earlier, such as which websites know my e-mail address)
    Taking the point of view of an external observer, we can derive an observed model of trust and criticality of data
    If this piece of data is critical to you and you give it to Bob, you must trust Bob
    If you give this piece of data to many untrusted people, you probably don’t consider it critical
    The goal being to help the user to better understand his own behavior
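
A model of what was sent where can be queried in both directions. The sketch below is only an illustration under stated assumptions: observations are taken to be (domain, attribute, value) tuples, the function name `build_disclosure_index` is hypothetical, and the domains and values are made-up placeholders rather than data from the study.

```python
from collections import defaultdict


def build_disclosure_index(observations):
    """Index observations of the form (domain, attribute, value).

    Returns two mappings: attribute -> domains that received it,
    and domain -> attributes it received.
    """
    domains_by_attribute = defaultdict(set)
    attributes_by_domain = defaultdict(set)
    for domain, attribute, _value in observations:
        domains_by_attribute[attribute].add(domain)
        attributes_by_domain[domain].add(attribute)
    return domains_by_attribute, attributes_by_domain


if __name__ == "__main__":
    # Hypothetical observations, for illustration only.
    observations = [
        ("shop.example", "email", "me@mail.example"),
        ("shop.example", "postcode", "AB1 2CD"),
        ("airline.example", "email", "me@mail.example"),
    ]
    by_attribute, by_domain = build_disclosure_index(observations)
    print(sorted(by_attribute["email"]))      # which websites know my e-mail address?
    print(sorted(by_domain["shop.example"]))  # what does this website know about me?
```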
  • 12. The model formally
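    Trust in a domain = the maximum criticality of the data it received:
    $T(w) = \max_{d \in D(w)} C(d)$
    Criticality of a piece of data = 1 / (1 + the sum, over the websites that received the data, of (1 - trust in the website)):
    $C(d) = \frac{1}{1 + \sum_{w \in W(d)} \bigl(1 - T(w)\bigr)}$
    Here $D(w)$ is the set of data items received by domain $w$, and $W(d)$ is the set of websites that received data item $d$.
    These two formulas are interdependent, so they are treated as a sequence, computed iteratively from initial values of 0.5.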
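
The iteration over the two interdependent formulas can be sketched as below. This is a minimal sketch, not the authors' implementation. It assumes the criticality formula reads as 1 / (1 + sum of (1 - trust)), that trust is updated before criticality in each step, and that a fixed number of iterations is run; the function name `observed_model`, the iteration count, and the example domains are hypothetical.

```python
def observed_model(received, iterations=50, initial=0.5):
    """Iterate the observed trust and criticality model.

    received: dict mapping each domain to the set of data items it received.
    Returns (trust per domain, criticality per data item).
    """
    items = {d for data in received.values() for d in data}
    receivers = {d: [w for w, data in received.items() if d in data] for d in items}

    trust = {w: initial for w in received}
    criticality = {d: initial for d in items}

    for _ in range(iterations):
        # Trust in a domain: maximum criticality of the data it received.
        trust = {
            w: max((criticality[d] for d in data), default=initial)
            for w, data in received.items()
        }
        # Criticality of a data item: 1 / (1 + sum of distrust in its receivers).
        criticality = {
            d: 1.0 / (1.0 + sum(1.0 - trust[w] for w in ws))
            for d, ws in receivers.items()
        }
    return trust, criticality


if __name__ == "__main__":
    # Hypothetical exchange log, for illustration only.
    received = {
        "bank.example": {"card_number", "email"},
        "news.example": {"email"},
        "shop.example": {"email"},
    }
    trust, criticality = observed_model(received)
    print(trust)
    print(criticality)
```

In this example the e-mail address, sent to three domains, ends up less critical than the card number, sent to one, and the domain that received the card number ends up more trusted than the others.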
  • 13. Interacting with the model
    Expose the user to his own behavior as observed, so that he can try to align it with his intended behavior
  • 14. What we can do with this
    Help a user understand his own data exchange
    Compare websites and data in terms of the observed trust and criticality
    “Correct” the model by re-aligning it with the intended behavior
    Detect fundamental conflicts between the observed behavior and the intended behavior
    Observe correlations in the data
  • 15. Where that leads us
    One of the first tools exploiting logs of personal Web activity
    Demonstrates the need for better ways to approach personal information management as personal Web data exchange
    Need to exploit and integrate local and external sources of data together to create new mechanisms supporting individuals in interpreting, understanding and managing their information online
  • 16. Thank you
    m.daquin@open.ac.uk
    @mdaquin