g-Social - Enhancing e-Science Tools with Social Networking Functionality
Oct. 9, 2012
Presentation of "g-Social - Enhancing e-Science Tools with Social Networking Functionality" given at the Workshop on Analyzing and Improving Collaborative eScience with Social Networks, Chicago October 8th, 2012. Co-located with IEEE eScience 2012.
Fourth Paradigm of Scientific Exploration (J. Gray)
Source: J. Gray, talk to NRC/CSTB, “eScience - A Transformed Scientific
Method.” Mountain View CA, 11 January 2007.
• A thousand years ago: science was empirical
– describing natural phenomena
• Last few hundred years: theoretical branch
– using models, generalizations
• Last few decades: a computational branch
– simulating complex phenomena
• Today: data exploration (eScience)
– unify theory, experiment, and simulation
– Data captured by instruments or generated by simulator
– Processed by software
– Information/Knowledge stored in computer
– Scientist analyzes database / files using data management and statistics
– “Computational X” and “X-Informatics” 2009
The disappearance of Tenacious (28/1/2007)
Farallon Islands
Jim Gray
Manager of Microsoft Research's eScience Group.
1998 ACM Turing Award
The search for Tenacious (28/1/07 - 16/2/07)
• Night of 28/1: the USCG launched an airborne and seaborne SAR
operation for Tenacious
– The SAR lasted for nearly two weeks - no signs found
• 31/1: the scientific community mobilized to help the SAR mission using
online tools
– Computer scientists, oceanographers, engineers, volunteers, and Silicon Valley
power players [NASA’s JPL, Amazon, Microsoft, Oracle, US Navy, Monterey Bay Aquarium Research
Institute, SDSC, Cornell Theory Center, Purdue, UWisc, Singular, Canadian Space Agency, Digital Globe.]
• A blog was set up to coordinate efforts and share ideas. The main foci of the
effort were:
– Map the trajectory that Tenacious might have followed, in case Jim Gray
lost control of the boat - to help guide the SAR operation
– Discover clues about the presence of Tenacious at sea
– Map the trajectories of large vessels traveling in the area, that may have
collided with Tenacious
The USCG scoured 132,000 sq. miles of ocean
The search for Tenacious: online version
An exemplary e-Science application scenario
• A multidisciplinary virtual organization of people with a common goal
– Scientists, engineers, managers, officials, volunteers
• A variety of algorithms and software tools:
– Ocean-current models and simulators, image processing &
recognition, cellphone signal tracking and triangulation, data-format
transformation, data cleansing, satellite collection planning, data
mining, image geo-referencing
• A deluge of data (hundreds of GBs) retrieved over the net from various
sources, requiring processing and fusion to extract knowledge
– Satellite orbits, satellite imagery at different resolutions, multispectral
datasets, Web Databases, radio buoy and airborne sensors, HF radars, data
about offshore currents, Web cameras
• A federation of computing, networking and service infrastructures
– Grids, clusters, storage devices, crowd-sourcing services
Computing Grids
• e-Science motivated the development of Grid technologies and
Federated Computing Infrastructures during the last decade.
• The Grid vision by Foster, Kesselman, Tuecke [Grid 1.0]:
– Distributed computing infrastructures that enable
flexible, secure, coordinated resource sharing among dynamic collections of
individuals and institutions
– Enable communities (“Virtual Organizations”) to share geographically
distributed resources as they pursue common goals, in the absence of:
Homogeneity, Central location, Central control, Existing trust relationships
• The hype following the Grid:
– One of the sources of the impact of scientific and technological changes on
the economy and society [Jeremy Rifkin, “The European Dream,” Penguin
2004]
– The Grid has been described as the Next Generation Internet, the
implementation of the Global Computer etc.
Grid Infrastructure development
‣ Nowadays, Grid infrastructures comprise an impressive
collection of computational and software resources
‣ drawing an increasing number of users from various disciplines
Problem
• Collaboration is done externally to scientific
software environments
(email, web, portals, IM, etc.).
• Manual effort for transferring information
from one tool to another.
• Error prone and time consuming.
Lack of a unified, user-friendly software and
collaboration environment for scientists.
Current Solutions
General-Purpose OSN
Pros
• Professional Networking
• Minimal Collaboration Functionality
Cons
• External to existing scientific software environments – Web Based
• Do not support resource* sharing
Scientific OSN
Pros
• More immersive collaboration environment than Generic OSN.
• Resource sharing and ability to run experiments.
Cons
• Application Domain Specific.
• Proprietary infrastructures – High maintenance.
• Introduce additional information sources -> User Information overload
Our Solution
g-Eclipse (www.eclipse.org/geclipse)
• Integrated workbench framework
• Built on top of Eclipse (Extensible and community support)
• Toolset for users, operators & developers of Grid/Cloud infrastructures
(gLite, GRIA, Amazon AWS) – Middleware agnostic
• Rich functionality:
• Development & Deployment
• Benchmarking & Testing
• Workflow Programming
Online Social Networks
• Easy establishment and management of groups
• Automatic dissemination of notifications
• Professional Networking
• High Availability
g-Eclipse
[Screenshot of the g-Eclipse Workbench: Grid Project View, Information View, Authentication View, JSDL Editor View]
g-Social
Built on top of the g-Eclipse Framework
Aims to enable collaboration among scientists who use or will use g-Eclipse
Features
• Social Abstractions (Resources, Meta-data, Authentication).
• Definition of structured and standardized social meta-data
• Enrich social meta-data with links to project related resources.
• Access resources easily.
• Share project data and meta-data.
• Retrieve shared information.
• Seamless interaction with OSN.
• Facebook
• Twitter
• Extensible for other OSNs
[Figure: g-Social Work Cycle]
g-Social Abstractions
Enable seamless sharing and retrieval (via an OSN) of all particulars of the
research work performed in the context of a real scientific project.
Abstract a Scientific Collaborative Environment that utilizes Online Social
Networks.
Abstractions - Resources
Any file(s) related to the execution of
a Grid task specific to a scientific
project
• Input / Output Dataset
• Executable
• Source Code
• Documentation
• Publications
• …
Abstractions – Social Meta-data
Descriptive meta-data that tell the OSN
and its users the purpose and function
of each shared item
• Name
• Function
• Purpose
• Version
• Tags
• License
• ….
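The fields listed above could be held in a small record type. The following is only a minimal sketch: the class name, the choice of required fields, and the example values are assumptions made for illustration, not the g-Social schema.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class SocialMetadata:
    """Descriptive meta-data for one shared item (hypothetical layout)."""
    name: str
    function: str
    purpose: str
    version: str
    tags: List[str] = field(default_factory=list)
    license: str = ""

    def missing_fields(self) -> List[str]:
        """Return the names of required text fields that were left empty."""
        required = ("name", "function", "purpose", "version")
        return [f for f in required if not getattr(self, f).strip()]


# Hypothetical example values, chosen only to show the shape of a record.
meta = SocialMetadata(
    name="ocean-drift-sim",
    function="Simulates surface drift",
    purpose="Estimate vessel trajectory",
    version="1.0",
    tags=["oceanography", "simulation"],
    license="EPL-1.0",
)
record = asdict(meta)  # plain dict, ready to serialize for posting
```

Keeping the record as plain data makes it easy to serialize into whatever form a given OSN accepts.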
Abstractions – Authentication Manager
Enforces security and privacy control
of users while interacting with the
OSN
• Authorization / Authentication
against an OSN
• Monitor life-cycle of authentication
tokens
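Monitoring the life-cycle of tokens can be pictured as a store that hands out a token only while it is still valid. This is a sketch under assumptions: `AuthToken`, `TokenStore`, and the expiry-timestamp model are hypothetical and do not describe the actual g-Social Authentication Manager or any OSN API.

```python
import time
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class AuthToken:
    osn: str           # e.g. "facebook" or "twitter"
    value: str         # opaque token string issued by the OSN
    expires_at: float  # expiry as a Unix timestamp


class TokenStore:
    """Keeps one token per OSN and reports whether it is still usable."""

    def __init__(self) -> None:
        self._tokens: Dict[str, AuthToken] = {}

    def put(self, token: AuthToken) -> None:
        self._tokens[token.osn] = token

    def get_valid(self, osn: str, now: Optional[float] = None) -> Optional[AuthToken]:
        """Return the token for `osn`, or None if absent or expired.

        Expired tokens are dropped so the caller knows to re-authenticate.
        """
        now = time.time() if now is None else now
        token = self._tokens.get(osn)
        if token is None:
            return None
        if token.expires_at <= now:
            del self._tokens[osn]
            return None
        return token
```

A caller that receives `None` would start a fresh authentication round against the OSN before sharing or retrieving anything.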
Abstractions – Resource Manager
Resource sharing
• Interact with Authentication Manager
• Social meta-data
• Encapsulate the above in a form
accepted by an OSN
Resource Retrieval
• Extraction of published meta-data
• g-Eclipse Authentication Manager
invocation
• Resource access via g-Eclipse file
system
• Resource import in g-Eclipse workspace
Abstractions – OSN Interface
• OSNs are by design web-based
systems
• The OSN-gEclipse interface serves as an
intermediary between the web
browser and g-Eclipse.
• It invokes g-Eclipse when the user clicks
on a g-Social link inside an OSN.
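One way to picture the hand-off from browser to workbench is a link that carries the project name and resource handles, which a helper decodes before the import starts. The slides do not give the link format, so the `gsocial://` scheme, the query parameter names, and `parse_gsocial_link` below are all hypothetical.

```python
from urllib.parse import urlsplit, parse_qs


def parse_gsocial_link(link: str) -> dict:
    """Decode a hypothetical g-Social link into action, project and resources."""
    parts = urlsplit(link)
    if parts.scheme != "gsocial":
        raise ValueError(f"not a g-Social link: {link!r}")
    params = parse_qs(parts.query)  # also undoes percent-encoding
    return {
        "action": parts.netloc,                       # e.g. "import"
        "project": params.get("project", [""])[0],    # first value or empty
        "resources": params.get("resource", []),      # zero or more handles
    }


# Hypothetical link as it might appear in an OSN post.
link = "gsocial://import?project=demo&resource=gsiftp%3A%2F%2Fexample.org%2Fdata.nc"
info = parse_gsocial_link(link)
```

In this sketch the operating system or browser would be configured to pass such links to the workbench; how g-Social actually registers its handler is not covered here.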
g-Social Implementation
• The g-Eclipse Grid Project.
• A placeholder for the organization of
files/information related to the execution of
Grid/Cloud tasks
• Executables (local file system)
• Input / Output dataset (gLite, AWS)
• Documentation
• Publication (IEEE, ACM, Elsevier)
• Infrastructure Configurations
Implementation (Social Meta-Data Editor)
• Multi-Page GUI Editor
• Easy Insertion of social
meta-data
• Specify Location of
Resources
• XML content meta-data
• Extend Job Submission Definition
Language (JSDL) schema to include
social meta-data specification.
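As a rough illustration of social meta-data carried inside a JSDL document, the sketch below builds a job definition with an extra element from a separate namespace. The JSDL namespace URI and the `JobDefinition`, `JobDescription`, `JobIdentification` and `JobName` elements come from JSDL 1.0; the `urn:example:gsocial` namespace, the `SocialMetadata` element, and its placement are assumptions, since the slides do not show the extended schema.

```python
import xml.etree.ElementTree as ET

JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"  # JSDL 1.0 namespace
SOCIAL_NS = "urn:example:gsocial"                     # hypothetical extension namespace


def build_job_definition(job_name: str, social: dict) -> str:
    """Return a JSDL document with a hypothetical social meta-data block."""
    ET.register_namespace("jsdl", JSDL_NS)
    ET.register_namespace("social", SOCIAL_NS)
    root = ET.Element(f"{{{JSDL_NS}}}JobDefinition")
    desc = ET.SubElement(root, f"{{{JSDL_NS}}}JobDescription")
    ident = ET.SubElement(desc, f"{{{JSDL_NS}}}JobIdentification")
    ET.SubElement(ident, f"{{{JSDL_NS}}}JobName").text = job_name
    # Extension element in its own namespace (assumed placement).
    ext = ET.SubElement(desc, f"{{{SOCIAL_NS}}}SocialMetadata")
    for key, value in social.items():
        ET.SubElement(ext, f"{{{SOCIAL_NS}}}{key}").text = str(value)
    return ET.tostring(root, encoding="unicode")


xml_text = build_job_definition("drift-sim", {"Version": "1.0", "License": "EPL-1.0"})
```

Putting the social fields in their own namespace keeps the standard JSDL part readable by tools that know nothing about g-Social.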
g-Social View
[Screenshot of the g-Social View, with panels: Collaborators, Search for Shared Jobs, OSN Authentication, List of Shared Jobs, Share Job, View Job Details]
Implementation (g-Social View)
Authorization
• Authenticate / Authorize
against OSN
• Check auth of the underlying
storage infrastructure when
linking or retrieving a
resource
• Manage the life-cycle of
auth tokens
Implementation (g-Social View)
Share Job to OSN
• Share job details as defined
in meta-data editor
• Ask user to which OSN
details should be posted
• Parse social meta-data
• Encapsulate them in OSN
specific post formats.
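The last two steps, parsing the meta-data and wrapping it in an OSN-specific post, might look like the sketch below. The function name, the dictionary keys, and the post layouts are assumptions for illustration; the 140-character cap reflects the tweet length limit Twitter applied at the time of the talk, and the link is assumed to be much shorter than that cap.

```python
def format_post(osn: str, meta: dict, link: str) -> str:
    """Wrap parsed social meta-data in a post body for the chosen OSN (sketch)."""
    title = f"{meta['name']} v{meta['version']}"
    if osn == "twitter":
        limit = 140  # tweet length limit in 2012
        tags = " ".join("#" + t for t in meta.get("tags", []))
        text = f"{title} {tags}".strip()
        room = limit - len(link) - 1  # reserve one space plus the link
        if len(text) > room:
            text = text[: room - 3].rstrip() + "..."
        return f"{text} {link}"
    if osn == "facebook":
        # Longer form: title, purpose, then the link on its own line.
        return f"{title}\n{meta.get('purpose', '')}\n{link}"
    raise ValueError(f"unsupported OSN: {osn}")


# Hypothetical meta-data and link, used only to exercise the formatter.
meta = {
    "name": "ocean-drift-sim",
    "version": "1.0",
    "purpose": "Estimate vessel trajectory",
    "tags": ["oceanography"],
}
link = "gsocial://import?project=demo"
tweet = format_post("twitter", meta, link)
```

Supporting another OSN then amounts to adding one more branch, or one more formatter object, without touching the meta-data itself.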
Implementation (g-Social View)
View Shared Job Details
• Social Meta-data
• Name
• Description
• Version
• Resource Handles
• Download Resource
Conclusions & Future Work
Conclusions
g-Social enhances integrated e-Science Tools (g-Eclipse) with
Social Networking functionality. Specifically it:
• Enables the definition of social meta-data for sharing and
retrieval of information among scientists.
• Enriches meta-data with resource handles which might be
scattered in heterogeneous storage infrastructures.
• Provides mechanisms for sharing and retrieving scientific
information with just a few clicks.
Future Work
• Standardize social meta-data definition
• Support additional OSNs
• Recommendation System
• Release g-Social to Eclipse
Questions – Contact Information
Andriani Stylianou (andriani.stylianou@epfl.ch)
Nicholas Loulloudes (loulloudes.n@cs.ucy.ac.cy)
Marios D. Dikaiakos (mdd@cs.ucy.ac.cy)
http://grid.ucy.ac.cy