
The New e-Science (Bangalore Edition)



Keynote talk at IEEE e-Science Conference, Bangalore, December 2007 (the original PowerPoint 2007 version is available on

Published in: Technology, Education
  • Hi

    I agree that it isn't just the publishing cycle that motivates scientists - there are incentive models which vary from discipline to discipline (and even within disciplines). This is why in myExperiment we focused on attribution, and the question as to whether scientists will (like teenagers) share enough to generate network effects is what I often describe as 'the experiment that is myExperiment' (we have some confidence in this through Openwetware). The myExperiment project has social scientists on board for this very reason. We sometimes say 'scientists collaborate to compete' and in some fields - like drug discovery - it's 'first past the post' that matters. We also have instances of groups being created which then share data within them, and others of individuals creating data which they then gradually expose to more people as needed, building the group around the data. Some communities have a strong notion of hierarchy. So, lots of variations. Meanwhile Openwetware demonstrates a degree of sharing which many find surprising, and shows how behaviour changes in the digital context. We believe the only way to find out is to try it!

    Good point about where to draw the line between electronic enabled science and everyday science - when does science become e-Science? Maybe this is like asking when does learning become e-learning. I am happy to raise this discussion and invite suggestions from the community! I wonder if the answer is in terms of technologies or practice? Yes, a definition might help to scope and focus the community. BTW Bringing e-science through e.g. Microsoft office tools is an area of current work in a number of groups, for example the Microsoft eChemistry project.

    The 'Grid Problem' slide deliberately over-generalises by saying 'Grid is...'. More cautiously I could say 'users perceive that in many deployed grid solutions...' I did this to attract attention from the Grid community to my message. Even though there are many good research efforts which go some way to addressing some of the issues I describe, it is not the case that these are deployed and available to users today. So (a) I very strongly encourage the excellent work of everyone who is working hard addressing these issues, and (b) I am suggesting that 'building up' from the infrastructure to the users is one methodology, but looking at it from a user (and ecosystem) perspective too will help a lot - that's my message. The Grid community does not traditionally take this approach, which is why I am provoking attention to it.

    I agree OGSA-DAI is a simple API for access to data, and I have seen good examples of it being used to 'grid-enable' datasets. If people have good examples of mashups using it I would be interested to know about them. And we are in a mixed world - in one project I have seen grid being used to collect and process data and then Web 2.0 technologies used for ease of access to it by scientists (the lifecycle again).

    I totally agree that there are very many different types of users. The Web 2.0 methodology is one way of tackling exactly this. One of the points of the talk is to say that Web 2.0 is actually more than a set of technologies - it's a set of design patterns which transcend the immediate discussion and really say something about the relationship between technology and society as things become increasingly digital - these patterns underlie the talk. I also agree with you that 'participation' (both in content and development) is not unique to Web 2.0 - in fact Web 1.0 took off for exactly that reason (we didn't all download xmosaic and access one server, we also downloaded httpd and became publishers - the network effect).

    Many thanks for your support for the vision and your constructive comments.

    -- Dave
  • This is a really interesting presentation that puts eScience in the context of the wider scientific/publishing cycle. It is, however, questionable whether the publishing cycle is the 'only' thing that motivates scientists. For instance, in many cases, a scientist would not be interested in sharing their results until they win a prize (monetary or through recognition) for those results. Being able to just support collaboration, as part of the scientific process, is therefore only one aspect of a much bigger story.

    Certainly a powerful message, that eScience should be about 'empowering' everyday people doing science, rather than constraining it to a limited set of people. If this is the view taken, certainly a more interesting one, it would also be prudent perhaps to consider the development of specialist tools such as mail readers etc -- tools that all scientists require. Having this view of eScience surely gives it a wider perspective -- and a question would be where (or whether) one should draw a line between electronic enabled science and the everyday laboratory based science that scientists undertake. Surely, to survive as a community, eScience needs focus?

    The comments on limitations of Grid computing are not accurate. There are now a number of APIs available that support data management -- such as OGSA-DAI. Similarly, a number of projects developing collaboratories and Problem Solving Environments have attempted to focus on the scientist in the past (to limited degrees of success). I guess one of the issues here is that different scientists have different agendas, and it is not simple to identify the requirements that any one particular scientist (or scientific group) may have. For instance, some scientists may want tools that enable them to do better programming, whilst others require something that is graphical and enables easy interaction with some back end system. It is therefore hard to say what scientists really want, and trying to achieve one type of characterisation may be a limited view?

    On a related note, Web 2.0 is just a collection of technologies. The emphasis on the social theme that many Web 2.0 technologies advocate is certainly not a feature of Web 2.0 only; vendors such as Yahoo Groups and Google Groups were trying to do this for many years. The ability to use technologies such as AJAX and JSON could help do some of this more quickly, but ultimately the message regarding collaborative development and communities has not changed much.

    I think what you advocate is certainly an important vision for science. The ability to look at the wider scientific process is certainly an important undertaking. It is not clear whether the particular motivation for scientists that you advocate is the only one. Similarly, Grid computing is really a collection of infrastructure technologies after all -- and the list of limitations on slide 18 is rather limited. Many of these constraints do not hold any more -- as people continue to evolve their infrastructure and add additional capabilities to their tools.
  • The talk was videoed at the conference but I don't know if it will be made available - however I'm going to put a text summary on (which is also where you can download the original PowerPoint 2007 version of these slides - they seem to have experienced one or two font problems somewhere in the pipeline through PowerPoint 97-2003 to slideshare...)
  • Slideshare empowers people too, even big Professors :)

The New e-Science (Bangalore Edition)

  1. Bangalore Edition
  2. e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it.
      • Due to the complexity of the software and the backend infrastructural requirements, e-Science projects usually involve large teams managed and developed by research laboratories, large universities or governments.
  3. How do we know when e-Science has succeeded? Not just accelerated but new A. When everyone is using the Grid B. When there are routine scientific advances that would not have happened otherwise
  4. How do we move from heroic scientists doing heroic science with heroic infrastructure to everyday scientists doing science they couldn’t do before? humanists archaeologists geographers musicologists ... researchers! research It’s the democratisation of e-Science! 
  5. scientists Digital Libraries Graduate Students Undergraduate Students experimentation Data, Metadata Provenance Workflows Ontologies The social process of science Local Web Repositories Virtual Learning Environment Technical Reports Reprints Peer-Reviewed Journal & Conference Papers Preprints & Metadata Certified Experimental Results & Analyses
  6. • Between 19th October and 23rd November 2007 I attended six international meetings related to e-Science
      • Grid 2007, Scientific and Scholarly Workflows, e-Social Science 2007, W3C
      • Open Grid Forum, Microsoft e-Science
      • This is what I found
  7. Everyday researchers doing everyday research (1)
      • Not just a specialist few doing heroic science with heroic infrastructure
      • Chemists are blogging the lab
      • Everyone is mashing up
      • Everyday hardware – multicore machines and mobile devices
  8. A data-centric perspective, like researchers (2)
      • Data is large, rich, complex and real-time
      • There is new value in data, through new digital artefacts and through metadata e.g. context, provenance, workflows
      • This isn’t “anti-computation” – design interaction around data
  9. Collaborative and participatory (3)
      • The social process of science revisited in the digital age
      • Collaborative tools – blogs and Wikis
      • e-Science now focuses on publishing as well as consuming
      • Scholarly lifecycle perspective
  10. Benefitting from the scale of digital science activity to support science (4)
      • This is new and powerful!
      • Community intelligence
      • Review
      • Usage informing recommendation
      • e.g. OpenWetWare
      • e.g. myExperiment
  11. Increasingly open (5)
      • Preprints servers and institutional repositories
      • Open journals
      • Open access to data
      • Science Commons
      • Object Reuse & Exchange
  12. Better not Perfect (6)
      • The technologies people are using are not perfect
      • They are better
      • They are easy to use
      • They are chosen by scientists
  13. Empowering researchers (7)
      • The success stories come from the researchers who have learned to use ICT
      • Domain ICT experts are delivering the solutions
      • Anything that takes away autonomy will be resisted
  14. About pervasive computing (8)
      • e-Science is about the intersection of the digital and physical worlds
      • Sensor networks
      • Mobile handheld devices
  15. Signs of the Times
      • Everyday researchers doing everyday research
      • A data-centric perspective, like researchers
      • Collaborative and participatory
      • Benefitting from the scale of digital science activity to support science
      • Increasingly open
      • Better not Perfect
      • Empowering researchers
      • About pervasive computing
  16. Onward and Upward: “Standing on the shoulders of giants”
      • e-Science is now enabling researchers to do some completely new stuff!
      • As the individual pieces become easy to use, researchers can bring them together in new ways and ask new questions
      • “The next level”
  18. The Grid Problem
      • Everyday researchers doing everyday research
          • BUT heroic Grid infrastructure not being adopted
      • A data-centric perspective, like researchers
          • BUT Grid gives APIs to computation not data
      • Collaborative and participatory
          • BUT Grid has deeply rooted service provider mindset
      • Better not Perfect
          • BUT Grid aims to provide well-engineered perfect solution
      • Giving autonomy to researchers
          • BUT Grid imposes institutional control (at this time)
      • About pervasive computing
          • BUT Grid is about portals, not the next generation of users
  19. e-Science Pipeline e-Science Technology Creators & Integrators Applications Research EE Research Socio-economic & Commercial Innovation e-Science bespoke tailoring Mass Use by Researchers 5 years 5 years 5 years CS Research e-Science 10s of integrators 100s of embedded consultants 1000s of research users The Arrow Problem Malcolm Atkinson
  20. Web Services RESTful APIs cmd lines ssh http Web Browser Mobile phone iPod Car Equipment PDA P2P mashups workflows services applications Subject ICT experts Computer Scientists Software Companies Workflow tools Ruby on Rails ecosystem Scientists open source Software Engineers nesc
  21. For a flourishing ecosystem...
      • It’s about empowerment as well as provision
      • People power
      • Hence usability:
          • Simple/familiar interfaces for users
          • Simple/familiar interfaces for developers
          • No need for a summer school!
      • Step into user space and look back
      • Computer Scientists as facilitators and problem solvers(?)
  22. But what about Web 2.0?!
      • Wikis
      • Mashups
      • REST APIs
      • Google Maps
      • Technologies:
          • AJAX, JSON, Ruby on Rails, ...
      • Social networking
      • Web as a distributed application platform
          • Amazon S3 and EC2
  23. Signs of the Times
      • Everyday researchers doing everyday research
      • A data-centric perspective, like researchers
      • Collaborative and participatory
      • Benefitting from the scale of digital science activity to support science
      • Increasingly open
      • Better not Perfect
      • Empowering researchers
      • About pervasive computing
      Web 2.0 patterns
      • The Long Tail
      • Data is the Next “Intel Inside”
      • Users add value
      • Network effects by default
      • Some Rights Reserved
      • The Perpetual Beta
      • Cooperate, don’t Control
      • Software above the level of the single device
  24. use Web 2.0 here? Grid
  25. use Web 2.0 here? Grid
  26. use Web 2.0 here Grid
  27. A utility is a directly and immediately useable service with established functionality, performance and dependability, illustrating the emphasis on user needs and issues such as trust Services are knowledge-assisted (‘semantic’) to facilitate automation and advanced functionality, the knowledge aspect reinforced by the emphasis on delivering high level services to the user Service-Oriented Knowledge Utility The architecture comprises services which may be instantiated and assembled dynamically, hence the structure, behaviour and location of software is changing at run-time
  28. Myths
      • Web 2.0 is not high performance
          • It improves the performance of science and people!
      • Web 2.0 is not a properly engineered solution
          • Scientists want better, not perfect. And agility.
      • Web 2.0 is not secure
          • People do lots of “secure” things on the Web
      • Web 2.0 is a fad that will pass
          • It’s inevitable and it’s already happened!
      • Web 2.0 works for teenagers but it won’t for scientists
          • Let’s find out...!
  30. E. Science laboris
      • Workflows are the new rock and roll
      • Machinery for coordinating the execution of (scientific) services and linking together (scientific) resources
      • The era of Service Oriented Applications
      • Repetitive and mundane boring stuff made easier
  31. Recycling, Reuse, Repurposing
      • Paul writes workflows for identifying biological pathways implicated in resistance to Trypanosomiasis in cattle
      • Paul meets Jo. Jo is investigating Whipworm in mouse.
      • Jo reuses one of Paul’s workflows without change.
      • Jo identifies the biological pathways involved in sex dependence in the mouse model, believed to be involved in the ability of mice to expel the parasite.
      • Previously a manual two year study by Jo had failed to do this.
  32. 40 Taverna downloads per day 2007 2006 2005 2004 2003
  33. e-Services in the Cloud
      • Independent third party world-wide service providers of applications, tools and data sets. In the Cloud.
          • 850 databases, 166 web servers (Nucleic Acids Research, Jan 2006)
      • My local applications, tools and datasets. In the Enterprise. In the laboratory.
      • Easily incorporate new service without coding. So even more services from the cloud and enterprise.
  34. Kepler Triana BPEL Ptolemy II
  35. myExperiment is...
      • “Facebook for Scientists”
      • A community social network.
      • A gateway to other publishing environments
      • A federated repository
      • A platform for launching workflows
      • Publishing self-describing Encapsulated myExperiment Objects
      • Mindful publication
      • Started March 2007
      • Closed beta since July 2007
      • Open beta November 2007
  38. Google Gadget
  39. Challenge: Policy and Permissions without Tears Ownership and Attribution
  40. 24/5/2007 | myExperiment | Slide
  41. Enactor HTML XML Snapshot map of resources with their relationships and versions users descriptions groups friendships tags blobs workflows
  42. scientists Graduate Students Undergraduate Students experimentation Data, Metadata Provenance Workflows Ontologies Digital Libraries The social process of science 2.0 Local Web Repositories Virtual Learning Environment Technical Reports Reprints Peer-Reviewed Journal & Conference Papers Preprints & Metadata Certified Experimental Results & Analyses
  43. Take Homes 2.0
      • e-Science is about doing new science
      • Grid is just one part of the solution
      • Users are not just consumers of infrastructure. Empower them.
      • Web 2.0 is a set of design patterns
      • Think Web 2.0 on top of Grid and other services
      • Workflows make e-Science easier, and Web 2 makes workflows easier
  44. • Contact
          • David De Roure
          • Carole Goble
          • [email_address]
      • Thanks
          • Geoffrey Fox, Savas Parastatidis, myExperiment team & myGrid team
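
Slides 30 and 31 describe a workflow as machinery for coordinating the execution of services, and tell the story of Jo reusing one of Paul's workflows without change. A toy Python sketch of that idea follows. It is only an illustration under stated assumptions: the Workflow class, the two step functions and the gene-like strings are hypothetical and invented here. They are not Taverna or myExperiment code, and not real data.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass(frozen=True)
class Workflow:
    """Hypothetical stand-in for a shareable workflow: an ordered chain of steps."""
    name: str
    author: str
    steps: Tuple[Callable[[Any], Any], ...]

    def run(self, data: Any) -> Tuple[Any, List[str]]:
        """Pass data through each step in order and record which steps ran."""
        trace: List[str] = []
        for step in self.steps:
            data = step(data)
            trace.append(step.__name__)
        return data, trace


def normalise(identifiers: List[str]) -> List[str]:
    """Toy 'service': trim whitespace and upper-case each identifier."""
    return [item.strip().upper() for item in identifiers]


def deduplicate(identifiers: List[str]) -> List[str]:
    """Toy 'service': drop repeated identifiers and sort the rest."""
    return sorted(set(identifiers))


# Paul publishes one workflow (names and inputs below are made up).
pauls_workflow = Workflow("identifier-cleanup", "paul", (normalise, deduplicate))

# Paul runs it on his own inputs.
paul_result, paul_trace = pauls_workflow.run([" abc1", "ABC1 ", "xyz2"])

# Jo reuses the same workflow object, unchanged, on different inputs.
jo_result, jo_trace = pauls_workflow.run(["il4 ", "IL4", "tnf"])

print(paul_result, paul_trace)
print(jo_result, jo_trace)
```

The point of the sketch is that the workflow is a value that can be handed to someone else and rerun on new inputs, with a simple trace of the steps standing in for provenance. Real workflow systems add service discovery, remote execution and much richer provenance than this.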