NRNB Annual Report 2013

Annual progress report for the NIH P41 National Resource for Network Biology

Published in: Health & Medicine

  1. 1. Annual Progress Report - Research Progress 2013 National Resource for Network Biology P41 GM103504 05/01/2012 - 04/30/2013
[Figure: the 2013 NRNB network, with the full network on the left and a zoomed inset on the right]
The 2013 NRNB Network. On the left is a network representation of all NRNB personnel and collaborators (blue circles), all TRD, DBP, Collaboration, and Service projects (orange diamonds), and associated publications (green triangles). Node size is proportional to the number of connections. Thick red borders indicate personnel, projects and publications directly funded by the NRNB P41 grant. On the right is a zoomed inset, inclusive of all NRNB-funded personnel making up the vital core of the NRNB network. There are 276 nodes and 365 connections in the network. NRNB funds 46 (17%) of these nodes, which make 211 (58%) of the connections. As a Cytoscape network [1], we can interactively explore this representation with our External Advisory Committee, offering dynamic views of our projects, collaborations and budgets. Also see Appendix A for a full-page view of the entire network.
1. Smoot ME, Ono K, Ruscheinski J, Wang PL, Ideker T (2011) Cytoscape 2.8: New features for data integration and network visualization. Bioinformatics 27:431–432.
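The percentages quoted in the caption can be checked with a few lines of arithmetic. The sketch below only uses the four counts stated in the caption; the variable and function names are illustrative and do not come from the report.

```python
# Counts quoted in the figure caption.
total_nodes = 276
total_edges = 365
funded_nodes = 46
funded_edges = 211


def share(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


node_share = share(funded_nodes, total_nodes)
edge_share = share(funded_edges, total_edges)

print(f"NRNB-funded nodes: {node_share}% of all nodes")
print(f"Connections made by funded nodes: {edge_share}% of all connections")
```

The contrast between the two shares is the point of the caption: a small funded core accounts for more than half of the connections.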
  2. 2. Annual Progress Report - Advisory Committee 2013 National Resource for Network Biology P41 GM103504 05/01/2012 - 04/30/2013 We held our second External Advisory Committee (EAC) meeting on December 12, 2012, in coordination with the annual Cytoscape Workshops and Network Biology Symposium hosted by NRNB this year at the Gladstone Institutes in San Francisco. In addition to the EAC members listed below, our Program Officer, Doug Healy, was in attendance. The following report was issued by our EAC. Participating External Advisory Committee Members:
• Stephen Friend, Sage Bionetworks
• David Hill, Dana-Farber Cancer Institute
• Tamara Munzner, University of British Columbia
• Anya Tsalenko, Agilent Technologies
• Marian Walhout, University of Massachusetts Medical School
Overall Perspectives of the NRNB External Advisory Board All of the members of the Advisory Committee found that this meeting provided evidence of very strong progress, and appreciated the increased clarity as to how to convey it to outsiders. In the past 18 months all of the major suggestions have been effectively addressed. The supplementary material has allowed a very powerful engagement by Alex Pico and the delivery of an entirely new focus to build out the Cytoscape tools within a “Cytoscape App Store”: http://apps.cytoscape.org This has been matched by a comprehensive evolution of functionalities within the new version of Cytoscape 3.0 and a coherent maturation of all the Technology Research and Development Projects (TRDs) and associated Driving Biological Projects (DBPs).
The three major suggestions this cycle are: 1) reviewing both the existing TRDs and DBPs to determine how midcourse optimization of these projects might allow maximal creation of “shining examples” around the strengths of the NRNB, especially by searching for new distal DBPs; 2) resolving the question of how best to measure success for the NRNB, with a transition away from paper- and citation-based metrics to metrics of community enablement and integration; and 3) preparing for the extension by completing the draft proposal in time to engage the EAC six weeks before it is due to be submitted. In summary, the NRNB has continued to make excellent progress through the first half of this funding period and the committee is strongly supportive of the overall progress and direction. The comments below, albeit pointedly critical, are designed to help the NRNB position itself for the strongest possible competitive renewal in 18 months. Please see the following descriptions of the specific programs for more detailed comments:
  3. 3. Specific Project Summary Statements 1) TRDs and DBPs (separate one for TRD3 and Cytoscape) All of the NRNB labs continue to do exciting, cutting-edge work on network-based solutions to important questions in the biological and social sciences. The “network extracted gene ontology” is one example: it integrates a novel way to make better use of ontologies with a visual output that offers a clearer representation of functional modules. Integrating statistical and scripting tools into Cytoscape, initially done in the context of social networks, is a decided plus that should have broad applicability. Ongoing work is proposing potentially paradigm-shifting ways to answer questions and gain insight beyond traditional approaches, using link clustering and network ontology, for example. The recent set of publications across the entire spectrum of NRNB activities shows that good progress is being made in developing new network-based tools and demonstrating the value of studying networks. At the approximate halfway point of this grant, the NRNB has provided clear examples of identifying problems or critical biological questions that require novel approaches, proposing and developing solutions based on integrating information into networks, and implementing potentially useful tools for addressing similar questions. Each of the TRDs was individually successful in that regard. The challenge going forward is to clearly demonstrate that these tools and approaches have applicability beyond the questions or problems that the individual TRDs tackled in the first place. One thing to consider now is how to better integrate across multiple TRDs. For example, can the tools being developed in TRDs A, C, and D be used in TRD B? This could be taken on as a collaboration or via a new DBP. Can the tools in TRD D be used to add further insight to the networks-as-biomarkers or network ontology efforts?
TRD C has made significant and impressive progress in the past year, with flagship projects in Mosaic (ontology-partitioned mosaics) and NeXO (network extracted ontologies). The Mosaic work has already been released as a Cytoscape plugin. The NeXO work is particularly exciting as a path to data-driven ontologies rather than a single monolithic solution that is not sensitive to context. Several possible avenues for moving forward with the NeXO work were discussed, including the possibility of partnering with the existing GO project via supplemental funding. In terms of communicating the overall value of the NRNB to the broader scientific community, there are four distinct elements that need to be clearly articulated in terms of what the TRDs are doing and what the NRNB as a whole has accomplished. NRNB to date has clearly shown 1) an ability to identify a problem or driving biological question that cannot be addressed without a network approach; 2) an ability to develop new tools and technology for network analysis and visualization; 3) an ability to implement usable tools
  4. 4. and demonstrate proof of concept; and, the most challenging, 4) an ability to demonstrate that the tools are getting into wide use (e.g. via Cytoscape). This will require additional tracking and curation efforts that will be challenging because Cytoscape is now viewed as a “standard tool” and is therefore less likely to be cited. The NRNB is poised to be more than a collection of already successful TRDs. There should be some consideration of a major paper that involves ALL TRDs and many of the DBPs to show how the new suite of Cytoscape tools can help answer a major question in elucidating genotype-to-phenotype relationships. Cytoscape has become a great collection of tools and NRNB has done great science developing some new tools and using them on a specific question, but the NRNB needs to move beyond being just a developer of Cytoscape tools and should look towards becoming an entity in which the “whole is greater than the sum of the parts”. While the entire spectrum of projects involving all TRDs and DBPs is quite exciting, now is the time to begin considering restructuring the DBPs, potentially eliminating some, as plans are developed for the competitive renewal in 18 months. One area to consider is whether or not the NRNB should begin to branch out with respect to other disease models (much of the recent success has been focused on cancer), as there is more and more evidence that many genes are involved in diseases very distinct from the disease initially associated with any given gene. As previously, Hill’s lab is willing to serve as an alpha or beta test site for data integration and novel visualizations, as well as for testing plug-ins for statistical analysis coupled to visualizations. In summary, it is clear that some TRDs are progressing well and are on track to roll out tools for network biology that will be widely used. In other cases, it is not clear that the right audience is being reached.
With this in mind, we recommend that the NRNB perform a comprehensive review of all TRD projects and strive to align them with a set of DBPs that represent the most active user communities in network biology, with the following goals:
● Reach out to key/hub user bases for each technology
● Pursue opportunities for cross-pollination/integration/pipelines across NRNB technology projects, which are currently being developed in isolation
● Identify other important resources and tools that NRNB TRDs could integrate with
Cytoscape Progress: The team has made great progress towards Cytoscape 3.0: the beta release has been available for many months, and the full release is coming very soon. Many suggestions from the last meeting have already been incorporated, including identifying which previous plugins are high impact and devoting resources to make sure that these are
  5. 5. ported to the new version. The issue of backwards compatibility was raised again, since Cytoscape 3.0 introduces major API changes that prevent old plugins from working without code updates. The verbal answer made it clear that choices had been carefully considered in consultation with the developer community. In particular, the assurance was made that API compatibility is a guaranteed contract for all 3.x versions, with no breaking changes made before version 4.0, thanks to the use of semantic versioning. The suggestion was made once again to ensure that keeping the API stable is a very high priority, because as the user community grows in size the costs of breaking backwards compatibility increase accordingly. The consensus was that the process taken, as described verbally, was sound; it was just poorly documented in the written report. The suggestion for next time is to document several things more explicitly:
- process taken (to show that care was in fact taken)
- lessons learned: what worked, what didn't
- plans for the future
The team has made great progress in better documenting the use of Cytoscape by the biology community, with compelling statistics about the amount of use (including the impressive number of 1400 NIH grants). The changes made to the cytoscape.org front page, with the Tumblr feed showing images and the explicit encouragement that people should cite its use, are great. The use of resources to also manually track the divergence between citation rates and use is entirely appropriate (with the interesting result that use is at least 2x the citations). There are many new exciting technical directions. The new App Store will benefit many constituencies: developers, end-users, and the PIs themselves in documenting usage of their efforts by the community. The set of new features chosen also reflects the needs of many constituencies, for example scaffolding new users with the new welcome/startup screen, and supporting developers with the new API.
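The compatibility contract described above can be illustrated with a small check. This is a minimal sketch of the general semantic-versioning rule (same major version, running version at least as new as the one an app was built against); the function names are hypothetical and this is not Cytoscape's own loading logic.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Split 'MAJOR.MINOR.PATCH' into integers; missing parts default to 0."""
    parts = [int(p) for p in text.split(".")]
    parts += [0] * (3 - len(parts))
    return parts[0], parts[1], parts[2]


def api_compatible(app_built_for: str, running: str) -> bool:
    """Return True when an app built for one API version should load on another.

    Under semantic versioning the major number changes only when the API
    breaks, and minor releases only add to it, so the running version must
    share the major number and be at least as new as the build target.
    """
    built = parse_version(app_built_for)
    current = parse_version(running)
    return built[0] == current[0] and current >= built


print(api_compatible("3.0", "3.4"))  # app for 3.0 on a later 3.x release
print(api_compatible("2.8", "3.0"))  # plugin for 2.8 on the new major version
```

Under this rule a 2.x plugin is rejected by any 3.x release, which matches the report's statement that old plugins need code updates.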
It's also heartening to see technology transfer from the visualization community with the incorporation of edge bundling. The report mentioned new support for 3D rendering. Concerns were raised about whether devoting resources to this effort is appropriate, given the empirical work from the visualization community that has found many drawbacks to 3D layout of node-link graphs. The verbal answer was that the new modular architecture allows alternate renderers, that 3D was simply one of several, and that it was developed by a community member rather than the core developers. 2) Outreach and Impact At the last advisory board meeting it was suggested to “distribute open source network
  6. 6. technologies to the greater scientific community”. At this meeting Alex Pico presented the NRNB execution on that suggested deliverable. Simply stated, there has been awesome progress, and much of this stems from the direct leadership of Alex in his new role as Executive Director of the NRNB. Whether measured by the recently published article in Nature Methods, “A travel guide to Cytoscape plugins”, or through a visit to the Cytoscape App Store, which you can reach by googling “cytoscape apps” (http://apps.cytoscape.org), or by looking at how often the apps are used, this stands out as a remarkable success. It is now possible to extend this powerful start and consider annotating it with sections for open source and non-open source apps. There is a possibility to begin a dialog between those who desire new apps and those willing to build them. It might even be possible now to list funding and hold contests to encourage the building of the most requested apps. 3) Moving forward: Ideas and Topics for Discussion Much of the discussion about moving the NRNB effort forward centered on increasing outreach to potential users of NRNB resources, including Cytoscape, as well as tracking the use of these resources. Big progress has been made already through the http://www.nrnb.org website and the Cytoscape App Store, but more could be done. Some suggestions for increasing outreach included targeted communications to potential users, either subscribers to the Cytoscape mailing list or authors of papers using Cytoscape. Connections to social media resources like Twitter or Facebook could be increased. Quantitatively, this outreach could be measured by the number of groups using NRNB resources rather than by the number of papers citing these resources or Cytoscape. Some papers may not cite Cytoscape directly but mention it only in Supplementary Information that is not searched, or may not cite it at all.
The impact of Cytoscape and NRNB tools in general could be increased by connecting to other public resources for molecular and computational biology. One example is a connection with GenomeSpace (www.genomespace.org), a platform that connects different bioinformatics tools, making it possible to move data smoothly between them and to leverage the available analyses and visualizations. Other public resources that could benefit from connection to Cytoscape include Galaxy, KnowledgeBase, and IGV. Sharing between users could be increased by enabling smooth sharing of Cytoscape networks on Google Drive or Amazon Cloud, as well as through the use of Cytoscape Web. One application area of network biology tools that could be significantly expanded going forward is social network research, especially analysis of social and molecular networks and of interactions between different groups. The NRNB group has made impressive progress with tens of successful Google Summer of Code projects. Going forward it would be great to track the careers of these students, and of students from the NRNB mentorship program, as another way to measure impact on the community and on science.
  7. 7. 4) Suggestions For Next EAC Meeting and Report: 1. Next Report This year's report was much better than last year's; however, there is still room for improvement. As suggested, the emphasis shifted from the science results of the DBPs to the more appropriate new developments created through the TRDs; that's a major improvement. However, the report could address even more clearly the extent to which the output of this and previous funding (new tools or methods) is used in biological discovery. For example, in the group's own research papers that are not directly about the development of Cytoscape itself, to what extent was the use of Cytoscape instrumental in achieving the research results? We suggest that this story be told very explicitly. Another suggestion for the next round is to provide a full list of results or subprojects at a fine-grained level, for example a specific new Cytoscape plugin or a new analysis method proposed in a research paper. For each result, identify progress according to four key milestones:
1. identify problems
2. propose solutions (for example, new methods in a published paper)
3. build a generally available tool
4. get other people to use it
The goal should not be to reach the final milestone for every idea, but to document progress in terms of moving from earlier ones to later ones. Subprojects may enter at any stage; they don't have to be seeded only through the DBPs in the original grant. Subprojects may also exit at any stage, for example when the decision is made to propose alternate new solutions rather than following up with tool building in every case. It was clear from the verbal discussion that the center should be able to produce some very satisfying accounts of its achievements along these lines, and that these proofs of accomplishment will be a compelling and convincing part of a renewal proposal.
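One way to picture the suggested milestone reporting is a small record per subproject that stores the milestone reached at the previous report and the milestone reached now. The sketch below is only an illustration of that idea: the subproject names are hypothetical placeholders, and the structure is an assumption rather than anything the committee or the NRNB specified.

```python
from collections import Counter
from enum import IntEnum
from typing import Dict, List, Tuple


class Milestone(IntEnum):
    """The four milestones listed by the committee, in order."""
    IDENTIFY_PROBLEM = 1
    PROPOSE_SOLUTION = 2
    BUILD_TOOL = 3
    ADOPTED_BY_OTHERS = 4


# (milestone at the previous report, milestone now); names are hypothetical.
Progress = Tuple[Milestone, Milestone]
subprojects: Dict[str, Progress] = {
    "example-plugin": (Milestone.PROPOSE_SOLUTION, Milestone.BUILD_TOOL),
    "example-method": (Milestone.IDENTIFY_PROBLEM, Milestone.PROPOSE_SOLUTION),
    "example-tool": (Milestone.ADOPTED_BY_OTHERS, Milestone.ADOPTED_BY_OTHERS),
}


def advanced(entries: Dict[str, Progress]) -> List[str]:
    """Names of subprojects that moved to a later milestone since the last report."""
    return sorted(name for name, (before, now) in entries.items() if now > before)


def count_by_stage(entries: Dict[str, Progress]) -> Counter:
    """How many subprojects currently sit at each milestone."""
    return Counter(now for _, now in entries.values())


print("Advanced since last report:", advanced(subprojects))
for stage, n in sorted(count_by_stage(subprojects).items()):
    print(f"{stage.name}: {n}")
```

Reporting movement between milestones, rather than only the count at the final one, matches the committee's point that not every idea needs to reach adoption.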
This type of reporting will also help with the argument that the impact of Cytoscape and the NRNB goes beyond simple publication and citation counts. The deeper goal of the center is to introduce and encourage network methods in the biology community, so documenting the adoption of methods and tools shows progress towards that goal. A second suggestion is to explain more clearly the boundary between this P41 and the other sources of funding: the related R01, and the grants supporting the DBPs. Ideker articulated a clear story in response to EAC questions: the $300K/yr R01 funds maintenance, while new technology springs from the $700K/yr P41. The committee approves of this story; it just needs to be told clearly and concisely in the written
  8. 8. materials. In particular, document which efforts are funded through the R01 and which through the P41. Although the NRNB has a broader scope than Cytoscape alone, it partially funds core Cytoscape work, so the best way to address this boundary is to at least briefly present the full picture of what work on Cytoscape has been done, and then to explain which parts were funded by the P41. The current report gives the full picture of Cytoscape development, but does not adequately explain the boundary. The administrative information section is very well done. The budget is clearly explained, with crosscutting breakdowns between categories (staff vs. TRDs vs. PI salaries) and PI groups. The breakdown of expenses according to both FTEs and money was also helpful. The discussion of the importance of actively cultivating an open development community is articulate. 2. Next Meeting First, the EAC should be sent the relevant written materials to read in advance of the actual meeting. This year, the report was provided on paper to committee members at the start of the meeting, with an electronic version following a few hours into the meeting. This timing is too late, because it's hard to assimilate the written report in parallel with attending to the presentations. The report should be provided to committee members in advance, ideally one week before the meeting, and at bare minimum two days before the meeting. The late timing this year was particularly frustrating given that this report was created many months ago, but through an oversight hadn't been forwarded to us. Second, the EAC agreed that we would best serve the interests of the NRNB by scheduling our next meeting shortly before the renewal proposal is due, in what we think will be June 2014. Our intent is to act as pre-reviewers: we will read a full draft of the proposal in detail before the meeting and then devote the meeting to an in-depth discussion of ways to strengthen and improve it.
We propose roughly six weeks before the proposal is due: early enough that our feedback can be responded to, but late enough that the draft proposal is nearly complete rather than preliminary. This meeting would be roughly 1.5 years from now, the same amount of time that has elapsed between our first and second meetings. Third, a suggestion for the renewal proposal is to have a large set of short testimonials from users, rather than (or in addition to) the more usual approach of full formal letters of support from a small number of people. The testimonials would be a few sentences or a paragraph about how Cytoscape has been valuable in their own work; having dozens or even hundreds of these compiled together in one document might have enormous impact on reviewers. 5) Collaborations and service projects A major goal of NRNB is to support collaborations with a broad variety of researchers in biomedical science. Different types of collaborations have been initiated, from very small support-style collaborations to larger collaborations that require active participation by NRNB. The EAC was very impressed with the overall number of collaborations. At the
  9. 9. time of the previous SAB meeting, there were 36 active research collaborations with NIH-supported researchers. In the last 1.5 years or so, another 60 were added, making a total of 96. One issue is that the majority of collaborations are internal. Better advertisement of NRNB and its collaborative goals at relevant scientific conferences may help to acquire more external collaborations. Collaborations are only a small part of the NRNB budget, with an estimated cost of ~$100,000, but are highly effective at leveraging the NRNB expertise to expand the overall impact and reach of Cytoscape. The term ‘collaboration’ is used in a way that is somewhat ambiguous: the CSP umbrella includes tiny-scope efforts called ‘support’ (33%), small-scope efforts called ‘consulting’, and medium-scope efforts called ‘collaboration’. However, the DBPs are what we might consider true collaborations, and the hope is that some of the medium-scope efforts would evolve into new DBPs over time, even as some previous DBPs might be scaled back into a smaller role. However, since the term ‘CSP’ is the standard vocabulary defined by the grant, perhaps it is not realistic to rename these medium-scope efforts. It would be useful to see these numbers broken down into internal versus external collaborations. These collaborations are currently tracked in a publicly available and transparent way on the NRNB web site with titles, investigators, and NRNB contact. It would be useful if their status could also be tracked. For the renewal, it will be very important to obtain letters or a completed survey from collaborators regarding the utility of Cytoscape and how it changed their research. 6) Promising ideas for potential supplemental funding The first supplemental effort provided to the NRNB, enabling the Cytoscape App Store project, has turned out to be a remarkable return on investment, demonstrating a capacity for greater creativity and productivity.
We highly recommend additional supplemental grants to maintain, or even increase, this level of activity. During the advisory meeting, we explored a number of proposals worth considering:
1. Moving NeXO forward (see TRD A) by partnering with existing GO projects
2. Enable Cytoscape users to record/reuse/host/share workflows and sessions to promote network biology use cases, enriched publications, reproducibility and collaboration.
3. Interface with a specific key technology that targets a strategic community ripe for network biology perspective/tools (e.g., MIDAS, UCSC Genome Browser, NCBO BioPortal, Galaxy, GenomeSpace, Sage Bionetworks/Synapse, DREAM)
  10. 10. Annual Progress Report - Administrative Information 2013 National Resource for Network Biology P41 GM103504 05/01/2012 - 04/30/2013 Administrative Structure During the first year, we defined the administrative structure of the resource, including some unique new roles within the organization. The roles of Principal Investigator (PI), Co-PI, External Advisory Committee (EAC), Resource Administrator and Chief Software Architect were defined as in the original grant. We defined a new role of Executive Director (ED) to oversee some of the new resource functions that NRNB provides, including Training & Outreach, Communications and Infrastructure. The ED (Alex Pico, Gladstone Institutes) is responsible for coordinating these efforts as well as conducting all of the necessary tracking and due diligence for the annual reporting to NIH. During the second year, we defined the new role of Collaboration Coordinator to screen and process collaboration requests to our resource. This has been a vital role in supporting the 80+ ongoing collaborations during the past two years. During the third year, we defined a proper position for the Roving Engineer who is vital for outreach to new users, app developers and strategic partnerships. Our Roving Engineer is also a major contributor to Cytoscape core design and implementation, embodying the full cycle from users to developers to implementation to release. Finally, we are very pleased to have maintained an active dialog with our EAC members, including Dr. Stephen Friend as chair of the committee. Budget changes have been minimal over the three years, with the exception of the new Collaboration Coordinator and TRD increases for Pico, Ideker and Sander in Year 2, and the new Roving Engineer and subsequent TRD cuts to Pico and Ideker in Year 3. The trend over time has been toward supporting more Outreach initiatives to fulfill our P41 goals.
[Figure 1: area charts, panels A and B; legend labels: Outreach, TRDs, Admin, Co-PIs; Ideker, Pico, Sander, Bader, Schwikowski, Fowler]
  11. 11. Figure 1. Budget graphs. Area charts showing the distribution of funds for years 1-3 (x-axis) per category (A) and per group (B). Y-axis is in units of $1,000s of US dollars. Each stripe typically corresponds to an individual with a specific role in NRNB, totaling 6.5 FTEs. Note that groups are sorted by degree of change, which is critical in this style of visualization to minimize misperception of change when slopes are actually parallel. As the basis for the graphs above, here are itemized tables of FTEs and funding for all three years (Table 1). Highlighted in red are the significant changes in Year 3 to FTEs and total dollars.

Roles and Groups            FTEs                      $1,000s
                            Year 1  Year 2  Year 3    Year 1  Year 2  Year 3
Collaboration (Ideker)        0.00    0.50    0.63         0      50      50
Admin-Asst. (Ideker)          1.00    0.56    0.56        52      38      41
Core Tech. (Ideker)           0.40    0.40    0.40        47      51      53
TRD-A (Ideker)                0.50    0.50    0.50        40      45      36
Admin-PI (Ideker)             0.30    0.30    0.29        74      78      77
Communication (Pico)          0.30    0.30    0.25        29      29      25
Admin-ED (Pico)               0.50    0.50    0.50        56      56      57
Roving Engineer (Pico)        0.00    0.00    0.12         0       0      16
TRD-C (Pico)                  0.20    0.48    0.13        21      39      17
Co-PI (Pico)                  0.02    0.02    0.02         5       5       0
TRD-A (Sander)                0.65    0.65    0.62        90      97      98
Co-PI (Sander)                0.02    0.02    0.02         5       5       5
TRD-C (Bader)                 1.00    1.00    0.91        90      93      90
Co-PI (Bader)                 0.10    0.10    0.10         0       0       0
TRD-D (Schwikowski)           1.00    1.08    1.08        81      83      83
Co-PI (Schwikowski)           0.08    0.08    0.08         0       0       0
TRD-B (Fowler)                1.00    0.72    0.20        58      54      53
Co-PI (Fowler)                0.10    0.10    0.10        21      26      27
SUBTOTAL                      7.17    7.32    6.51       669     750     728
Supplement (Ideker)           0.00    0.40    0.40         0      45      45
Supplement (Pico)             0.00    1.00    1.00         0      85      85
Supplement (Bader)            0.00    0.40    0.40         0      45      45
SUBTOTAL                      0.00    1.80    1.80         0     175     175
GRAND TOTAL                   7.17    9.12    8.31       669     925     903

Table 1. NRNB effort and budget. Annual budgeting of FTEs and $1,000s itemized by roles (per group). Major changes are highlighted in red. Subtotals are provided separately for the main grant and supplemental funding (bold) and Grand Total is in the last row.
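As a quick consistency check on Table 1, the main-grant subtotal plus the supplement subtotal should equal the grand total for each year. The sketch below copies only the subtotal and total rows from the table; the dictionary layout and function name are illustrative choices, and a small tolerance is allowed because the table values are rounded.

```python
from typing import Dict

Totals = Dict[str, Dict[str, float]]

# Subtotal and grand total rows as reported in Table 1.
fte: Totals = {
    "Year 1": {"main": 7.17, "supplement": 0.00, "grand": 7.17},
    "Year 2": {"main": 7.32, "supplement": 1.80, "grand": 9.12},
    "Year 3": {"main": 6.51, "supplement": 1.80, "grand": 8.31},
}
dollars_k: Totals = {
    "Year 1": {"main": 669, "supplement": 0, "grand": 669},
    "Year 2": {"main": 750, "supplement": 175, "grand": 925},
    "Year 3": {"main": 728, "supplement": 175, "grand": 903},
}


def totals_agree(table: Totals, tolerance: float) -> Dict[str, bool]:
    """For each year, check that main + supplement matches the grand total."""
    return {
        year: abs(row["main"] + row["supplement"] - row["grand"]) <= tolerance
        for year, row in table.items()
    }


print("FTEs:", totals_agree(fte, tolerance=0.005))
print("$1,000s:", totals_agree(dollars_k, tolerance=0.5))
```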
Allocation of Resource Access Beyond the active distribution and support of Cytoscape, which is covered in later sections, NRNB resource allocation can be categorized in the following way: 1. On-site training events: NRNB staff participated in 13 training events during the reporting period. These events include tutorials, workshops and courses.
  12. 12. 2. Requests for collaboration and mentorship: For the second consecutive year, we have maintained a high number of active collaborations. Many of these collaborations are coming through our participation in Google Summer of Code (GSoC) and our own NRNB Academy efforts (see #3). 3. Google Summer of Code and NRNB Academy: In addition to receiving requests from potential students through these programs, we also receive requests from a number of groups to join our organization as mentors. This brings new technology and ideas to our effort. GSoC has been our most successful outreach program by far. It’s responsible for a quarter of all our NRNB collaborations. It is the most active period for NRNB.org, granting broad exposure for NRNB in the open source community. Building on the success of this model, we launch NRNB Academy last year. Our Academy follows the same approach as GSoC, organizing around available mentors, ideas and interested students. However, we are not restricted to supporting university students in our program as it is independent of GSoC and 100% volunteer based. The Research Progress and Highlights provide more details. 4. Requests for training material support: We receive requests for tutorial materials throughout the year from inside and outside the Cytoscape core development team. Our homegrown Open Tutorials system makes it easy to accommodate all such requests. Open Tutorials is an easy-to-use wiki system that provides content formatted to be used as online sessions, slide shows and printed handouts. This year we are seeing more content from more contributors, in addition to a steady rise in visitors (see details in the Training section below). 5. Providing software community support: Our goal is to develop a generic template of services based on the support we provide the Cytoscape community of users and developers. So far we have extended support to Cytoscape, WikiPathways, Cytoscape Web and the cBio Cancer Genomics Portal. 
These proven resources demonstrate the broader scope of the NRNB mission. We are providing distribution links, showcases, tutorial support, news and event tracking, and GSoC and NRNB Academy participation to these projects. New this year is a gallery page with screenshots of all of these tools.
Awards and Honors
None
Dissemination
Overall
Cytoscape Version 3.0 (v3.0) was released for unrestricted public use on February 1, 2013. It represents an evolution of v2.x resulting from a two-year collaboration of a multinational, multi-institution team of programmers and biologists. This report describes the Cytoscape software, the infrastructure that supports it, and the activities of the community it serves.
Background
The overall mission of Cytoscape is to be a freely available worldwide asset supporting network analysis and visualization for systems biology science. The major focus of v3.0 is the modularization and rationalization of code to solve stability issues in v2.x encountered as multiple developers pursued multiple agendas. Under v2.x, internal programmatic interfaces evolved from one release to the next, leading to the failure of working plugins over time and
  13. 13. negative interactions between otherwise working plugins. Ultimately, this resulted in loss of programmer and user productivity, and undermined community confidence in Cytoscape. v3.0 addresses these issues by adopting modular coding practices promoted by the OSGi architectural framework (see footnote 1). This enables both the Cytoscape core and externally developed apps (formerly called plugins) to evolve independently without compromising unrelated functionality. At the logical level, Cytoscape leverages OSGi precepts to produce v3.0 APIs having cleaner and clearer demarcations between functional areas. At the deployment level, OSGi enables on-the-fly substitution of one processing element for another (e.g., apps) in order to tailor Cytoscape to meet user requirements at runtime without reinstalling or reconfiguring Cytoscape.
v3.0 represents a strong investment toward reducing future development and support costs, and increasing reliability and evolvability. We expect to leverage v3.0 as a platform to satisfy the evolving needs of multiple stakeholder groups, and as a platform enabling research on leading edge analysis and visualization techniques. v3.0 is the intended successor to v2.8, with development and support of v2.8 expected to diminish and disappear over time in favor of v3.0 and its successors. v3.0 is upward compatible with v2.8, but not downward compatible.
While v3.0 is a substantial reorganization of v2.8, its launch also marks an evolution in the Cytoscape team’s approach to community engagement, in which different community demographics are engaged in different, demographic-sensitive ways. The team identified four major groups: new users, casual (but not new) users, power users, and app developers. The initial v3.0 release was promoted to power users and app developers as a way of delivering v3.0’s advanced capabilities to the groups most able to leverage them, give qualitative and remedial feedback, and promote v3.0 adoption to other Cytoscape users.
This strategy dovetails with v3.0 features (described below) that lower barriers to entry for new and casual users while enabling efficiency and productivity for power users and app developers. The second release (v3.0.1) is imminent – it incorporates various critical fixes and numerous feature requests made by early v3.0 adopters. As such, it will be promoted to the entire Cytoscape community, including new and casual users. v3.0.1 will become the default Cytoscape download, replacing v2.8. Compared to v2.8, Cytoscape users will benefit most directly from v3.0 in the long run through:
• fewer core and app bugs from one release to the next
• more and richer apps (due to developers spending less time tracking and fixing bugs)
• more core features with higher biological and logistical value (due to improved flexibility provided by interface-driven development)
The v3.0 Release
Throughout 2012, Cytoscape developers made a number of beta versions available to early adopters. Issues were tracked in RedMine, and were contributed by both developers and early adopters. The final release was made on February 1, 2013, accompanied by updated user documentation, user tutorials, JavaDoc programmer documentation, app developer tutorials, a new App Developer Cookbook (containing useful code snippets), and release notes.
1 www.osgi.org – also used as the basic framework for Eclipse and numerous commercial products
  14. 14. Additionally, a new and comprehensive user-focused Welcome Letter was created to differentiate between user demographics and engage them appropriately. Principal v3.0 development was carried out by staff and researchers worldwide, including at the following institutes: UC San Diego, Pasteur Institute, University of Toronto, Gladstone Institute (UC San Francisco), University of Amsterdam.
v3.0 included the following major features:
• Upward compatibility with Cytoscape 2.x networks, attributes, analysis, layout, and display
• App Store (for centralized app availability)
• Friendly Welcome dialog (to engage new and casual users)
• Import network
• Edge bend visual property
• Edge bundling
• Grouping (for hierarchical networks)
• Enhanced search
• Show All in Table Browser
• Multiple network management
• Major refactoring to rationalize/regularize inter-module interfaces (to aid app developers in creating reliable apps)
Major issues remaining after the v3.0 release included:
• Slower startup than v2.x
• Fewer apps (plugins) than v2.x
• Numerous undiscovered or unaddressed bugs (due to major refactoring)
• Smaller network capacity on 32-bit processors
There are 145 apps (plugins) available in v2.x, though many have gone unmaintained and have fallen out of use.
Of the v2.x plugins, 8 were delivered in v3.0 as core functionality:
EnhancedSearch, MetanodePlugin2, PSICQUICUniversalClient, GraphMLReader, NCBIEntrezgeneUserInterface, ScriptEngineManager, JavaScriptEngine, NetworkAnalyzer
Additionally, the App Store contained another 13 apps (corresponding to many of the most popular v2.x plugins):
AgilentLiteratureSearch, Cy3PerformanceReporter, jActiveModules, CentiScaPe, Cyni Toolbox, MCODE, ClueGO, CyPath2, PathExplorer, CluePedia, DynNetwork, Venn and Euler Diagram, ClusterOne, GeneMANIA
Bug Bounty
To foster early investment and engagement in v3.0 by the user community, we created the Cytoscape Bug Bounty program, which paid out small prizes to users identifying high value bugs in the month of February 2013.
  15. 15. The program yielded 35 bug reports from 17 qualified reporters: 8 crash/data loss, 19 user interface, and 7 cosmetic. Gift cards were given to the top 9 reporters. One participant wrote:
It was great fun to participate in the February Bug Bounty. Thank you for organizing it, and, in general, thank you for making the development of Cytoscape an open process. It’s really appreciated, from the point of view of the users, when a software is developed this way.
In general, I’ve found that the new Cytoscape 3.0 version is a great improvement over the previous. The new “Welcome screen”, together with many little improvements to the menus and the interface, gave me a feeling of very user friendly software. The ability of downloading whole species for networks with a click, or to import them from many sources, is attractive to many people, and I know some persons who will use it for their work. The App store is also a nice addition, as it is much better to have a common web page for all the plugins instead of having to look for documentation dispersed into many little websites. (footnote 2)
The v3.0.1 Release
The v3.0.1 Release is scheduled for April 18, 2013. Its main purpose is to eliminate bugs leading to data loss, program crashes, misleading displays, and small user interface issues. Given this, we expect that it will be suitable for use by the entire Cytoscape community (including new and casual users) in preference to v2.8, and we expect v3.0.1 to become the default download on the Cytoscape web site. The first v3.0.1 release candidate (RC) will become available for download by April 4. It will include fixes or resolutions for 98 reported bugs and other issues, including 30 of the 35 reported under the Bug Bounty program.
Notably, the v3.0.1 release:
• Substantially increases the size of network manageable on 32-bit systems
• Migrates source from SVN to GitHub (to expand collaboration opportunities)
At release time, we expect there to be slightly under 200 bugs or unresolved issues remaining on our backlog, including feature requests and issues requiring substantial development or rework. Additionally, app developers have asked for improved documentation to enable quick and reliable app development.
Currently, UC San Diego is upgrading three v2.8 plugins to become v3.0 apps, and expects completion in Q3 2013:
• GenomeSpace
• MiMI
• BiNGO
Additionally, the NRNB has offered Amazon gift certificates as rewards to app developers for the first 20 apps independently developed and submitted.
2 Giovanni Marco Dall’Olio, March 8, 2013 via e-mail
  16. 16. Bug and Issue Tracking
Since early 2011, the Cytoscape team has tracked bugs and issues using the RedMine cloud service. As of v3.0, users can submit reports of bugs and issues to RedMine directly from Cytoscape. A cumulative (CDF) plot of bugs and issues logged over time shows aggressive tracking:
The following CDF shows that the Cytoscape team has responded to logged reports (by addressing them as bug fixes or scheduling them to be addressed in the future). “Created” means that a ticket was opened, and “Updated” means that a Cytoscape team member has acknowledged it, and has prioritized it for solving or has already solved it.
Measured Results
Cytoscape Downloads and Web Site Visits
Through March 2013, the overall number of Cytoscape downloads (including v2.8 and v3.0) continues to rise. The chart below shows the monthly download counts, with data dropouts in November,
  17. 17. 2007 and March, 2009. In February 2013, the download count was 6,685, and the count for March was 7,323. Since 2012, weekly visits (outside of holidays) have increased. The Cytoscape v3.0 web page was first put up in October 2012. The trends since the February, 2013 release are too new to yield conclusions, though it seems that visits have measurably increased. Visits to the Cytoscape download page have remained somewhat constant over time, though seem to have increased since v3.0’s February 2013 release.
  18. 18. Comparing visit patterns year over year, 2013 visits have increased by about 30%, with an uptick corresponding to the v3.0 release timeframe. This pattern is reflected in visits to the download page, too. Visits to the v3.0 page are associated with about 25% of page visits. (Note that a visit to the v3.0 page is a prerequisite to downloading v3.0, so the number of such visits is an upper bound on the count of v3.0 downloads. Visiting the v3.0 page can have many purposes, only one of which is downloading v3.0.) Between January 1, 2012, and the end of March, 2013, the Cytoscape web site received 393,903 distinct visits. Web site visitors were geographically dispersed worldwide:
  19. 19. Cytoscape visitors arrived most often after performing a Google search, but also arrived from direct links and from links within Cytoscape web pages:
  20. 20. App Store
The App Store opened for business on June 1, 2012. Since then, it has received over 33,000 visits from users worldwide:
Most visits originate from a link within the Cytoscape web site, but a significant number of visits launch from search engines and direct links:
  21. 21. Except during the holiday season, traffic to the App Store has grown consistently. By March, 2013, weekly visitors numbered between 1,100 and 1,300. Through March, 2013, a total of 33,596 visits were received:
Interest was evenly distributed across a number of app categories:
The most frequently downloaded apps (as of March, 2013) were:
App              Count
ClueGO           1,394
GeneMANIA        1,230
jActiveModules   1,196
MCODE              980
  22. 22. Cytoscape Citations
The count of Cytoscape-citing papers continues to rise year over year, with the count for 2013 being incomplete (as of March, 2013). Year-over-year growth has been historically sporadic, and may be showing signs of slowing:
Period       Year-over-year Growth
2004-2005    64%
2005-2006    72%
2006-2007    126%
2007-2008    94%
2008-2009    80%
2009-2010    8%
2010-2011    32%
2011-2012    19%
2012-2013    incomplete
Community Outreach
The Cytoscape community consists of core developers, app developers, and users. Communication and outreach is multimodal: Google Groups for contemporaneous discussion; Google video and Hackathons for core developer meetings; papers, web site and social media; and public meetings and symposia.
Google Groups and Video
The Cytoscape team has maintained Google Groups since April, 2011. As of March, 2013, there were 4 groups:
  23. 23.
Group                Membership   Topic Count
cytoscape-discuss    1,531        2,570
cytoscape-helpdesk   1,148        1,413
cytoscape-announce   918          194
cytostaff            49           2,643
The discuss and helpdesk groups facilitate self help (through search), peer assistance, and assistance directly by Cytoscape core developers. The announce group is used by Cytoscape core developers to announce new Cytoscape releases, and by app developers to announce new apps. The cytostaff group enables communication between Cytoscape core developers to coordinate activities and exchange technical information. Cytoscape core developers also meet on video chat weekly to plan agendas, triage issues, and conduct infrastructure activities.
Hackathons
The Cytoscape team conducted a Hackathon at the Gladstone Institute in San Francisco on December 12, 2012, in conjunction with the annual general Cytoscape symposium. Participants laid out the following roadmap for short and medium term development:
• Table loading performance
• Network panel update
• Command language support
• Search/Filter API
• Property Sheets
• Separation of ViewModel
• Advanced Label Rendering (Zoom/multi-scale)
• JSON package to support external processes
• SBGN symbols
• Table merge
• Vizmapper documentation
• Developer requests
  o Integration to R/scripting
  o XMLRPC/REST access
  o Headless/daemon mode
Web Site and Social Media
The main Cytoscape web site (cytoscape.org) was augmented to include a branch for v3.0, which includes user and developer documentation, links to the Welcome Document and release notes, and links to presentations and social media sites. Notably, videos of app presentations at the December 13-14 general Cytoscape symposium were posted at: http://nrnb.org/presentations.html
  24. 24. Future Risks The primary objective of the architectural refactoring that transformed Cytoscape v2.8 to v3.0 was to normalize relationships amongst subsystems so that changes could be made in one subsystem without detriment to another. While this evolution has been accomplished, much code was changed, and bugs continue to be discovered and reported by the user community. For now, the community remains forgiving and indulgent, mainly because Cytoscape’s basic functionality appears sound. However, the community perspective may change when v3.0 becomes the default download. While bugs can be fixed on point releases, slow startup times and the slow conversion rate of v2.x plugins into v3.0 apps remain a threat for several quarters. Mitigating strategies include continuing the excellent and diligent support offered by the Cytoscape team and community, which serves to help prioritize release features and to keep user frustration from growing. Additionally, software reliability can be improved by incrementally developing automatic test suites beyond what exists today. While Cytoscape’s semantic versioning provides app developers with important guarantees of interface- and semantic-consistency as Cytoscape evolves, it’s possible that semantic versioning itself may threaten to retard plugin authorship, rendering Cytoscape unresponsive to scientific requirements in meaningful timeframes. The interfaces defined in Cytoscape 3.0 have been shown to be insufficient for the needs of new apps in some cases. While new interfaces can be added, doing so requires incrementing the minor version number (e.g., from 3.0 to 3.1), which is intended to occur only rarely. Furthermore, the operational complexity and overhead of making new Cytoscape releases virtually guarantee the slow evolution of Cytoscape interfaces. 
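The versioning constraint described above can be illustrated with a small sketch. It assumes the conventional semantic versioning rule: an app built against interface version X.Y should run on any release with the same major version and an equal or higher minor version. The function names are hypothetical, and this is not Cytoscape's or OSGi's actual resolution logic.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """A major.minor.patch version number."""
    major: int
    minor: int
    patch: int = 0


def parse_version(text: str) -> Version:
    """Parse '3.0' or '3.0.1' into a Version, padding missing parts with 0."""
    parts = [int(p) for p in text.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return Version(*parts[:3])


def is_compatible(required: str, provided: str) -> bool:
    """Return True if an app built against `required` can run on `provided`.

    Assumed rule (conventional semantic versioning):
      - a different major version may break existing interfaces
      - a higher minor version only adds interfaces
      - a higher patch version only fixes bugs
    """
    need = parse_version(required)
    have = parse_version(provided)
    same_major = have.major == need.major
    new_enough = (have.minor, have.patch) >= (need.minor, need.patch)
    return same_major and new_enough


# An app written for the 3.0 interfaces keeps working on 3.0.1 and 3.1.
print(is_compatible("3.0", "3.0.1"))
print(is_compatible("3.0", "3.1"))
# An app that needs an interface first added in 3.1 cannot run on 3.0.
print(is_compatible("3.1", "3.0"))
# A v2.8 plugin is not covered by the 3.x guarantees.
print(is_compatible("2.8", "3.0"))
```

Under this rule, an app that needs a newly added interface has to wait for the minor release that introduces it, which is why the pace of minor releases matters to app developers.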
Mitigating strategies include deliberately hastening the pace of interface-augmenting releases and engaging app developers to aggressively feed interface requests to the team – possibly at the expense of core development.
Notwithstanding the enormous benefits of the architectural refactoring, critical Cytoscape subsystems (e.g., user interface and apps) remain tightly coupled. This coupling threatens (at best) to recapitulate the tangled relationships that triggered the refactoring or (at worst) to make the replacement, scaling, or reuse of these subsystems problematic. Eventually, this threatens the evolvability of Cytoscape to serve scientific interests in relevant timeframes. Mitigating strategies include focused refactoring of key subsystems along SOA (service-oriented architecture) or COA (component-oriented architecture) principles to expose and separate distinct concerns. This type of refactoring can occur while implementing a given use case, and can then be leveraged to benefit subsequent, related use cases.
Patents, Licenses, Inventions, and Copyrights
None. We are committed to an Open-Source dissemination policy.
Training and Outreach
Annual Cytoscape Retreat
The annual Cytoscape Workshops and Symposium was hosted by the National Resource for Network Biology (NRNB) at the Gladstone Institutes on the UCSF Mission Bay campus in San Francisco during this reporting period. In addition to developer meetings, the event included user and new developer tutorials, a Plugin/App Expo, a special Network Biology symposium,
  25. 25. and our EAC meeting. The meeting was a huge success, with capacity attendance for the user tutorial and very positive survey responses from attendees.
Workshops
For the reporting period, NRNB has participated in a total of 13 training events in multiple countries. These events include tutorials, workshops and courses. Cytoscape is taught in many classroom and workshop settings. We try to track all of these on our website and Event Tracker. We’ve identified 37 courses offered in the 2012-2013 calendar year! And these are just the ones affiliated with NRNB staff.
Open Tutorials
Our tutorial management system, Open Tutorials, is still the main source for tutorial materials for the Cytoscape project, and is being used both internally by presenters, and by researchers and developers. Visits to Open Tutorials have continued to increase over the last year, with an average of 3,750 visits/month, as compared to 2,700 visits/month for the previous reporting period. More than half of all visits (57%) are from new visitors. We estimate that the increase in traffic is mainly from users, as we have had only two new editors in the same period. Tutorial development during the past year was focused on a set of user tutorials for Cytoscape 3.0, covering the most common use cases and describing the user interface and new welcome screen. We plan to add several additional user tutorials over the next 6 months. Overall, Open Tutorials has allowed NRNB to reach our goal of providing tutorial support to a broad and diverse community.
Social Media
We have initiated a social media effort for Cytoscape through a number of different tools (http://www.cytoscape.org/community.html). For example, a Twitter account is used for quick announcements (http://twitter.com/cytoscape) and YouTube is utilized for video tutorials (http://www.youtube.com/results?search_query=cytoscape). During this reporting period we continued the popular Tumblr site to capture published figures using Cytoscape.
Pairs of figures are posted on a weekly basis on the front page of cytoscape.org based on this Tumblr feed. We now regularly get authors submitting their recent publications to us, asking us to feature them via our Tumblr site. This is directly helping to promote the use and citation of Cytoscape.
Google AdWords
We were awarded a non-profit account in the Google AdWords program. We are managing 8 Ad Group campaigns consisting of over 880 keywords and phrases. Last month alone we received over 7,000 clicks on these ads to our NRNB sites. These activities are worth over $8,800 a month (a 550% increase over last year), which we are getting free-of-charge. We have a spending limit of $329 per day through this program, a potential value of $120,000 per year, so we will continue to identify new ads and relevant resources.
Google Summer of Code and NRNB Academy
In addition to the outreach effort described above, we also leverage a Google-sponsored program called Google Summer of Code to attract new developers. This year we are coordinating 30 mentors, leveraging the effort of developers from open source communities surrounding NRNB-related tools. Last summer through the GSoC program we received over 60
  26. 26. student applications. From these we selected 16 students to mentor on Cytoscape and NRNB-related projects. All 16 projects passed and completed the summer successfully! Google paid $5,000 per student, making their investment in NRNB $80,000 for 3 months of work.
Inspired by this very successful model for recruiting new code contributors, we designed and launched NRNB Academy last year. Through NRNB Academy, we offer anybody the opportunity to work with our open source development team on network biology related tools and resources. The program offers a framework for training by providing project ideas and by pairing participants with mentors. It is completely volunteer-based and offers participants flexible project terms. Since its launch in January 2011, we have had 14 requests from participants, and we currently have 4 students enrolled. The first graduate completed their project in September 2012. In addition to ongoing student projects, the program has also resulted in one collaboration and continues to be a source for project ideas and mentors for our GSoC effort. Based on our experience so far, this program is not only effective in producing useful tools and resources, but it also serves as a mechanism to increase long-term development collaborations. Our first graduating student continues to be involved as a contributor, and two of the ongoing students are involved in longer-term ongoing projects as well.
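The Google-related dollar figures quoted in this section (the GSoC investment and the potential annual AdWords value) follow from simple arithmetic, sketched below. The variable names are ours, and the 365-day year is an assumption, since the report does not say how the annual AdWords figure was derived.

```python
# Figures quoted in the report.
gsoc_students = 16
gsoc_payment_per_student_usd = 5_000
adwords_daily_limit_usd = 329

# Assumption: the annual AdWords value is the daily limit over a 365-day year.
DAYS_PER_YEAR = 365

gsoc_investment_usd = gsoc_students * gsoc_payment_per_student_usd
adwords_annual_cap_usd = adwords_daily_limit_usd * DAYS_PER_YEAR

print(f"GSoC investment: ${gsoc_investment_usd:,}")        # $80,000
print(f"AdWords annual cap: ${adwords_annual_cap_usd:,}")  # $120,085
```

The second result rounds to the $120,000 per year stated for the AdWords program.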
  27. 27. Annual Progress Report - Research Highlights 2013
National Resource for Network Biology
P41 GM103504
05/01/2012 - 04/30/2013
Contents
● Network Approach to Building Gene Ontologies
● First Release of Cytoscape 3.0 and the Cytoscape App Store
● NRNB Google Summer of Code Program Reaches New Levels
Network Approach to Building Gene Ontologies
Ontologies are of key importance to many domains of biological research. The Gene Ontology (GO), in particular, has been instrumental in unifying knowledge about biological processes, cellular components, and molecular functions through a hierarchy of concepts and their interrelationships. However, given only partial biological knowledge and inconsistency in how this knowledge is curated, it has been difficult to construct, extend and validate GO in an unbiased manner. We have recently shown that the collection of high-throughput network maps now becoming available can be analyzed to automatically assemble an ontology of gene function that rivals manually curated efforts [1]. Our systematic computational approach combines evidence from physical, genetic and transcriptional networks to produce an ontology comprising 4,123 biological concepts and 5,766 hierarchical concept relations. Using a new ontology alignment procedure, we found that the network-based ontology captures the majority of known cellular components and identifies approximately 600 new cellular components and component relations – many of which we were able to validate either experimentally or bioinformatically. By working closely with the GO curators, we were able to incorporate selected new components and relations into the Gene Ontology, thus providing proof-of-principle for how to systematically update and revise the GO structure based on large-scale data.
1. Dutkowski, J., Kramer, M., Surma, M.A., Balakrishnan, R., Cherry, J.M., Krogan, N.J., and Ideker, T., A gene ontology inferred from molecular networks. Nat Biotechnol, 2013. 31(1): p. 38-45.
First Release of Cytoscape 3.0 and the Cytoscape App Store
The overall mission of Cytoscape is to be a freely available worldwide asset supporting network analysis and visualization for systems biology science. Cytoscape Version 3.0 (v3.0) was released for unrestricted public use on February 1, 2013. It represents an evolution of v2.x resulting from a two-year collaboration of a multinational, multi-institution team of programmers and biologists. The major focus of v3.0 is the modularization and rationalization of code to solve stability issues in v2.x encountered as multiple developers pursued multiple agendas. Version 3.0 addresses these issues by adopting modular coding practices promoted by the OSGi
  28. 28. architectural framework. This enables both the Cytoscape core and externally developed apps (formerly called plugins) to evolve independently without compromising unrelated functionality.
Since 2012, weekly visits (outside of holidays) have increased. The Cytoscape v3.0 web page was first put up in October 2012. The trends since the February, 2013 release are too new to yield conclusions, though it seems that visits have measurably increased. Visits to the Cytoscape download page have remained somewhat constant over time, though seem to have increased since v3.0’s February 2013 release.
To help address the needs of users, we launched the Cytoscape App Store (http://apps.cytoscape.org) to coincide with the release of Cytoscape 3.0, a major re-architecting of Cytoscape for improved stability, performance, and versatility. The overarching goals of the Cytoscape App Store are to highlight the important features apps add to Cytoscape, to enable researchers to find apps they need, and to let developers promote their apps. For each Cytoscape 3.0 app, the App Store supports unique features like one-click install and comprehensive download statistics.
The App Store opened for business on June 1, 2012. Since then, it has received over 33,000 visits from users worldwide. Except during the holiday season, traffic to the App Store has grown consistently. By March, 2013, weekly visitors numbered between 1,100 and 1,300. Through March, 2013, a total of 33,596 visits were received.
The App Store is already playing a broader role in the Cytoscape community than just a place for browsing and submitting apps. For instance, we held a competition for the best Cytoscape 3.0 apps in December 2012. The first prize was shared by ClueGO, which visualizes the relationship between gene ontology terms, and DynNetwork, which visualizes networks with time-based movement. We plan to host more competitions in the future to encourage Cytoscape 3.0 app development.
Apps and the app developer community play a critical role in the success of Cytoscape, ensuring its continued relevance and reach as the field of network biology evolves. The new Cytoscape App Store aims to increase the visibility and accessibility of apps, providing support to both Cytoscape users and app developers. We anticipate that traffic will continue to increase as apps – and the App Store – become more prominent in the Cytoscape community.
NRNB Google Summer of Code Program Reaches New Levels
Last summer through the Google Summer of Code (GSoC) program we received over 60 student applications. From these we selected 16 students to mentor on Cytoscape and NRNB-related projects. All 16 projects passed and completed the summer successfully! This is almost double the number of students we mentor through GSoC in a typical year and puts NRNB in the top 10 supported organizations out of 180 open source orgs accepted into the Google program. Google paid $5,000 per student, making their investment in NRNB $80,000 for 3 months of work.
Inspired by this very successful model for recruiting new code contributors, we designed and launched NRNB Academy last year. Through NRNB Academy, we offer anybody the opportunity to work with our open source development team on network biology related tools and resources. The program offers a framework for training by providing project ideas and by pairing participants with mentors. It is completely volunteer-based and offers participants flexible
  29. 29. project terms. Since its launch in January 2011, we have had 14 requests from participants, and we currently have 4 students enrolled. The first graduate completed their project in September 2012. In addition to ongoing student projects, the program has also resulted in one collaboration and continues to be a source for project ideas and mentors for our GSoC effort. Based on our experience so far, this program is not only effective in producing useful tools and resources, but it also serves as a mechanism to increase long-term development collaborations.
  30. 30. Summary Continued advances in high-throughput experimental technologies release enormous amounts of interaction data into the public domain. Analysis of these interactions – and the networks they form – relies in large part on robust bioinformatics technology. The mission of the NRNB (nrnb.org) is to develop and support a suite of bioinformatics tools that broadly enable the study of network biology. In our third year as a resource, we have significantly advanced our goals through basic research, collaboration, dissemination of software tools, and community support. Here, we describe our progress in research, both basic and collaborative. This progress includes the use of network modules for patient diagnostics; tools that use ontologies to enable new network analyses and visualizations; tools that generate ontologies from networks; novel investigations at the interface of social networks and health; and major new releases of our Cytoscape platform and App Store. Each progress report below specifies the associated personnel and FTEs funded by the NRNB grant. In terms of our own research, NRNB enables a stable effort from each of the resource member sites, ranging from 0.20 to 1.08 FTEs. Many of these TRD projects leverage effort from other grants and funding mechanisms as well in order to maximize the return on investment. Nevertheless, without NRNB support, these projects would be significantly diminished, if not discontinued, and would lack the cohesion and synergy provided by a network biology resource (see reports #1-7 below). In terms of the services, training and dissemination, the impact of the NRNB resource is clear. 
Specifically, this resource is responsible for: the extra effort needed to drive our mailing list response rate to over 90% (see Administrative Information report); the Open Tutorials system for collecting, maintaining and serving tutorial materials; the administration of NRNB’s participation in Google Summer of Code and our own NRNB Academy (see report #9 below); the organization of the annual Network Biology SIG and Cytoscape Workshops; and the new Cytoscape App Store, which has catalyzed Cytoscape user and developer communities (see report #10 below). These efforts are maintained by the 0.5 FTE executive director and 0.25 FTE communications coordinator roles defined and funded by NRNB.
And finally, NRNB has wide-ranging impact on biomedical research, both nationally and internationally, through its collaboration projects. NRNB member sites were collectively maintaining an estimated two-dozen collaborations prior to the formation of this Resource. During the first year, we established close to 40. For the past two years, NRNB has maintained 80-100 collaboration projects. These projects range from the application of Cytoscape as a research tool for network analysis and visualization, to the development of Cytoscape plugins for custom data types and analyses, to the development and application of other network and pathway tools and resources for network biology (see report #8 below). This activity is a direct result of NRNB roles for executive director, communications coordinator and collaboration coordinator (0.63 FTE).
We’ve come a long way in just three years, and NRNB is still maturing. With continued support, we are committed to maintaining and growing these efforts as a Resource for the network biology community.
  31. 31. Contents I. Technology Research and Development: Progress and Applications References and figures are provided for each project and numbered independently. This year, per the direction of our EAC, we are using a 4-Stage model to provide a common context in describing the wide variety of technologies being developed in both our TRD and Collaboration projects. You will see references to "(Stage 2)", for example. The 4-Stage model is described and illustrated at the beginning of the next section (II. Collaboration, Table 1). 1. A Gene Ontology Extracted from Molecular Networks (Ideker) 2. Network Analysis Tools for Cancer Genomics (Sander) 3. Network Analysis Methods for Inferring Causality in Signaling Networks (Sander) 4. Using Cytoscape for Social Network Research (Fowler) 5. Cytoscape 3.0 and CytoscapeWeb for the Visualization and Representation of Biological Networks (Bader) 6. Analyzing Complex Networks Using Ontologies and Cytoscape 3.0 (Pico) 7. The CYNI Modular Network Inference Framework (Schwikowski) II. Collaboration and Service Projects: Progress In addition to the direct impact of our TRD projects on our research, NRNB also impacts new science through our many CSPs. A description for each CSP is provided in the bulk of the report. Here, we summarize the scope of our collaborations and provide a new 4-Stage model and illustration to convey the range of our efforts as well as progress from year-to-year. Major service projects are also described in this section. 8. Collaboration Landscape 9. Google Summer of Code and NRNB Academy III. Progress on Supplemental Award, 2011-2013 We were awarded a two-year supplemental grant to work on the Cytoscape App Store. This is a progress report on the second year. 10. The Cytoscape App Store (Pico, Bader) Appendix A. The 2012 NRNB Network A full-page view of this year’s network representation of NRNB.
  32. 32. I. Technology Research and Development: Progress and Applications References and figures are provided for each project and numbered independently. This year, per the direction of our EAC, we are using a 4-Stage model to provide a common context in describing the wide variety of technologies being developed in both our TRD and Collaboration projects. You will see references to "(Stage 2)", for example. The 4-Stage model is described and illustrated at the beginning of the next section (II. Collaboration, Table 1). 1. A Gene Ontology Extracted from Molecular Networks (Ideker, 0.5 FTE: Janusz Dutkowski) Ontologies are of key importance to many domains of biological research. The Gene Ontology (GO), in particular, has been instrumental in unifying knowledge about biological processes, cellular components, and molecular functions through a hierarchy of concepts and their interrelationships. However, given only partial biological knowledge and inconsistency in how this knowledge is curated, it has been difficult to construct, extend and validate GO in an unbiased manner. We have recently shown that the existing collection of high-throughput network maps, which are now becoming available, can be analyzed to automatically assemble an ontology of gene function that rivals manually curated efforts [1]. Our systematic computational approach (Fig. 1) combines evidence from physical, genetic and transcriptional networks to produce an ontology comprising 4,123 biological concepts and 5,766 hierarchical concept relations (Fig. 2). Using a new ontology alignment procedure (Fig. 1), we found that the network-based ontology captures the majority of known cellular components and identifies approximately 600 new cellular components and component relations, many of which we were able to validate either experimentally or bioinformatically. 
By working closely with the GO curators, we were able to incorporate selected new components and relations into the Gene Ontology, thus providing proof-of-principle for how to systematically update and revise the GO structure based on large-scale data (Stages 1 & 2). The network-extracted ontology (NeXO) is a new resource for systems and synthetic biology, i.e. a data-driven catalogue of cellular machinery, from genes, to complexes, to pathways and higher-order processes. It provides a powerful tool for performing multi-scale analysis of biological networks, including automatically identifying, annotating and visualizing the complete hierarchical structure. We also show how integrating the ontology with additional high-throughput datasets leads to identification of new components and processes altered in human disease. Based on our results, we suggest a new role for ontologies in bioinformatics: rather than merely being used as a gold standard for performing functional enrichment, ontologies should serve as evolvable models that are validated, revised, and expanded based on new genomic data. Moving forward, it will be interesting to see how the network-extracted ontology can be extended further. For instance, while NeXO represents a rigorous approach to capturing ontology terms and term relations, the ability to systematically annotate the type of relation that occurs between terms (e.g. “is a”, “part of”, “regulates”) poses a separate and very interesting challenge. An
  33. 33. in-depth investigation is needed to assess which network properties are best at separating the different types of relations, and whether there are additional data sets that might be brought to bear on this problem (Stage 3). Similarly, while NeXO identifies the majority of known cellular components, it will be interesting to investigate further what types of network data could be used to increase the coverage of biological processes and molecular functions. Finally, a key question is whether enough high-quality data exist to build NeXO ontologies for other species, particularly human, and whether it is better to structure a common ontology for all species, as has been done in GO, or to focus on individual species-specific ontologies. Figure 1. Automated assembly and alignment of gene ontologies. (A) Probabilistic community detection within the input networks yields a binary tree in which nodes correspond to ontology terms and links correspond to parent-child term relations. Unsupported terms are replaced by multi-way joins, and additional parent-child relations are added based on network data. The resulting ontology is aligned against the Gene Ontology, in a way that (B) prohibits non-unique mappings and ancestor-descendant criss-crossing.
  34. 34. Figure 2. The NeXO ontology is shown as a tree, with nodes indicating terms and edges indicating hierarchical relations between terms, i.e. that one term contains another. Node sizes indicate the number of genes assigned to a term. Node colors represent the degree of correspondence to a term in GO as determined by ontology alignment, with high-level alignments labeled. Insets show the hierarchy identified for the ribosome and actin cytoskeleton. References 1. Dutkowski, J., Kramer, M., Surma, M.A., Balakrishnan, R., Cherry, J.M., Krogan, N.J., and Ideker, T., A gene ontology inferred from molecular networks. Nat Biotechnol, 2013. 31(1): p. 38-45. 2. Network Analysis Tools for Cancer Genomics (Sander, 0.62 FTE: Ben Gross) This project is focused on building network analysis tools for interpreting high-throughput cancer genomic data sets to identify altered disease networks and enable the identification of network-based biomarkers in cancer. Our primary focus is building user-friendly, open source tools for visualizing and analyzing multidimensional cancer genomic data sets (including copy number, mutation, and mRNA expression) in the context of known biological pathways and interaction networks, and making these tools broadly available to clinical, experimental and computational investigators within the cancer research community. Providing such tools to the cancer research community is critical, as numerous large-scale projects, including the Cancer Genome Atlas (TCGA) project and the International Cancer Genome Consortium (ICGC), are profiling dozens of cancer types and subtypes. Identifying altered pathways and networks within each of these cancer types remains a critical and open challenge. During our first several years of NRNB funding, we completed a prototype project for displaying multi-dimensional cancer genomic data in the context of molecular interaction networks. We
  35. 35. chose to implement the prototype in CytoscapeWeb [1], as CytoscapeWeb requires neither additional software installation nor Java Web Start. It therefore significantly lowers the barrier to use, particularly for biologists and clinical researchers, two of our main target user groups. We transitioned our tools from prototype to production mode (Stage 3), and have made our software available to the entire cancer research community. Cancer researchers are now using these tools to perform network analysis on up to 20 different cancer types, including TCGA-funded projects, such as glioblastoma multiforme (GBM) [2] and serous ovarian cancer [3] (Stage 4). The cBioPortal for Cancer Genomics code base has recently reached a stable state, and it is now being considered as a general framework on which to build our other NRNB-related tools. Our recently finalized drug-target data support in the context of cBioPortal’s network analysis is one such example. During the past year, we improved the network analysis capabilities of the cBioPortal by providing query and visualization of aggregated drug data from multiple resources. With this new feature, the portal currently contains gene-centric drug-target information from the following resources: DrugBank [8], KEGG Drug [9], NCI Cancer Drugs (http://www.cancer.gov/cancertopics/druginfo/alphalist), and Rask-Andersen et al. [10]. Within the network analysis view, drugs are hidden by default, but can be added to the network via the Genes & Drugs menu on the right side of the screen. Users now have the option of displaying FDA-approved drugs, cancer drugs defined by NCI Cancer Drugs, or all drugs targeting the query genes. 
For example, when the user queries for the gene EGFR in the portal, we not only show the network context of this gene, but also provide information about the drugs targeting the product of this gene: gefitinib and erlotinib are tyrosine kinase inhibitors that target the catalytic domain of EGFR, and cetuximab and trastuzumab are monoclonal antibodies that target the extracellular domain of EGFR and ERBB2, respectively (Fig. 1) [11].
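The drug display options described above can be illustrated with a small sketch. This is not the cBioPortal's data model or API: the record layout, the field names `fda_approved` and `cancer_drug`, the function `drugs_for_genes` and the entry `compound-X` are all hypothetical, introduced only for illustration. The four real drug-target pairs are the ones named in the EGFR example, and their flag values are placeholder assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrugTarget:
    drug: str
    gene: str
    fda_approved: bool  # hypothetical flag, assumed for illustration
    cancer_drug: bool   # hypothetical flag, assumed for illustration


# Illustrative records only; "compound-X" is a made-up placeholder entry.
RECORDS = [
    DrugTarget("gefitinib", "EGFR", True, True),
    DrugTarget("erlotinib", "EGFR", True, True),
    DrugTarget("cetuximab", "EGFR", True, True),
    DrugTarget("trastuzumab", "ERBB2", True, True),
    DrugTarget("compound-X", "EGFR", False, False),
]


def drugs_for_genes(query_genes, mode="none", records=RECORDS):
    """Return {gene: sorted drug names} for one display mode.

    mode is one of "none" (drugs hidden, the default), "fda",
    "cancer", or "all", mirroring the menu options in the text.
    """
    keep = {
        "none": lambda r: False,
        "fda": lambda r: r.fda_approved,
        "cancer": lambda r: r.cancer_drug,
        "all": lambda r: True,
    }[mode]
    result = {gene: [] for gene in query_genes}
    for record in records:
        if record.gene in result and keep(record):
            result[record.gene].append(record.drug)
    return {gene: sorted(drugs) for gene, drugs in result.items()}
```

Calling `drugs_for_genes(["EGFR", "ERBB2"], mode="fda")` groups the approved drugs by query gene, while the default mode returns empty lists, matching the statement that drugs are hidden until the user adds them.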
  36. 36. Figure 1: Improved Network tab: Network analysis of epidermal growth factor receptor networks in serous ovarian cancer. (A) Network view of the EGFR and ERBB2 neighborhood in serous ovarian cancer (TCGA data set) rendered by Cytoscape Web. EGFR and ERBB2 are query genes (thick border), and nearest neighbor genes are color coded by their alteration frequency in ovarian cancer. One can display drugs that target EGFR or ERBB2 (hexagons, orange if FDA approved), as well as details about genomic alterations and links to external resources (lower left panel, example MYC). (B) The portal overlays multidimensional genomic data (copy number, mutation, and mRNA expression) onto all nodes in the network. (C) Edges can represent different interaction types (color-coded, such as “reacts with”). (D) Options for filtering, cropping and searching the network of interest. Our new drug-target feature is now available as part of the open-access cBioPortal and is helping cancer researchers explore therapy options within the network context of genes of interest (Stage 4). Outreach Plans Since its launch in mid-2010, the cBioPortal has been extensively used by cancer researchers around the globe, particularly by The Cancer Genome Atlas (TCGA) network. The portal currently attracts more than 1,500 unique visitors per week. In order to help researchers use cBioPortal in their studies, we are actively communicating with various communities, such as the TCGA network, and publicizing the tool through different channels. During the last year, we have adopted and are currently maintaining an e-mail list for users who have questions regarding the use of the cBioPortal. This e-mail list and the questions answered by our group are publicly available at our Google Groups page (http://groups.google.com/group/cbioportal/). Furthermore, we have recently completed a manuscript that explains the use cases of cBioPortal and its network analysis feature in detail
  37. 37. in order to encourage wider adoption. We believe this publication (Science Signaling) will help researchers interested in cancer research to use the portal more efficiently. We also participated in last year’s Google Summer of Code (GSoC) Program with two separate projects under the NRNB organization. The first project, a Cytoscape 3.0 Application to facilitate downloading cancer genomics data through the cBioPortal Web API services, was successfully led by Dazhi Jiao under the advisement of two members from our group. This Cytoscape 3.0 application allows users to download data from cBioPortal, visualize it in the network context either in an overall or sample-specific manner, and analyze it with the help of additional Cytoscape 3.0 applications (see Figure 2). The source code for this project is freely available at our Google Code project web site (http://bit.ly/cbioportal). The software implementation for this project is currently being finalized (Stage 3) and we are planning to distribute this application through Cytoscape’s App Store interface in the next year (Stage 4). Figure 2: A screenshot of the Mondrian application, an open-source project conducted as part of the Google Summer of Code 2012 program. The image shows how genomics data, downloaded from the cBioPortal through this application, is overlaid onto the user’s network of interest. Once the data is loaded from the cancer studies of interest through the cBioPortal’s Web API, users have the option to explore multi-dimensional cancer-related data within the Cytoscape framework in a fashion that is similar to cBioPortal’s network analysis feature. Our second GSoC project was led by the summer student Istemi Bahceci under the co-advisement of one member of our group in conjunction with our NRNB collaborator Ugur Dogrusoz at Bilkent University. 
The aim of this project was to extend CytoscapeWeb to support the Systems Biology Graphical Notation (SBGN) for more detailed biological pathway visualization. This project was completed over the last summer and we are currently
  38. 38. integrating it into the cBioPortal to provide better network analysis options for users (Stage 4, please see the following section). New Driving Biological Projects In the next year, we anticipate improving the network analysis feature in two ways: 1) detailed visualization of the pathways and reactions in the network view; 2) inference of indirect drug targets, for potentially interesting therapy options, by using genomic alteration and drug-target data. Currently, interaction types that are shown in the network analysis view are derived from the BioPAX to SIF inference rules [7]. For example: In Same Component indicates that Genes A and B are involved in the same biological component, such as a complex; State Change indicates that Gene A causes a state change, such as a phosphorylation change within Gene B. This reduction from BioPAX to SIF was necessary because the Cytoscape Web framework, at the time, did not support visualization of more complex elements, such as compartments. With the technology being developed as part of the CSP-100 project (Gary Bader), it has recently become feasible to visualize biological networks in a more detailed way, therefore enabling the use of Systems Biology Graphical Notation (SBGN) for better representation of BioPAX. As part of our NRNB collaboration with Ugur Dogrusoz (Bilkent University, Turkey), we are aiming to adopt SBGN-compliant views to visualize multi-dimensional cancer genomics data within the network context (see Figure 3). This project has recently been implemented as a proof-of-concept prototype and is now being integrated into cBioPortal (Stage 3 -> 4). When complete, this new feature will allow better presentation of proteomics data (e.g. Reverse Phase Protein Array data provided as part of the TCGA network) by allowing users to optionally switch from a gene-centric to a protein-centric view. Figure 3: Proposed additions to the current simple network view. 
On the left is the Simple Interaction Format (SIF) view derived from BioPAX; on the right is an example visualization of a BioPAX network obtained from Pathway Commons. The latter uses the new Systems Biology Graphical Notation (SBGN) visualization capabilities of the CytoscapeWeb project. The SBGN view provides a more detailed representation
  39. 39. of the pathway and also provides a protein-centric view with proteomics data mapped to specific proteins or phospho-proteins. In the next year, we are also planning to utilize genomic alteration and pathway data to infer clinically relevant uses of drug-target data. For this, we intend to use down- and up-stream relationships between genes to suggest drugs of possible interest that can indirectly target a particular genomic alteration event in cancer samples (see Figure 4). One historical example of such a case is the use of AKT inhibitors in patients who bear a homozygous PTEN deletion. Without the gene PTEN and its product, Akt proteins, which are down-stream of PTEN, cannot be suppressed, and therefore are found to be up-regulated in cancer samples that have the homozygous PTEN deletion. In the presence of an AKT inhibitor, this up-regulation effect can be counteracted. A similar example of this concept is the use of CDK4/6 inhibitors when CDKN2A is either mutated or homozygously deleted in cancer cells. Pathway resources, such as Pathway Commons, already provide this type of relationship between genes, and we plan to extract this information in a systematic way and combine it with the drug-target data in order to infer such therapy options automatically within the cBioPortal framework. This method and the prototype are currently under development (Stage 2). Figure 4: Conceptual framework for inferring novel drug-based therapy options based on a specific genomic alteration with the use of pathway context, e.g. use of AKT inhibitors when PTEN is altered in the tumor. 3. Network Analysis Methods for Inferring Causality in Signaling Networks (Sander, 0.62 FTE: Ben Gross) The goal of our second TRD project is to develop network analysis tools that algorithmically infer causality within signaling networks and make these tools available. 
High-throughput screens conducted with libraries of small molecules or inhibitory RNAs have the ability to identify compounds that induce tumor suppressive responses in cancer cells [12]. While the effects of such perturbations can be easily linked to transcriptional changes, identifying the causal mechanism is a main challenge. In collaboration with Somwar and colleagues [13], we
  40. 40. used a computational approach to predict the target of a small molecule inducing reduced growth in lung adenocarcinoma cell lines. Interestingly, experimental follow-up confirmed the prediction. Building on this concept, we have been working on computational approaches to model causal signaling cascades inducing observed transcriptional changes within perturbed cancer cell lines. We have been exploring the use of optimization algorithms adapted from statistical physics to identify the minimal set of interactions able to connect genes that are differentially expressed after a perturbation with candidate targets of the same perturbation (Stage 2). This initial approach relied on an algorithm that solves the Steiner-tree problem. Given a set of “terminal” nodes, the Steiner-tree is defined as the tree of minimum weight connecting these terminals, allowing the inclusion of additional nodes. Differentially expressed genes after a perturbation and/or candidate targets of the same perturbation can be used as terminals. Our prediction was that the resulting Steiner-tree could therefore contain both gene interactions able to explain the observed transcriptional changes and the putative target of the perturbation. Within this past year, we determined that this approach does not work as well as expected, and are now in the process of exploring a new algorithmic framework that combines Gaussian graphical models with maximum entropy methods. New Driving Biological Projects A new biological driver for deriving causality networks is inferring causal relationships within data types and between data types, such as copy number changes and cancer genomics. For example, we would like to investigate the relationship between mutations in the TP53 tumor suppressor and the complex copy number profile in ovarian cancer. Another example is the exploration of causal relationships between gene mutations. 
For example, mutations in the POLE gene lead to a characteristic spectrum of mutations in other proteins. We have preliminary results and plan to develop a network analysis approach to identify causal relationships. We are also considering looking at interactions between microbial subpopulations, starting with the gut microbiome, where a set of interacting bacterial populations changes under fluctuating constraints provided by the host and nutrient intake. Recent work has shown that the precise composition and evolution of this population is closely coupled to the state of health of the host. Certain deviations from equilibrium present a significant risk of invasion by pathogenic bacteria, as seen with some cancer patients receiving bone marrow transplantations [23]. A more detailed understanding of the relationships between gut microbial subpopulations following such aggressive treatments in the host could inform therapeutic development leading to improved outcomes. Our Related Publications • Gao J, et al, Integrative Analysis of Complex Cancer Genomics Profiles using the cBioPortal. Science Signaling Protocol (in press).
  41. 41. • Molinelli* E, Korkut* A, Wang* W, Miller M, Gauthier N, Jing X, Kaushik P, et al. Perturbation Biology: inferring signaling networks in cellular systems. PLoS Comp Bio (in review). • Cancer Genome Atlas Network. Comprehensive molecular portraits of human breast tumours. Nature. 2012; 490(7418):61-70. • Cerami E, et al, The cBio Cancer Genomics Portal: An open platform for exploring multi-dimensional cancer genomics data. Cancer Discovery. May 2012, 2:401. • The Cancer Genome Atlas Network, Comprehensive Molecular Characterization of Human Colon and Rectal Cancer. Nature 2012; 487(7407):330-337. • The Cancer Genome Atlas Network, Comprehensive genomic characterization of squamous cell lung cancers. Nature 2012; 489:519-525. References 1. Lopes CT, Franz M, Kazi F, Donaldson SL, Morris Q, Bader GD: Cytoscape Web: an interactive web-based network browser. Bioinformatics, 26(18):2347-2348. 2. TCGA: Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature 2008, 455(7216):1061-1068. 3. Integrated genomic analyses of ovarian carcinoma. Nature 2011, 474(7353):609-615. 4. Keshava Prasad TS, Goel R, Kandasamy K, Keerthikumar S, Kumar S, Mathivanan S, Telikicherla D, Raju R, Shafreen B, Venugopal A et al: Human Protein Reference Database--2009 update. Nucleic acids research 2009, 37(Database issue):D767-772. 5. Matthews L, Gopinath G, Gillespie M, Caudy M, Croft D, de Bono B, Garapati P, Hemish J, Hermjakob H, Jassal B et al: Reactome knowledgebase of human biological pathways and processes. Nucleic acids research 2009, 37(Database issue):D619-622. 6. Schaefer CF, Anthony K, Krupa S, Buchoff J, Day M, Hannay T, Buetow KH: PID: the Pathway Interaction Database. Nucleic acids research 2009, 37(Database issue):D674-679. 7. Cerami EG, Gross BE, Demir E, Rodchenkov I, Babur O, Anwar N, Schultz N, Bader GD, Sander C: Pathway Commons, a web resource for biological pathway data. Nucleic acids research, 39(Database issue):D685-690. 
8. Knox C, Law V, Jewison T, Liu P, Ly S, Frolkis A, Pon A, Banco K, Mak C, Neveu V et al: DrugBank 3.0: a comprehensive resource for 'omics' research on drugs. Nucleic acids research 2011, 39(Database issue):D1035-1041. 9. Ogata H, Goto S, Sato K, Fujibuchi W, Bono H, Kanehisa M: KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic acids research 1999, 27(1):29-34. 10. Rask-Andersen M, Almen MS, Schioth HB: Trends in the exploitation of novel drug targets. Nature Reviews Drug Discovery 2011, 10:579-590. 11. Raymond E, Faivre S, Armand JP: Epidermal growth factor receptor tyrosine kinase as a target for anticancer therapy. Drugs 2000, 60:41-42. 12. Somwar R, Shum D, Djaballah H, Varmus H: Identification and preliminary characterization of novel small molecules that inhibit growth of human lung adenocarcinoma cells. Journal of biomolecular screening 2009, 14(10):1176-1184. 13. Somwar R, Erdjument-Bromage H, Larsson E, Shum D, Lockwood WW, Yang G, Sander C, Ouerfelli O, Tempst PJ, Djaballah H et al: Superoxide dismutase 1 (SOD1) is a target for a small molecule identified in a screen for inhibitors of the growth of lung adenocarcinoma cell lines. Proceedings of the National Academy of Sciences of the United States of America 2011, 108(39):16375-16380. 14. Stratton MR, Campbell PJ, Futreal PA: The cancer genome. Nature 2009, 458(7239):719-724. 15. Hanahan D, Weinberg RA: The hallmarks of cancer. Cell 2000, 100(1):57-70. 16. Hanahan D, Weinberg RA: Hallmarks of cancer: the next generation. Cell 2011, 144(5):646-674.
  42. 42. 17. Ciriello G, Cerami E, Sander C, Schultz N: Mutual exclusivity analysis identifies oncogenic network modules. Genome research 2012, 22(2):398-406. 18. Vaske CJ, Benz SC, Sanborn JZ, Earl D, Szeto C, Zhu J, Haussler D, Stuart JM: Inference of patient-specific pathway activities from multi-dimensional cancer genomics data using PARADIGM. Bioinformatics 2010, 26(12):i237-245. 19. Vandin F, Upfal E, Raphael BJ: Algorithms for detecting significantly mutated pathways in cancer. Journal of computational biology : a journal of computational molecular cell biology 2011, 18(3):507-522. 20. Turner N, Tutt A, Ashworth A: Hallmarks of 'BRCAness' in sporadic cancers. Nat Rev Cancer 2004, 4(10):814-819. 21. Storrs C: Combing the Cancer Genome. The Scientist 2012, Mar. 22. Taylor BS, Schultz N, Hieronymus H, Gopalan A, Xiao Y, Carver BS, Arora VK, Kaushik P, Cerami E, Reva B et al: Integrative genomic profiling of human prostate cancer. Cancer cell 2010, 18(1):11-22. 23. Jenq RR, Ubeda C, Taur Y, Menezes CC, Khanin R, Dudakov JA, Liu C, et al. Regulation of intestinal inflammation by microbiota following allogeneic bone marrow transplantation. J Exp Med. 2012;209(5):903-911. 4. Using Cytoscape for Social Network Research (Fowler, 0.2 FTE: James Fowler) In addition to the Network Correlation plugin developed in collaboration with Alex Pico's group last year, we have now also used Cytoscape to study the network of interactions in the olfactory system for a manuscript on “Friendship and Natural Selection” (in review) (Stage 1). A theory paper predicting what we observe here was published last year [1]. The target audience for this work is other social network scholars and people interested in applying these techniques to social network data. In our friendship and natural selection paper we show that friends are more genetically related than strangers, roughly to the degree of fourth cousins. 
Any project that takes into account population structure might therefore also consider structure induced by friendship. And if there are gene characteristics available in Cytoscape, we could apply the Network Correlation plugin to see how far in gene-gene interaction networks these characteristics tend to correlate (Stage 2-4). In terms of Cytoscape integration with our new work, it would be great to have a database of natural selection scores available for each gene in human studies, so scholars could easily visualize what parts of their network are under recent natural selection. I have Pardis Sabeti’s Composite of Multiple Signals scores for about 3 million SNPs. It would also be useful to have tools for translating from SNPs to genes readily available within Cytoscape. We will work with the Cytoscape team to make natural selection data available and to implement methods for its visualization. This work might relate to the visualization TRDs by the Bader and Pico groups within NRNB. Social Networks and Health We originally proposed using trend motifs as a new statistical method to investigate "Social Networks and Disease". We are no longer working on this because we have come to believe there are other methods that are more suitable. So we are now at Stage 1 in our work on "Social Networks and Health", which has already led to a number of publications:
  43. 43. • Strully KW, Fowler JH, Murabito J, Benjamin EJ, Levy D, Christakis NA. Aspirin Use and Cardiovascular Events in Social Networks, Social Science & Medicine 74 (7), 1125–1129 (March 2012) • O'Malley J, Arbesman S, Steiger DM, Fowler JH, Christakis NA. Egocentric Social Network Structure, Health, and Pro-Social Behaviors in a National Panel Study of Americans, PLoS ONE 7(5): e36250 (May 2012) • Christakis NA, Fowler JH. Social Contagion Theory: Examining Dynamic Social Networks and Human Behavior. Statistics in Medicine 32 (4): 556–577 (February 2013) • Shakya HB, Christakis NA, Fowler JH. Parental Influence on Substance Use in Adolescent Social Networks. Archives of Pediatrics & Adolescent Medicine 166 (12): 1132-1139 (December 2012) • Rudolph AE, Crawford ND, Latkin C, Fowler JH, Fuller CM. Individual and Neighborhood Correlates of Membership in Drug Using Networks with a Higher Prevalence of HIV in New York City (2006-2009), Annals of Epidemiology, forthcoming The target audience for this work includes scholars in public health. I will be teaching a class on networks and we will use existing tools in Cytoscape there that contribute to Stage 4 (broad adoption) for this approach. There are also some non-health-related projects that will be precursors to a new project in which we will match death records to the Facebook data to ascertain social network correlates of health: • Jones JJ, Settle JE, Bond RM, Fariss CJ, Marlow C, Fowler JH. Inferring Tie Strength from Online Directed Behavior. PLoS ONE 8 (2): e52168 (February 2013) • Jones JJ, Bond RM, Fariss CJ, Settle JE, Kramer ADI, Marlow C, Fowler JH. Yahtzee: An Anonymized Group Level Matching Procedure. PLoS ONE 8 (2): e55760 (February 2013) • Bond RM, Fariss CJ, Jones JJ, Kramer ADI, Marlow C, Settle JE, Fowler JH. A 61-Million-Person Experiment in Social Influence and Political Mobilization. Nature 489: 295–298 (13 September 2012) This work might be ideally suited for a supplement grant. 
We would use the Facebook and death data to predict longevity and health factors that influence it (like MI). Next, we would develop and disseminate a Facebook App that anyone can download that will give them health stats based on their data. We could then use Cytoscape to show people their networks and the health risks of their friends and friends’ friends (Stage 4). References 1. Fu F, Nowak MA, Christakis NA, Fowler JH. The Evolution of Homophily. Scientific Reports 2: 845 (13 November 2012) 5. Cytoscape 3.0 and CytoscapeWeb for the Visualization and Representation of Biological Networks (Bader, 0.91 FTE: Christian Lopes, Jason Montojo, Igor Rodchenkov) Technologies developed with NRNB funds Our goal is to develop new technologies for visualization and representation of biological networks. Our grant aims are: Aim 1. Simplifying network views by hierarchically organizing networks and their modules.
