Insight into AstraZeneca's Technology Services.


Presentation given at the Big Data in Pharma Europe conference, London, February 19th 2014. Updated for Enterprise Search Europe Summit, April 29th.

Overview of the innovation approach taken within Technology Services at AstraZeneca, showcasing the approach, 6 examples of pilots and proofs-of-concept, and a case study of how to implement a revolution in search analytics, using R&D as a springboard for the enterprise.

Published in: Technology, Education
  • good deck
  • Hi Fabien - I know that we did some comparisons with TEMIS years ago and it would be good to hear how the product has evolved since then.
  • Nick,
    We have a powerful and innovative solution for life science.
    Please have a look at the i3 Analytics ClinicalTrials Navigator and BioNews Navigator, both of which offer a novel approach to biopharma analytics. i3 Analytics was able to get a glimpse over the horizon to see what new therapeutic strategies are in store for RRMS patients.
    Best regards,

  1. Fostering Collaboration Using Analytics & Real-time Big Data Search: Insight into Technology Services. Nick Brown, Technology Services Lead (EMEA), AstraZeneca
  2. AstraZeneca History: Health Connects Us All. AstraZeneca is a biopharmaceutical company with R&D at its core. Our business is providing innovative, effective medicines that make a real difference to patients. We have grown from agrochemicals and paints to pharmaceuticals and biologics. As we virtualise our R&D activities, working increasingly with external researchers, access to information and how we leverage it are critical to our success. Unfortunately there is no silver bullet and search is still an evolving art; how we innovate and analyse our information will play a huge part in our future.
  3. Culture of Innovation Through Technology Services. Drive technology standards across AstraZeneca by fostering collaborative pilots to simplify the landscape. Nurture ideas that deliver immediate value to the business, leading to step-changes. Create a safe environment to explore innovative ideas closely with the business functions.
  4. Fail Fast Whilst Delivering Business Value. Our technology services team works with novel approaches from start-ups, research labs, biotechnology companies & entrepreneurs using a 3-step model for rapid business value. 5-day proofs-of-concept (PoC) use existing functionality and fail fast if unsuccessful. 3-month pilots run with a senior leader and a single business problem to solve. All pilots must deliver successful value, and if the approach is seen to disrupt with step-wise improvements, we work closely to engage and drive implementation. [Figure labels: 11, 9, 3; PoC, Pilot, Implement]
  5. Mobile Ideation Pilot: Gamification & Lightweight. Co-developed a mobile ideation platform to enable AstraZeneca and its partners to tap into the ideas of the collective workforce. Expanded from PoC to working prototype very rapidly, and there is now keen interest from other areas of the business in leveraging it. If broken down into small pieces, the crowd can help analyse big data sets.
  6. Open Up And Unlock Potential Of Our Big Data. Many more people outside of AstraZeneca have already solved big data challenges, so we piloted multiple crowd-sourcing platforms, such as a community of >100,000 data scientists. Using online competitions, we received solutions from experts in oil & gas, meteorology and mathematics. We made available millions of historical prescription and call pattern data points for one of our major brands, to model and identify key impact metrics of promotional material. Outcomes: identified 3 key metrics taken back into the field; evaluated different models, with an ecosystem approach to crowd-sourcing probably right for us now. We can learn from every industry that faces different big data challenges.
  7. Identifying New Entities Across Big Data Textual Haystacks. [Diagram labels: scientific abstracts; algorithm to identify novel entities; comparison to existing CI databases; identification of potential NME opportunities] Initiated a proof-of-concept with Thomson Reuters to leverage techniques applied in news editorials to identify new potential drug candidates not seen in CI databases. Drug entities are captured by information companies through analyst reports, websites and publications, but candidates from smaller biotechs or academics can be missed. Identified >50 late-stage drug candidates from BRIC-MT with huge potential for our in-licensing teams across AstraZeneca, as they are relatively unknown externally. It’s often the small data hidden in big data that offers the real gems.
  8. Capturing The Real Voice For Cancer Patients. Sponsored the IBM Extreme Blue internship program, where students are challenged to solve a business problem in 10 weeks. The team developed an oncology patient website to initially record patients' real voices, phonetic algorithms, and a mobile Android application to capture GPS and predict sentences that laryngectomy patients would use in real life. The team went on to win the 2013 European Expo within IBM across all projects and were interviewed by Computer Weekly (press release). We are now working with colleagues in oncology to understand the next steps and discussing wider options with local UK charities and foundations. Big data analytics has the potential to help patients in their daily lives.
  9. Case Study: PoC to Implementation. Distributed R&D. Photo Credit:!/image/256556252.jpg
  10. R&D Search Pilots: 3-way head-to-head competition. In Q1 2013, we assessed 20 enterprise search platforms & piloted 3 companies in an internal competition to revolutionise search within R&D. Our tests included indexing 50M documents, semantic tagging, text analytics and building a search-based application with visualisation. Sinequa was selected as the most advanced for big data analytics & real-time search.
  11. R&D Search Platform Implementation. In July 2013, we licensed Sinequa for R&D search with the intention of establishing the hardware platform in Q3 and releasing to R&D in Q4. Capabilities: >120 connectors to unstructured & structured, internal & external data; accurate semantic mark-up with most advanced text-mining capabilities; intelligent, intuitive search that hides advanced & complex search features; insight, analytics & alerts generated across billions of knowledge facts; business intelligence applications built rapidly, including mobile.
  12. Virtual Team Connected By Passion. To build our applications rapidly, we supplement our team with external experts, including running competitions on open innovation platforms like TopCoder.
  13. R&D Search & Analytics: Real-time, Big Data. The 4V’s of R&D Search: Volume (scale of data), Variety (different forms of data), Velocity (analysis of streaming data), Veracity (uncertainty of data). Over 80% of our scientific information is unstructured and distributed in silos across our business. By adopting Big Data approaches, we aim to improve access and our decision making. By 2014: implement R&D Search across iMED, GMD and MedImmune, as a springboard for one enterprise search platform for AstraZeneca. Scope: filter only HQ scientific content; external (publications, patents, trials, grants, news, conference reports); internal (SharePoint, Documentum, fileshares, Oracle, O365, bespoke); daily incremental, real-time news; automatically tagged with 20 scientific vocabularies; deduplication of >100M documents; >20 terabytes of internal content; over 1 billion knowledge connections. For more information on R&D Search, contact Nick Brown.
  14. R&D Search Screenshot. R&D Search queries all internal and external content, using a relevancy algorithm developed to find key scientific documents and leveraging all synonyms under the hood.
  15. Big Data Analytics: Not Just Search & Find. Teams search their rich internal sources but now find relevant documents and any associated drugs, genes, mechanisms, diseases and even people. From the start of our project, our intention was to use a big data engine to turn scientific information into business intelligence through search-based applications.
  16. R&D KOL (Key Opinion Leaders): Visual Insight. Search isn’t just about finding people! In days, we can build visualisations that extract insight, enabling business decisions (e.g. on KOLs) without a single document ever being read.
  17. R&D Intelligence: Powerful Analytics. We built R&D Intelligence to find things you don’t know about! It computes sentence-level co-occurrence between any two entities instantly to spot new opportunities. Fantastic for drug repositioning, finding new life-cycle management ideas and target identification, while enabling scientists to view only the sentence evidence they have rights to.
  18. R&D Experts: Find & Connect within AZ & MedImmune. Find and connect to the key experts on any scientific topic across R&D. Benefits: automatically updated profiles; minimise duplication; increase cross-R&D collaboration; advertise yourself; enables social network analysis.
  19. R&D ChemSearch: Hunt By Chemical Sub-Structures. Users can draw a compound and search for exact, sub-structure or similar structures. Search against hundreds of millions of AZ compounds in the R&D search library; find documents with sub-structures.
  20. R&D Pulse: Alerts To New Content. R&D Pulse aims to give users access to only the latest information (past 2 weeks) across all internal and external content. Users can click to view a story instantly, set up daily or weekly alerts, and use common search strategies. They can also star favourite articles to come back to or read later.
  21. R&D Search On The Move: Mobile Client. Accessible via Amazon Web Services using PingFederate (authentication) and DataPower (access & exchange), to enable mobile applications to query our big data search index. This makes our cloud services lightweight and quick, but elastic enough to expand as demand increases. Our business applications are typically built in responsive HTML5 and CSS3 to accommodate smartphone, tablet and laptop users.
  22. Piloting Novel Technologies For Measurably Accurate Text Analytics. Even with R&D Search, we continue to test additional enhancements to our engine. Converting unstructured data into structured knowledge is key. With IBM’s Emerging Technology Research, we piloted a rule-based analytics engine that creates structure from unstructured text as accurately as manual curation! [Diagram labels: IBM Watson; unstructured text or tables; structured data; MATA]
  23. Preventing Users From Big Data Overload Using Trend Analytics. PoC with Saama Technologies to demonstrate data science capabilities using over 500 million connections from 60 million scientific documents. Using tools like Google BigQuery, we were able to process all of this information, across 5 data types, using each pairwise combination of 10 scientific vocabularies, with trend analytics in seconds. Use analytics to alert users to only the critical information in the huge torrent.
  24. R&D Search is a foundation: some towers will continue to be built. We are still creating the foundations for our big data engines; some will grow and new ones will develop, but the future is very exciting for data scientists.
  25. Thank You: Acknowledgements & Questions. This presentation describes work that has taken place in the past 12 months, clearly not possible without the enormous support of many people, with great thanks to: AstraZeneca: Ravi Sajja, Paul Fitzpatrick, Nasko Radev, Rob Hernandez, Susan Donohoe, Ming Chen, Fari Song, Steve Woodward, Akshay Tankhiwale, Kris Nayak, Sunny Advani, Youssef Belghali. Sinequa: Christian Sestier, Tim Bell, Ariane Cavet, Frédéric Lardé & Alex Bilger. Pebble Code: John Mildinhall, Tak Tran, Mark Durrant, Nancy Lee & Toby Hunt. IBM: Tim Donovan, Jessica Evans, Matthew Lee, Joshua Lund, Sylvain Garcon, James Magowen, James Luke, Edd Biddle, Alan Knox & Henry Grahame-Smith. Thomson Reuters: Matthew Gowen, Redmond Garvey & Annabel Griffiths. Saama Technologies: Anil Nair, Krunal Patel, Laeeq Siddique, Aditya Phatak & Mark Hanson. Get In Touch: For more information or to discuss a novel, exciting technology that you think we would be interested in, please reach out to me at
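
Slide 17 describes R&D Intelligence as computing sentence-level co-occurrence between any two entities. The following is a minimal sketch of that general idea, not the AstraZeneca or Sinequa implementation: the entity dictionary, the function names and the sample sentences are all hypothetical, and a naive regex sentence splitter stands in for a real text-mining pipeline. Filtering evidence by access rights, which the slide also mentions, is left out.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical dictionary: surface form -> (vocabulary, canonical name).
# Synonyms map to the same canonical entity.
ENTITIES = {
    "aspirin": ("drug", "aspirin"),
    "acetylsalicylic acid": ("drug", "aspirin"),
    "cox-1": ("gene", "PTGS1"),
    "ptgs1": ("gene", "PTGS1"),
    "colorectal cancer": ("disease", "colorectal cancer"),
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def entities_in(sentence):
    """Return the canonical entities whose surface form occurs in the sentence."""
    lowered = sentence.lower()
    found = set()
    for surface, entity in ENTITIES.items():
        # Lookarounds keep a surface form from matching inside a longer word.
        if re.search(r"(?<!\w)" + re.escape(surface) + r"(?!\w)", lowered):
            found.add(entity)
    return found


def cooccurrence(documents):
    """Count entity pairs sharing a sentence and keep the supporting sentences."""
    counts = Counter()
    evidence = {}
    for doc in documents:
        for sentence in split_sentences(doc):
            # Sorting gives each unordered pair a single, stable key.
            for pair in combinations(sorted(entities_in(sentence)), 2):
                counts[pair] += 1
                evidence.setdefault(pair, []).append(sentence)
    return counts, evidence


if __name__ == "__main__":
    docs = [
        "Aspirin inhibits COX-1. It is widely used.",
        "Acetylsalicylic acid may lower colorectal cancer risk. "
        "PTGS1 is expressed in platelets.",
    ]
    counts, evidence = cooccurrence(docs)
    for pair, n in counts.most_common():
        print(pair, n, evidence[pair])
```

Counting at sentence level rather than document level is what keeps the evidence short enough to show a scientist directly: in the sample above, the drug and the gene are linked only by the one sentence that mentions both, even though the second document also mentions each of them in separate sentences.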