End User Informatics
Published in: Technology
  • Most end users think their data requirements are unique, which is not the case: within an organization there is a pattern to the data requests. Avoid creating data silos.
  • EAI technology was developed to enhance application-level information exchange. An EAI message is hard-coded into an application’s logic and is efficient only at exchanging messages, carrying information and data from one application to another. EAI solutions have no means of optimizing queries where large datasets are involved. EII, by contrast, can present non-relational data as if it were in relational format.
  • Our goal is to satisfy, or come close to satisfying, end users’ ever-growing and insatiable demand for information. Enterprise data continues to grow exponentially every year. In the new world of iPhones, where users have the illusion that they can touch and feel data, latency is extremely irritating. Utopia is a zero-latency environment, with reports made available at the snap of a finger. As we go further into the presentation, we’ll highlight what reality is. Insert two slides: end-user perspective of architecture vs. IT.
  • End users have a very low tolerance for irrelevant and erroneous information on reports. We have often heard that report adoption is low; actually, it is the relevance and accuracy of the information being presented that drives adoption. Report acceptance by end users is directly dependent on the following factors.
  • They start slowly, are not done well, and are usually irreversible.
  • A Rich Internet Application (RIA) is a cross between web applications and traditional desktop applications, transferring some of the processing to the client end. AJAX-enabled dynamic web functionality includes sliding bars, live graphs, personalized rollover/hover content, etc. The user experience on Google, eBay, Netflix and Yahoo is what users expect from the web.
  • The IT community is seeing RIAs as successful models for creating lightweight front-ends for SOAs.
  • Informatics/reporting is a process, which other BI vendors fail to recognize. To integrate data, one must integrate processes.
  • Transcript

    • 1. Informatics Ambareesh Kulkarni
    • 2. Informatics defined
      • Informatics is the application of technology to bring Data, People and Systems together
      • Bioinformatics is a very complex representation of simple data
      • Cheminformatics is a very simple representation of complex data
    • 3. Current State
    • 4. Problem Statement: “There's too much data and it's duplicated hundreds of times. The mistake companies make is that they start from the data they have. They need to ask what data do their users need and what are the questions they are asking. Understand the questions, how they can be answered and what kind of data is needed.” (Quote by CIO of a major corporation)
    • 5. Integrated Solutions - Business Case: IDC White Paper
      • Information Tasks
        • Email – 14.5 hours a week
        • Create documents – 13.3 hours a week
        • Search – 9.5 hours a week
        • Gather information for documents – 8.3 hours a week
        • Find and organize documents – 6.8 hours a week
      • Gartner: “Organizations spend an estimated $750 Billion annually seeking information necessary to do their job.”
    • 6. Data Integration - Business Case: IDC White Paper
      • Time Wasted (per year)
        • Reformat information - $57 million per 10,000 users
        • Not finding information - $53 million per 10,000 users
        • Recreating content - $45 million per 10,000 users
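The three per-year figures above can be combined into a single cost per user. The following is a minimal sketch of that arithmetic; the assumption that the cost scales linearly with headcount, and the `scaled_cost` helper itself, are illustrative and not part of the IDC figures.

```python
# Figures quoted on the slide: US dollars per year, per 10,000 users.
COSTS_PER_10K_USERS = {
    "Reformat information": 57_000_000,
    "Not finding information": 53_000_000,
    "Recreating content": 45_000_000,
}
USERS = 10_000

# Total wasted per year for a 10,000-user organization.
total = sum(COSTS_PER_10K_USERS.values())

# Average wasted per user per year.
per_user = total / USERS


def scaled_cost(headcount: int) -> float:
    """Estimate the yearly cost for another headcount.

    Assumption (not from the source): cost scales linearly with users.
    """
    return per_user * headcount


print(f"Total per 10,000 users: ${total:,}")
print(f"Per user: ${per_user:,.0f}")
print(f"Estimate for 2,500 users: ${scaled_cost(2500):,.0f}")
```

Under the linear-scaling assumption, the combined figure is $155 million per 10,000 users, or $15,500 per user per year.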
    • 7. Data Integration - Business Case: General ROI Issues (IDC White Paper)
      • Reduce development costs, cycle times
        • Increase employee efficiency
        • Less time looking, more time doing
      • Enhance communication
        • Capture and reuse knowledge
        • Innovate better & faster
      • Cost of not finding right information
        • Business – lost money, opportunities
    • 8. Key Takeaways
      • Data Integration is not easy and represents ~80% of effort for a typical data integration project.
      • Incompatible data are the largest, most expensive, and time-consuming portion of IT projects.
      • Most data is in an unstructured format (Outlook, Word, PDF, images, etc.)
    • 9. Evolution of data integration technologies
    • 10. Evolution of Integration Architectures: Point to Point, Hub + Spoke, Hub + EII
    • 11. Defining EII, EAI, ETL (Data Integration)
      • EII vs. EAI
        • EII (Enterprise Information Integration): reports from multiple apps/data sources, e.g. real-time access to product silos for customers, employees
        • EAI (Enterprise Application Integration): transactions to multiple apps, e.g. compound name change in one application propagated to other products
      • EII vs. ETL
        • EII (real-time): Extract, Transform, Report in real time, e.g. report data from operational applications
        • ETL (batch): Extract, Transform, Load; later report on data warehouse, e.g. build duplicate reporting data mart and/or redesign data warehouse
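The EII/ETL distinction above can be illustrated with a small sketch. It uses in-memory SQLite databases as stand-ins for operational systems and a warehouse; the table name `compounds`, the helper functions, and the sample rows are all hypothetical and chosen only for illustration. The ETL report reads a copy loaded in an earlier batch, while the EII report queries the sources at request time.

```python
import sqlite3


def make_source(rows):
    """Create an in-memory stand-in for an operational database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE compounds (name TEXT, assay_result REAL)")
    conn.executemany("INSERT INTO compounds VALUES (?, ?)", rows)
    conn.commit()
    return conn


def transform(name, result):
    """Shared transformation step: normalize compound names."""
    return (name.upper(), result)


def etl_load(sources, warehouse):
    """Batch ETL: extract from each source, transform, load into the warehouse."""
    warehouse.execute("DELETE FROM compounds")
    for src in sources:
        for name, result in src.execute("SELECT name, assay_result FROM compounds"):
            warehouse.execute("INSERT INTO compounds VALUES (?, ?)", transform(name, result))
    warehouse.commit()


def etl_report(warehouse):
    """Report on whatever the last batch load put into the warehouse."""
    return warehouse.execute(
        "SELECT name, assay_result FROM compounds ORDER BY name"
    ).fetchall()


def eii_report(sources):
    """EII style: extract, transform and report at request time, with no copy."""
    rows = []
    for src in sources:
        for name, result in src.execute("SELECT name, assay_result FROM compounds"):
            rows.append(transform(name, result))
    return sorted(rows)


lab_a = make_source([("aspirin", 0.8)])
lab_b = make_source([("caffeine", 1.2)])
sources = [lab_a, lab_b]
warehouse = make_source([])

etl_load(sources, warehouse)  # the nightly batch

# A new result arrives in an operational system after the batch has run.
lab_b.execute("INSERT INTO compounds VALUES ('ibuprofen', 0.5)")
lab_b.commit()

print("ETL report:", etl_report(warehouse))  # still reflects the last batch
print("EII report:", eii_report(sources))    # includes the new row
```

The trade-off in this sketch matches the slide: the warehouse copy is stale until the next batch, whereas the federated query hits the operational systems on every request.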
    • 12. Tools vs. Development Platform (diagram labels: Enterprise Application Requirements, Tools, Development Platform)
    • 13. What do end users really care about?
      • The Internet has raised the bar for Informatics expectations
      • Complex Query? Millions of Rows? Full table Scan?
      • Users don’t really care. If they can view stock prices in real time, why not corporate data?
      • In an ideal world, data analysis needs to be at speed of thought.
      • Bigger, better, faster, cheaper
    • 14. Business users' view (diagram labels: Data, Pipeline Pilot, Reports)
    • 15. IT perspective
    • 16. Key Takeaways
      • Use Pipeline Pilot (PP) to:
        • Provide an integrated view of data across multiple systems: flat files, data warehouses, data marts.
        • Avoid “boiling the ocean”: jump-start data integration efforts with PP to quickly meet an important user requirement, and then decide if the data should be persisted in a data warehouse or data mart.
    • 17. Action from Insight: Data is a New Form of Energy
    • 18. Why is data integration so important?
      • Data in any organization is distributed in various disconnected and disparate systems
      • There is always a need to combine most current data with historical values
      • The success of the internet has created data sources outside the internal network
      • Data has informational value only when combined with other & related data
    • 19. WARNING SIGNS : Of Poor Data Integration
      • Incomplete Data foundation
      • Inability to consolidate data from multiple sources
      • No single version of the truth
      • Poor audit trail and data lineage
      • Historical values not retained in a data warehouse or data mart
      • Lack of integrated 360-degree view
      • High cost of maintaining “one-time” in-house code
      • Inability to comply with regulatory requirements
      Example: Presentations or discussions that are prefaced with statements like “most of our analysis would have been accurate, except for the missing data from…” or “Due to discovery of data not included in the last analysis, we are reversing our decision to…”
    • 20. WARNING SIGNS : Of Poor Data Integration
      Example: As a result of an out-of-order condition for a critical chemical, a scientist must expedite the order and pay a premium price. When the chemical arrives, the scientist (or worse, her boss) discovers that another division had excess quantity of the same chemical and was looking to sell it at a discount.
    • 21. WARNING SIGNS : Of Poor Data Integration
      Example: Scientists argue about the fact that analysis results differ, even though the data came from the same operational data source.
    • 22. WARNING SIGNS : Of Poor Data Integration
      Example: A technician alerts his management team of scientists to a potential problem discovered while running a query against a database. The technician cannot, however, answer the follow-up question, “How long has the problem existed?”
    • 23. WARNING SIGNS : Of Poor Data Integration
      Example: A scientist runs a report every week against a LIMS; however, to see a period-to-period comparison, the scientist maintains a spreadsheet, adding a new column every week and entering the data manually.
    • 24. WARNING SIGNS : Of Poor Data Integration
      Example: A customer calls tech support to enquire about a pending case. While the customer support engineer has access to the case details, the engineer has no information on whether the customer is current on maintenance, how many end users they are licensed for, or what options the customer has purchased.
    • 25. WARNING SIGNS : Of Poor Data Integration
      Example: Minor change requests take weeks to implement; any modifications have to be thoroughly tested for accuracy and integrity.
    • 26. WARNING SIGNS : Of Poor Data Integration
      Example: The CEO and CFO are uncomfortable signing off on the quarterly numbers, as there is no way to trace the numbers back to the source systems.
    • 27. Case Study (closer to home): Services Order Report
      • Poor data quality
      • Redundant information
      • Duplicate entries
      • Hard to read
      • Huge amount of time required to clean it up
    • 28. Information-sensitivity
      • Information Quality is a Direct Function of:
        • Data Availability and Accessibility
        • Data Quality
          • DQ = Completeness × Validity
          • E.g. Measure of Completeness = # of null values in a column
          • E.g. Measure of Validity = “We have 4 regions, but there are 18 distinct values in the region column”
          • Pitfall: Don’t take accountability for DQ on the source system
          • Push accountability where it belongs, in the source system(s)
        • Timeliness of Data, relevant to the questions being asked by the user
        • SQL and programming accuracy
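The DQ = Completeness × Validity formula on this slide can be sketched for a single column. This is one possible reading, not the author's definition: the slide measures completeness by counting nulls, whereas the sketch below assumes completeness is the fraction of non-null values and validity is the fraction of non-null values found in an allowed set, so that both scores fall between 0 and 1. The function names and the sample `regions` column are hypothetical.

```python
from typing import Hashable, Optional, Sequence, Set


def completeness(values: Sequence[Optional[Hashable]]) -> float:
    """Fraction of values that are not null (assumed definition)."""
    if not values:
        return 0.0
    return sum(v is not None for v in values) / len(values)


def validity(values: Sequence[Optional[Hashable]], allowed: Set[Hashable]) -> float:
    """Fraction of non-null values that belong to the allowed set (assumed definition)."""
    present = [v for v in values if v is not None]
    if not present:
        return 0.0
    return sum(v in allowed for v in present) / len(present)


def data_quality(values: Sequence[Optional[Hashable]], allowed: Set[Hashable]) -> float:
    """DQ = Completeness x Validity, as stated on the slide."""
    return completeness(values) * validity(values, allowed)


# Hypothetical region column: four legitimate regions, plus nulls and misspellings.
ALLOWED_REGIONS = {"North", "South", "East", "West"}
regions = ["North", "South", None, "East", "West", "Nrth", "north", None]

distinct = {v for v in regions if v is not None}
print("Distinct values:", len(distinct), "vs. allowed:", len(ALLOWED_REGIONS))
print("Completeness:", completeness(regions))
print("Validity:", round(validity(regions, ALLOWED_REGIONS), 3))
print("DQ:", round(data_quality(regions, ALLOWED_REGIONS), 3))
```

Comparing the count of distinct values with the count of allowed values mirrors the slide's "4 regions, but 18 distinct values" check; the misspelled entries would need to be corrected in the source system, in line with the accountability point above.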
    • 29. Case Study (closer to home): Internal Revenue Forecasting process
      • (Diagram columns: Orders QTD, Pipeline, Delivered, Forecast)
      • Run the Services Products and Orders report in RSVPP …; export out the results and filter for product services (Column AM) and sum the Total Sale Price USD column
      • Run the Services Opportunities report in SFDC; export out the result …
      • Assuming Access is up to date …; export to Excel; filter by product services and sum USD Amount column
      • Assuming Access is up to date, run the Total Forecast report; export to Access; …
    • 30. Near real-time data access
    • 31. Extract, Transformation & Load = Push big data
      • Batch extract from transaction systems
      • Bulk transformation
      • Push load into data warehouse
      • (Diagram labels: Extract, Transformation, Load, Data Warehouse, Real Time)
    • 32. Pipeline Pilot and Real-time Data Access
      • (Diagram labels: Data Access; Data Adapters; Data Transformation: Transform, Calculate, Security; Sources: Relational, Flat Files, ERP, Legacy, EJB, XML; Information Access: Web Services, ODBC, JDBC)
      • Flexible Data Access capabilities
        • Single access point to data
        • Consumer sees only the end result
      • Shared platform service
        • Available to all technologies
      • Reusable building blocks
        • Targeted to specific needs
        • Reduces costs and time to market
        • Supports incremental development
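The idea of a single access point built from reusable data adapters can be sketched generically. This is not Pipeline Pilot's API; the class names (`DataAdapter`, `CsvAdapter`, `SqlAdapter`, `DataAccessService`) and the sample data are hypothetical and only illustrate the pattern the bullets describe: consumers ask one service for a named dataset and see only the end result.

```python
import csv
import io
import sqlite3
from typing import Callable, Dict, Iterator, List, Optional, Protocol

Row = Dict[str, str]


class DataAdapter(Protocol):
    """Reusable building block: anything that can yield rows as dictionaries."""

    def rows(self) -> Iterator[Row]: ...


class CsvAdapter:
    """Adapter for flat-file content supplied as CSV text."""

    def __init__(self, text: str) -> None:
        self._text = text

    def rows(self) -> Iterator[Row]:
        yield from csv.DictReader(io.StringIO(self._text))


class SqlAdapter:
    """Adapter for a relational source reached through a DB-API connection."""

    def __init__(self, conn: sqlite3.Connection, query: str) -> None:
        self._conn = conn
        self._query = query

    def rows(self) -> Iterator[Row]:
        cursor = self._conn.execute(self._query)
        columns = [col[0] for col in cursor.description]
        for record in cursor:
            yield {col: str(val) for col, val in zip(columns, record)}


class DataAccessService:
    """Single access point: callers name a dataset and never touch the source."""

    def __init__(self) -> None:
        self._adapters: Dict[str, DataAdapter] = {}

    def register(self, name: str, adapter: DataAdapter) -> None:
        self._adapters[name] = adapter

    def read(self, name: str, transform: Optional[Callable[[Row], Row]] = None) -> List[Row]:
        result = []
        for row in self._adapters[name].rows():
            result.append(transform(row) if transform else dict(row))
        return result


# Hypothetical sources: one flat file and one relational table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (compound TEXT, grams INTEGER)")
conn.execute("INSERT INTO stock VALUES ('caffeine', 250)")

service = DataAccessService()
service.register("batches", CsvAdapter("compound,batch\naspirin,B1\n"))
service.register("stock", SqlAdapter(conn, "SELECT compound, grams FROM stock"))

batches = service.read("batches")
stock = service.read("stock", transform=lambda r: {**r, "compound": r["compound"].upper()})
print(batches)
print(stock)
```

Adding another source in this sketch means writing one more adapter and registering it, which is the incremental-development property the slide lists.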
    • 33. Case Study: PI Historian
      • PI Historian, a product provided by OSI, captures data in real time from the research test rigs
      • Data capture in PI is triggered by events
      • PP allows scientists to read the data from PI Historian as it becomes available and also combine it with other information (e.g. associate real-time test data with historical characteristics of a catalyst)
    • 34. Data provisioning pros and cons
    • 35. Data Integration Total Cost of Ownership Really Matters
    • 36. Evolution of an Informatics System (step 1): “Just give me a list of compounds from the database, sorted by compound name”
    • 37. Evolution of an Informatics System (step 2): “We also need to see the related toxicology information and for the list to be grouped by compound”
    • 38. Evolution of an Informatics System (step 3): “We’d like to get a list of some of the related compound information, too, grouped by the first letter of the compound’s name.”
    • 39. Evolution of an Informatics System (step 4): “Actually, we’d like to be able to produce a completely separate report for compound and related toxicology information.”
    • 40. Evolution of an Informatics System (step 5): “We don’t like running the reports manually. Can they be scheduled?”
    • 41. Evolution of an Informatics System (step 6): “We have quite a few users using this system now and there’s some fairly sensitive data in there.”
    • 42. Evolution of an Informatics System (step 7): “We need to be able to drill down into more detail”
    • 43. Evolution of an Informatics System (step 8): “We need to track which users have used what Protocols”
    • 44. Evolution of an Informatics System (step 9): “We need to be able to easily search the information we need.”
    • 45. Evolution of an Informatics System
      • “We need these reports linked to our business process”
      • “We need to be able to approve or reject the reports”
      • “We need a single version of the truth”
      • “We don’t want to be waiting around for the results”
      • “We don’t want to be re-typing information from these reports into our other application”
      • “We need to be able to see the underlying detail”
      • “We need to print the reports out to take into meetings”
      • “We need the output as Excel”
      • “We need charts”
      • “We need to know who’s looked at the reports”
      • “We need a simple way to see the entire contents of the report”
      • “We need a report that looks like an existing flow chart”
    • 46. Hidden Costs
      • Organizations that believe they can build a data integration solution at a fraction of the cost of a COTS solution discover that the costs saved up front are very quickly incurred multiple times over the lifetime of the solution
      • Typical effort to build a custom data integration solution can be upwards of 5,000-5,500 man-days
      • Some of the tasks that need to be undertaken to provide a functioning solution:
        • Application architecture
        • Data cleansing & enrichment services
        • Integration framework
        • User interface design
        • Common field matching
        • Security
        • Batch processing capabilities
        • Application integration
        • Audit & logging capabilities
    • 47. Build versus Buy Decision Criteria (Data Integration Considerations: Build your own vs. Buy)
      • Initial start-up cost: Build = Lower; Buy = Higher
      • Ongoing operating cost: Build = Higher; Buy = Lower
      • Ongoing support & maintenance: Build = In-house responsibility; Buy = Vendor
      • One-time “quick and dirty” task: Build = Consider; Buy = Maybe overkill unless one-time task becomes ongoing request
      • IT staff requirements: Build = Higher; Buy = Lower
      • IT productivity: Build = Detracts from; Buy = Contributes to
      • Data sources/data targets: Build = Single/single; Buy = Multiple/multiple, Multiple/single, Single/multiple
      • Complex transformations: Build = Limited (IT must write complex code); Buy = Comprehensive
      • Integration: Build = Usually overlooked; Buy = Industry standards
    • 48. Industry Trends End-user Informatics
    • 49. Web 2.0 What’s Setting Expectations Today
    • 50. Next-Generation Enabling Technologies & New User Demands Are Emerging
      • Rich Internet Experience
      • Web 2.0
      • Portlet components
      • XML and derivatives
      • Dynamic, Ajax-based UI
      • SOA Infrastructure
        • Leverage existing systems and components
        • Standardization
        • Data-driven environment
        • Open APIs to customize apps
      • Personal Dashboards
        • Integrate data from multiple sources
        • Multi-account views
        • Cross-account planning
    • 51. Web 2.0 features on our projects
    • 52. Web 2.0 features on our projects
    • 53. Advanced Reporting/Visualization Collection
    • 54. Scientific Business Process Management and PP
      • Fuse scientific and analytical data with process data
      • Use Pipeline Pilot in automated process decisions
      • Display reports and data at appropriate points in the process
      • Use data to modify process execution
    • 55. Consolidated Informatics Platform
      • (Diagram, current vs. future state. Labels: Many Databases, Many Tools, Spreadsheets, Web Reports, Dashboards, Analytics, Scorecards, Self-service Reports, Data Mining, Portals)
    • 56. Key Takeaways
      • Provide Accurate, Integrated & Seamless Informatics Solutions
      • Reduce redundant and replicated databases
      • Rationalize existing reporting tools and technologies
      • Build agile, flexible and reusable solutions
      • Empower the end users
        • “Shift Right”
    • 57. Shift Right
