Bioinformatics: Information Retrieval - II


Published in: Technology, Business

  1. 1. Information Retrieval - II
      * Information retrieval (IR) is the science of searching for documents, for information within documents and for metadata about documents, as well as that of searching relational databases and the World Wide Web.
      * IR is interdisciplinary, based on computer science, mathematics, library science, information science, information architecture, cognitive psychology, linguistics, statistics and physics.
  2. 2. Information Storage and Retrieval (ISAR): Operations performed by the hardware and software used in indexing and storing a file of machine-readable records whenever a user queries the system for information relevant to a specific topic. For records to be retrieved, the search statement must be expressed in syntax executable by the computer.
  3. 3. Information Storage and Retrieval (ISAR): An information system (IS) is a computer hardware and software system designed to accept, store, manipulate, and analyze data and to report results, usually on a regular, ongoing basis. An IS usually consists of a data input subsystem, a data storage and retrieval subsystem, a data analysis and manipulation subsystem, and a reporting subsystem.
  4. 4. Information Storage and Retrieval (ISAR): Information systems are widely used in scientific research, business management, medicine and health, resource management, and other fields that require statistical reporting.
  5. 5. Information Retrieval Process: An information retrieval process begins when a user enters a query into the system. Queries are formal statements of information needs, for example search strings in web search engines. In information retrieval a query does not uniquely identify a single object in the collection. Instead, several objects may match the query, perhaps with different degrees of relevancy.
  6. 6. Information Retrieval Process: An object is an entity which keeps or stores information in a database. User queries are matched to objects stored in the database. Depending on the application the data objects may be, for example, text documents, images or videos. Often the documents themselves are not kept or stored directly in the IR system, but are instead represented in the system by document surrogates.
  7. 7. Information Retrieval Process: Most IR systems compute a numeric score for how well each object in the database matches the query, and rank the objects according to this value. The top-ranking objects are then shown to the user. The process may then be iterated if the user wishes to refine the query.
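The score-and-rank step described above can be sketched in a few lines of Python. This is only an illustration: the slides do not name a scoring function, so the term-count score, the whitespace tokenizer, the function names and the three sample documents are all assumptions made for the example.

```python
from collections import Counter


def tokenize(text):
    """Lowercase the text and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def score(query, document):
    """Assumed scoring rule: total occurrences of the distinct query terms in the document."""
    counts = Counter(tokenize(document))
    return sum(counts[term] for term in set(tokenize(query)))


def rank(query, collection, top_k=3):
    """Return up to top_k (doc_id, score) pairs with a non-zero score, best first."""
    scored = [(doc_id, score(query, text)) for doc_id, text in collection.items()]
    matches = [pair for pair in scored if pair[1] > 0]
    # Sort by descending score; break ties by document id for a stable order.
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return matches[:top_k]


# Hypothetical toy collection, invented for this sketch.
collection = {
    "d1": "gene expression in yeast",
    "d2": "protein structure prediction",
    "d3": "yeast gene regulation and gene networks",
}

results = rank("yeast gene", collection)
print(results)  # [('d3', 3), ('d1', 2)]
```

Several documents match the query with different scores, and the one with no matching terms is left out. Refining the query simply means calling rank again with new terms.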
  8. 8. Performance measures: Many different measures for evaluating the performance of information retrieval systems have been proposed. The measures require a collection of documents and a query. All common measures described here assume a ground truth notion of relevancy: every document is known to be either relevant or non-relevant to a particular query. In practice queries may be ill-posed and there may be different shades of relevancy. Precision and Recall are two widely used measures for evaluating the quality of results in domains such as Information Retrieval and statistical classification.
  9. 9. Performance measures: Precision can be seen as a measure of exactness or fidelity, whereas Recall is a measure of completeness. In an Information Retrieval scenario, Precision is defined as the number of relevant documents retrieved by a search divided by the total number of documents retrieved by that search, and Recall is defined as the number of relevant documents retrieved by a search divided by the total number of existing relevant documents (which should have been retrieved).
  10. 10. Performance measures: Precision is the fraction of the retrieved documents that are relevant to the user's information need.
  11. 11. Performance measures: Recall is the fraction of the documents relevant to the query that are successfully retrieved.
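As a worked example of the two definitions above, the following Python sketch computes precision and recall from a set of retrieved document ids and a set of ground-truth relevant ids. The function name, the sample ids and the convention of returning 0.0 when a set is empty are assumptions made for the illustration, not something the slides specify.

```python
def precision_recall(retrieved, relevant):
    """Compute (precision, recall) for one query from two collections of document ids."""
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant documents that were actually retrieved

    # Assumed convention: report 0.0 instead of dividing by zero on an empty set.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical search result and ground truth, invented for this sketch.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d3", "d5", "d6", "d7"]

p, r = precision_recall(retrieved, relevant)
print(p, r)  # 0.5 0.4
```

Here two of the four retrieved documents are relevant (precision 2/4), and two of the five relevant documents were found (recall 2/5).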
  12. 13. General applications of information retrieval
      * Digital libraries
      * Information filtering
      * Media search
          o Blog search
          o Image retrieval
          o Music retrieval
          o News search
          o Speech retrieval
          o Video retrieval
      * Search engines
          o Desktop search
          o Enterprise search
          o Federated search
          o Mobile search
          o Social search
          o Web search
  13. 14. Domain specific applications of information retrieval
      * Expert search finding
      * Genomic information retrieval
      * Geographic information retrieval
      * Information retrieval for chemical structures
      * Information retrieval in software engineering
      * Legal information retrieval
      * Vertical search
  14. 15. District Health Information System (DHIS): The District Health Information System (DHIS) is a highly flexible, open-source health management information system and data warehouse. It is developed by the Health Information Systems Programme (HISP) project.
  15. 16. District Health Information System (DHIS): The solution covers aggregated routine data, semi-permanent data (staffing, equipment, infrastructure, population estimates), survey/audit data, and certain types of case-based or patient-based data (for instance disease notification or patient satisfaction surveys). The system supports the capture of data linked to any level in an organizational hierarchy, any data collection frequency, and a high degree of customization on both the input and output sides. It has been translated into a number of languages.
  16. 20. Health Information Systems Program (HISP)
  17. 25. Practicals
      * EHR (Google Health and MS HealthVault)
      * MRS (OpenMRS)
      * HIS (DISH v 2.0)
      * CPOE
      * Bioinformatics Portal (PU)
  18. 34. Thank you...