Oracle Text in APEX

Oracle Text is a facility within the database that provides more advanced indexing and search techniques, including the ability to index documents stored in your database, on your server, or even on the web.

Now you can incorporate this functionality into your web application using Oracle Application Express.

This presentation will demonstrate how easy it is to combine the two, and give you a platform for further expansion and exploration within a very powerful product.

Published in: Software

  1. 1. SAGE Computing Services: Customised Oracle Training Workshops and Consulting
        Oracle Text in Apex: Advanced Indexing Techniques Integrated with Application Express
        Scott Wesley, Systems Consultant & Trainer
  2. 2. Agenda • Introduction • Architecture • Fundamentals • Considerations • Setting Up • Samples • Index Maintenance • Visualisation • New Features
  3. 3. Larry Lessig?
  4. 4. the law is strangling creativity http://www.ted.com/talks/larry_lessig_says_the_law_is_strangling_creativity.html http://presentationzen.blogs.com/presentationzen/2005/10/the_lessig_meth.html
  5. 5. Identity 2.0 – Dick Hardt http://identity20.com/media/OSCON2005/
  6. 6. who’s the Dick on your site
  7. 7. Connor McDonald http://www.oracledba.co.uk
  8. 8. so today’s going to be more like this
  9. 9. and this
  10. 10. after I show a few pictures
  11. 11. who_am_i;
  12. 12. http://strategy2c.wordpress.com/2009/01/10/strategy-for-goldfish-funny-illustration-by-frits/
  13. 13. balance
  14. 14. Why use Oracle Application Express?
  15. 15. Why use Oracle Text?
  16. 16. What is Oracle Text?
  17. 17. Document Collection
  18. 18. Catalogue Information
  19. 19. Document Classification
  20. 20. Architecture
  21. 21. Class          Description
          Datastore      How are your documents stored?
          Filter         How can the documents be converted to plain text?
          Lexer          What language is being indexed?
          Wordlist       How should stem and fuzzy queries be expanded?
          Storage        How should the index data be stored?
          Stop List      What words or themes are not to be indexed?
          Section Group  How are document sections defined?
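Each class in the table above is configured through a named preference, which is then listed in the index PARAMETERS string. The following is a minimal sketch, assuming the my_names table used elsewhere in these slides; the preference names demo_lexer and demo_wordlist are hypothetical, and the attribute choices are only illustrative.

```sql
-- Hypothetical preference names; run as a user with EXECUTE on CTXSYS.CTX_DDL.
BEGIN
  -- Lexer: how text is broken into tokens
  ctx_ddl.create_preference('demo_lexer', 'BASIC_LEXER');
  ctx_ddl.set_attribute('demo_lexer', 'base_letter', 'YES');  -- fold accented letters to their base form

  -- Wordlist: how stem and fuzzy queries are expanded
  ctx_ddl.create_preference('demo_wordlist', 'BASIC_WORDLIST');
  ctx_ddl.set_attribute('demo_wordlist', 'stemmer', 'ENGLISH');
  ctx_ddl.set_attribute('demo_wordlist', 'fuzzy_match', 'ENGLISH');
END;
/

-- Classes that are not listed fall back to the system defaults.
CREATE INDEX ctx_name ON my_names(name)
  INDEXTYPE IS ctxsys.context
  PARAMETERS ('DATASTORE ctxsys.default_datastore
               LEXER     demo_lexer
               WORDLIST  demo_wordlist
               STOPLIST  ctxsys.default_stoplist');
```

Preferences are stored per schema, so they can be reused across several indexes and dropped later with ctx_ddl.drop_preference.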
  22. 22. 1) Example
  23. 23. CREATE INDEX ctx_name ON my_names(name) INDEXTYPE IS ctxsys.context PARAMETERS ('DATASTORE CTXSYS.DEFAULT_DATASTORE');
  24. 24. SQL> SELECT SCORE(1), name
          FROM my_names
          WHERE CONTAINS(name, 'fuzzy(john,,,weight)', 1) > 0
          ORDER BY SCORE(1) DESC;

            SCORE(1) NAME
          ---------- ----------------------------------------
                 100 John
                 100 John
                  70 Jon
                  70 Jon
                  63 Joan
                  63 Joan
                  52 Jong
                  48 Jona

          8 rows selected.
  25. 25. 2) Datastore
  26. 26. CTXSYS.DEFAULT_DATASTORE
  27. 27. BLOB
  28. 28. BFiles
  29. 29. Pointers to objects on file system
  30. 30. URLs
  31. 31. Pointers to objects on the intertube
  32. 32. User Defined
  33. 33. Why would you?
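One possible answer to "why would you?" is that a user-defined datastore lets a single index cover text assembled from several columns or tables. The sketch below assumes a hypothetical table my_people(first_name, surname, notes, search_text) with VARCHAR2 columns; the procedure and preference names are also invented for illustration.

```sql
-- Hypothetical procedure: builds the text to index for one row.
CREATE OR REPLACE PROCEDURE demo_people_ds (
  p_rid  IN            ROWID,
  p_clob IN OUT NOCOPY CLOB
) IS
  l_text VARCHAR2(32767);
BEGIN
  -- Concatenate the columns that should be searchable together.
  SELECT first_name || ' ' || surname || ' ' || notes
  INTO   l_text
  FROM   my_people
  WHERE  rowid = p_rid;

  -- l_text always contains at least the two separator spaces, so its length is never zero.
  dbms_lob.writeappend(p_clob, LENGTH(l_text), l_text);
END;
/

BEGIN
  ctx_ddl.create_preference('demo_user_ds', 'USER_DATASTORE');
  ctx_ddl.set_attribute('demo_user_ds', 'procedure', 'demo_people_ds');
  ctx_ddl.set_attribute('demo_user_ds', 'output_type', 'CLOB');
END;
/

-- The indexed column acts as a trigger column: updating it queues the row for re-indexing.
CREATE INDEX ctx_people ON my_people(search_text)
  INDEXTYPE IS ctxsys.context
  PARAMETERS ('DATASTORE demo_user_ds');
```

Depending on the database version, the procedure name may need to be schema-qualified in the preference.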
  34. 34. 3) Index Type
  35. 35. a) CONTEXT
  36. 36. Document Collection
  37. 37. large document size
  38. 38. provides a score
  39. 39. asynchronous index & table data
  40. 40. CONTAINS
  41. 41. b) CTXCAT
  42. 42. Catalogue Information
  43. 43. smaller documents
  44. 44. text fragments
  45. 45. multiple attributes
  46. 46. set lists
  47. 47. similar to typical index paradigm
  48. 48. transactional
  49. 49. CATSEARCH
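A CTXCAT index pairs a short text column with structured attributes through an index set, and is queried with CATSEARCH. This is a minimal sketch, assuming a hypothetical table my_products(title, price); the index set name is likewise invented.

```sql
-- Hypothetical table: my_products(id, title VARCHAR2, price NUMBER)
BEGIN
  ctx_ddl.create_index_set('demo_product_set');
  ctx_ddl.add_index('demo_product_set', 'price');  -- sub-index on the structured column
END;
/

CREATE INDEX ctx_products ON my_products(title)
  INDEXTYPE IS ctxsys.ctxcat
  PARAMETERS ('INDEX SET demo_product_set');

-- Second argument: text query. Third argument: structured clause resolved by the index set.
SELECT title, price
FROM   my_products
WHERE  CATSEARCH(title, 'camera', 'price < 500 ORDER BY price') > 0;
```

Because the index is maintained transactionally, newly inserted rows are searchable without a separate synchronisation step.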
  50. 50. c) CTXRULE
  51. 51. Document Classification
  52. 52. routing information
  53. 53. displace manual interaction
  54. 54. not binary files
  55. 55. MATCHES
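With CTXRULE the roles are reversed: the index is built on a table of stored queries, and MATCHES finds which of those queries an incoming plain-text document satisfies. The sketch below uses a hypothetical my_rules table and made-up categories.

```sql
-- Hypothetical routing rules: each row is a category plus the query that defines it.
CREATE TABLE my_rules (
  category     VARCHAR2(30),
  query_string VARCHAR2(2000)
);

INSERT INTO my_rules VALUES ('Billing', 'invoice OR payment OR refund');
INSERT INTO my_rules VALUES ('Support', 'error OR crash');
COMMIT;

CREATE INDEX ctx_rules ON my_rules(query_string)
  INDEXTYPE IS ctxsys.ctxrule;

-- Classify an incoming document: returns every category whose rule matches the text.
SELECT category
FROM   my_rules
WHERE  MATCHES(query_string, 'My payment failed with an error') > 0;
```

A document can match more than one rule, so the application decides how to route it when several categories come back.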
  56. 56. 4) Considerations
  57. 57. location of text
  58. 58. document format
  59. 59. bypassing rows - images
  60. 60. character set
  61. 61. language
  62. 62. fuzzy matching & stemming
  63. 63. wildcard query performance
  64. 64. stopwords & stopthemes
  65. 65. query performance and storage of LOBs
  66. 66. mixed queries
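Several of the considerations above correspond directly to query operators. The following sketch assumes the my_names index from the earlier example, plus a hypothetical my_products table with a CONTEXT index on its description column.

```sql
-- Stemming: $ expands a term to words sharing its root.
SELECT name FROM my_names WHERE CONTAINS(name, '$run') > 0;

-- Fuzzy matching: ? expands a term to similarly spelled words.
SELECT name FROM my_names WHERE CONTAINS(name, '?john') > 0;

-- Wildcard: trailing wildcards are usually cheaper than leading ones such as '%son'.
SELECT name FROM my_names WHERE CONTAINS(name, 'jo%') > 0;

-- Mixed query: a text predicate combined with a structured predicate.
SELECT title, price
FROM   my_products
WHERE  CONTAINS(description, 'camera') > 0
AND    price < 500;
```

For mixed queries the optimizer has to choose between driving from the text index or from the structured predicate, which is why they are worth testing with realistic data volumes.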
  67. 67. 5) Setting up
  68. 68. GRANT ctxapp TO ausoug;
  69. 69. create & delete indexing preferences
  70. 70. use Oracle Text PL/SQL supplied packages
  71. 71. SQL> select grantee, owner, table_name, privilege
          from dba_tab_privs
          where table_name = 'CTX_DDL';

          GRANTEE      OWNER   TABLE_NAME  PRIVILEGE
          -----------  ------  ----------  ---------
          CTXAPP       CTXSYS  CTX_DDL     EXECUTE
          APEX_040000  CTXSYS  CTX_DDL     EXECUTE
          APEX_030200  CTXSYS  CTX_DDL     EXECUTE
          AUSOUG       CTXSYS  CTX_DDL     EXECUTE
          XDB          CTXSYS  CTX_DDL     EXECUTE

          5 rows selected.
  72. 72. PLS-00201: identifier "string" must be declared
  73. 73. CTX PL/SQL Packages
          GRANT EXECUTE ON CTXSYS.CTX_CLS TO ausoug;
          GRANT EXECUTE ON CTXSYS.CTX_DDL TO ausoug;
          GRANT EXECUTE ON CTXSYS.CTX_DOC TO ausoug;
          GRANT EXECUTE ON CTXSYS.CTX_OUTPUT TO ausoug;
          GRANT EXECUTE ON CTXSYS.CTX_QUERY TO ausoug;
          GRANT EXECUTE ON CTXSYS.CTX_REPORT TO ausoug;
          GRANT EXECUTE ON CTXSYS.CTX_THES TO ausoug;
          GRANT EXECUTE ON CTXSYS.CTX_ULEXER TO ausoug;
  74. 74. Using URL Datastore in 11g
          CREATE ROLE apex_url_datastore_role;
          GRANT apex_url_datastore_role TO APEX_040000 WITH ADMIN OPTION;
          GRANT apex_url_datastore_role TO ausoug;
          EXEC ctxsys.ctx_adm.set_parameter('file_access_role', 'APEX_URL_DATASTORE_ROLE');
  75. 75. Demonstrations
          Script                Description
          Ctx_blobs.sql         Import & index a range of documents
          Ctx_bfiles.sql        Import & index BFILE pointers
          Ctx_urls.sql          Index & search URL references
          Ctx_dict.sql          Index & search English dictionary words
          Ctx_views.sql         Index view SQL text for impact analysis
          Ctx_apex_files.sql    Duplicate and search Apex file repository
          Ctx_apex_backups.sql  Hunt through your (automated) Apex app backups
          Ctx_names.sql         Basic name filter options
          Ctx_products.sql      Multiple column searches
          Ctx_category.sql      Attribute based searching
          Ctx_classify.sql      Classify documents into categories
  76. 76. 6) Index maintenance
  77. 77. indexing errors
  78. 78. resume failed index
  79. 79. ALTER INDEX ctx_surname REBUILD PARAMETERS ('resume memory 10m');
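Before resuming a failed build it helps to see which rows were rejected and why. A minimal sketch, assuming the ctx_surname index from the slide above is owned by the current user:

```sql
-- Rows that could not be indexed are logged per index; err_textkey holds the ROWID of the failing row.
SELECT err_timestamp,
       err_textkey,
       err_text
FROM   ctx_user_index_errors
WHERE  err_index_name = 'CTX_SURNAME'
ORDER  BY err_timestamp DESC;
```

The view keeps old entries until they are deleted, so clearing it before a rebuild makes new failures easier to spot.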
  80. 80. recreate index online (11g)
  81. 81. EXEC ctx_ddl.recreate_index_online ('ctx_surname', 'replace lexer sw_lexer');
  82. 82. rebuilding an index
  83. 83. ALTER INDEX ctx_surname REBUILD PARAMETERS('replace lexer sw_lexer') ONLINE;
  84. 84. ctx_report.index_stats
  85. 85. create table ausoug.my_stats (stats clob);

          declare
            x clob := null;
          begin
            for r_rec in (select *
                          from ctxsys.ctx_indexes
                          where idx_owner = 'AUSOUG'
                          and idx_type = 'CONTEXT') loop
              ctx_report.index_stats(r_rec.idx_name, x);
              insert into ausoug.my_stats values (x);
            end loop;
            commit;
            dbms_lob.freetemporary(x);
          end;
          /
  86. 86. 7) Data Dictionary
  87. 87. SQL> select count(*)
          from all_views
          where owner = 'CTXSYS';

            COUNT(*)
          ----------
                  58
  88. 88. 8) Common Questions
  89. 89. DML operations on a CONTEXT index
  90. 90. ctxsys.ctx_user_pending
  91. 91. synchronise (synchronize) the index
  92. 92. EXEC ctx_ddl.sync_index('ctx_surname');
  93. 93. dbms_job
  94. 94. dbms_scheduler
  95. 95. how often?
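One way to approach "how often?" is to watch the pending queue and schedule the synchronisation to match. The sketch below assumes the ctx_surname index from the earlier slides; the job name and the 15-minute interval are arbitrary choices for illustration, and the user needs the CREATE JOB privilege.

```sql
-- How many rows are waiting to be indexed, per index?
SELECT pnd_index_name, COUNT(*) AS pending_rows
FROM   ctx_user_pending
GROUP  BY pnd_index_name;

-- Hypothetical job: synchronise the index every 15 minutes.
BEGIN
  dbms_scheduler.create_job(
    job_name        => 'SYNC_CTX_SURNAME',
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'BEGIN ctx_ddl.sync_index(''ctx_surname''); END;',
    start_date      => SYSTIMESTAMP,
    repeat_interval => 'FREQ=MINUTELY; INTERVAL=15',
    enabled         => TRUE);
END;
/
```

An alternative is to declare the policy on the index itself, for example PARAMETERS ('SYNC (ON COMMIT)'), trading more frequent index writes for fresher search results.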
  96. 96. optimise the index
  97. 97. can get fragmented
  98. 98. inverted index
  99. 99. each entry contains list of documents
  100. 100. DOG - DOC1 DOC3 DOC5
            DOG - DOC7
            DOG - DOC9
            DOG - DOC11
  101. 101. ctx_ddl.optimize_index
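Optimisation merges the fragmented rows for each token back together and removes entries for deleted documents. A minimal sketch, again assuming the ctx_surname index; the 10-minute limit is an arbitrary illustrative value.

```sql
BEGIN
  -- FAST only defragments; FULL also removes entries for deleted documents.
  -- maxtime (in minutes) caps a FULL run, which carries on where it stopped on the next call.
  ctx_ddl.optimize_index(
    idx_name => 'ctx_surname',
    optlevel => 'FULL',
    maxtime  => 10);
END;
/
```

Running this from a scheduled job outside busy hours keeps the cost away from interactive users.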
  102. 102. capacity planning?
  103. 103. Object of Interest  Num Rows  Table Size  Index size
            Dictionary          150k      7           27
            Documents           28        34          1.5
            Names               27k       1           6
            Views               2k        7           2
            BFiles              4
            Product             1
            URL                 1
  104. 104. more text
  105. 105. cleaner data
  106. 106. less overhead
  107. 107. document format
  108. 108. next steps?
  109. 109. read Application Developer’s Guide
  110. 110. find examples
  111. 111. experiment
  112. 112. SAGE Computing Services: Customised Oracle Training Workshops and Consulting
            Question time
            Presentations are available from our website: http://www.sagecomputing.com.au
            enquiries@sagecomputing.com.au
            scott.wesley@sagecomputing.com.au
            http://triangle-circle-square.blogspot.com
