PDF-4 Data Mining – Carbamazepine Polymorphs Example



  1. Data Mining with DDView+ and the PDF-4 Databases: Carbamazepine Polymorphs. Some slides of this tutorial have sequentially layered information that is best viewed in ‘Slide Show’ mode.
  2. This is one of three example-based tutorials on the data mining capabilities of DDView+ with the PDF-4+ database. It covers the following topic:
      • Carbamazepine Polymorphs
          • a PDF-4/Organics application
          • investigating polymorphic forms of an active pharmaceutical ingredient (API)
     Two similar data mining tutorials cover the following topics:
      • CIGS Photovoltaics
          • solid solution / cell parameter relationship
      • FeO Non-stoichiometric Oxides
          • sorting out temperature and stoichiometric effects on cell parameters
  3. Carbamazepine Polymorphs
      • An example for the PDF-4/Organics 2008 database that answers the following questions:
          • How many polymorphs are known to exist?
          • How do I distinguish these polymorphs by XRPD?
  4. Carbamazepine Polymorphs
      • Search for PDF entries:
          • Empirical Formula: C15 H12 N2 O
              • Note that this formula is not unique to carbamazepine
          • Name: carbamazepine
              • By itself, this name search will also include hydrates, solvates, and derivatives
          • Elements: Only {C, H, N, O}
              • This search could be performed, but it is redundant if the empirical formula search above is performed
  5. Results Table Preferences
      • Useful information for sorting out polymorphic forms should be in the results table for retrieved entries
      • A plot of space group vs. reduced cell volume can often provide relevant groupings for polymorph differentiation
      • Another useful plot is any of the reduced cell edge lengths vs. reduced cell volume
      • Suggested fields for results table:
          • PDF #
          • Empirical Formula
          • Quality Mark
          • Compound Name
          • Common Name
          • International Space Group
          • Space Group Number
          • Reduced cell a
          • Reduced cell b
          • Reduced cell c
          • Reduced cell volume
  6. Specifying Results Table Preferences for Carbamazepine Polymorphs Search
      • Selected Fields:
          • Use these buttons to move a selected item up or down in the listed order for the results table.
          • Prepare the ‘Selected Fields’ list for this exercise as illustrated here.
      • Available Fields:
          • Use these buttons to move selected items between the ‘Available Fields’ list of 60 items and the ‘Selected Fields’ list of items that will be displayed in the results table.
  7. Entering the Empirical Formula Search Criterion. The empirical formula criterion is entered on the ‘Elements’ tab of the ‘Search’ window. There should be no space between each element and its ‘subscript’, and one space before each subsequent element (for example, ‘C15 H12 N2 O’). The elements can be entered in any order with the ‘Contains Elements’ option, but must match the order of the formula on the PDF card if the ‘Contains Phrase’ option is used.
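The spacing rule for the formula criterion can be illustrated with a minimal Python sketch. The function name `format_empirical_formula` is hypothetical and is not part of DDView+; it only builds the text a user would type into the search field, and it assumes that a count of 1 is left implicit, as in the formula shown for carbamazepine.

```python
def format_empirical_formula(counts):
    """Build a search string such as 'C15 H12 N2 O' from element counts.

    Hypothetical helper for illustration only; it is not a DDView+ function.
    """
    parts = []
    for element, count in counts.items():
        # No space between an element and its count; a count of 1 is left implicit.
        parts.append(element if count == 1 else f"{element}{count}")
    # Exactly one space separates consecutive elements.
    return " ".join(parts)


carbamazepine = {"C": 15, "H": 12, "N": 2, "O": 1}
query = format_empirical_formula(carbamazepine)
print(query)  # C15 H12 N2 O
```

The elements are emitted in the order given, which only matters if the ‘Contains Phrase’ option is used.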
  8. Entering the ‘Carbamazepine’ Name Search Criterion. The name criterion is entered on the ‘Names’ tab. Since ‘carbamazepine’ may appear as either a ‘Compound Name’ or a ‘Common Name’, the criterion is best entered in the ‘All Names’ category. Because it is a single word, it does not matter whether ‘Contains Words’ or ‘Contains Phrase’ is selected. The search can now be performed.
  9. Carbamazepine Polymorphs
      • 17 hits in PDF-4/Organics 2008 database
          • 12 have cell parameter and space group information
          • 5 different space group designations (1, 2, 14, 15, 148)
  10. Carbamazepine Polymorphs
      • To illustrate the groupings related to the space groups and reduced cell volumes given in the results table, create the appropriate graph as follows:
      Enter ‘RedCellVol’ for the X-axis field and ‘SG #’ for the Y-axis field.
  11. Space Group / Reduced Cell Volume Groupings for 12 Carbamazepine PDF Entries. [Plot labels: P2₁/n, C2/c, R-3, P1, P-1] Note that the groupings have an approximate 2:3:4 ‘Reduced Cell Volume’ ratio. The Z values for the unit cells in each group will almost certainly have the same ratio. The groupings suggest that the likely number of known polymorphs for carbamazepine is 3, 4, or 5.
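The step from the volume ratio to the Z ratio can be written out as follows. This is a sketch that assumes each carbamazepine molecule occupies roughly the same volume in every polymorph; the symbols V_mol (volume per molecule) and Z_red (number of molecules in the reduced cell) are introduced here for illustration and do not appear on the slide.

```latex
V_{\mathrm{red}} \approx Z_{\mathrm{red}} \, V_{\mathrm{mol}}
\quad\Longrightarrow\quad
V_{\mathrm{red}}^{(1)} : V_{\mathrm{red}}^{(2)} : V_{\mathrm{red}}^{(3)}
\;\approx\;
Z_{\mathrm{red}}^{(1)} : Z_{\mathrm{red}}^{(2)} : Z_{\mathrm{red}}^{(3)}
\;\approx\; 2 : 3 : 4
```

Because V_mol cancels in the ratio, groups of entries with the same reduced cell volume are expected to share the same number of molecules per reduced cell.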
  12. Carbamazepine – Another View: a-axis Length vs. Reduced Cell Volume. [Plot group labels: Gamma (4?), Alpha (1), Beta (6), Form IV (1)] Here, the groupings have been labeled based on information from the ‘Compound Name’ or ‘Comments’ field of at least one member of the group.
  13. Carbamazepine – Comparing XRD Data
      • To sort out all the entries, including those without unit cell information, the X-ray powder diffraction (XRPD) patterns themselves can be used
      • With DDView+, the user can overlay simulated XRPD patterns based on individual PDF entry data
      • Example: overlay simulated XRPD patterns for entry 00-033-1566 (reported as alpha form but without cell parameters) and entry 00-043-1998 (reported as alpha form with cell parameters included)
      Double-click anywhere on these ‘Results’ table rows to open the corresponding PDF entries (or right-click and select ‘Open PDF Card’).
  14. Overlaying Simulated XRPD Patterns for Comparison. This icon, when clicked, displays an XRPD pattern simulated from the peak list of the current PDF entry.
      • To overlay an XRPD pattern from another entry, use the ‘Plots’ drop-down menu and select ‘Add Full Trace …’.
      • Enter the PDF entry number for the pattern to be overlaid: ‘00-043-1998’.
      • Once the PDF # has been entered, click ‘OK’ to overlay the requested XRPD pattern.
      Additional XRPD patterns can be overlaid using the ‘Plots’/‘Add Full Trace …’ option.
  15. α-Carbamazepine Pattern Comparison. These two patterns show a similar set of peaks at roughly similar positions, indicating that both entries represent the same polymorphic form. The intensities are somewhat dissimilar, but this could be explained by sample thickness: the red pattern from a thin layer, the blue pattern from a thick sample. The comments section for entry 00-033-1566 indicates that this is a deleted pattern, having been replaced by 00-043-1998.
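The idea of "a similar set of peaks at roughly similar positions" can be made concrete with a small Python sketch. It is not how DDView+ compares patterns; the function name, the 0.2° tolerance, and the peak positions are all placeholder assumptions chosen for illustration, and none of the numbers come from a PDF entry. Intensities are ignored on purpose, since the slide notes they can differ with sample preparation.

```python
def fraction_of_matched_peaks(reference, candidate, tolerance=0.2):
    """Fraction of reference peaks with a candidate peak within +/- tolerance.

    Positions and tolerance share the same units (here, degrees 2-theta).
    Hypothetical helper for illustration; not a DDView+ function.
    """
    if not reference:
        return 0.0
    matched = sum(
        1
        for ref in reference
        if any(abs(ref - cand) <= tolerance for cand in candidate)
    )
    return matched / len(reference)


# Placeholder peak positions (degrees 2-theta), NOT taken from any PDF entry.
pattern_a = [10.0, 15.5, 20.0, 25.0]
pattern_b = [10.1, 15.4, 20.1, 27.0]

print(fraction_of_matched_peaks(pattern_a, pattern_b))  # 0.75
```

A value near 1 suggests the two patterns may belong to the same form; a visual overlay, as in the tutorial, remains the actual check.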
  16. Mis-assigned Polymorphic Form. Entry 00-043-1988 (red pattern) was also labeled as the α-form by its author. However, the comparison with a known α-form pattern (top graph, blue) and a known γ-form pattern (bottom graph, blue) shows that this entry is actually γ-carbamazepine.
  17. Overlaying Many XRPD Patterns. Using ‘Ctrl-Click’, one can select multiple entries from the ‘Results’ window and overlay their XRPD patterns on one graph. Here, all the reported α-form entries have been selected. A ‘Right-Click’ on one of these selected entries brings up the menu shown here. The ‘Open Diffraction Pattern’ choice will prepare the desired graph of overlaid patterns.
  18. Overlay of Reported α-Carbamazepine Patterns on One Graph. The result clearly shows the difference between the 00-043-1988 XRPD pattern (blue) and the other two. One can use this technique to determine which XRPD patterns among the 17 entries are similar, including those without polymorphic form or space group designations. Results of such a determination are shown on the next slide.
  19. Grouping of Similar XRPD Patterns for 17 Carbamazepine PDF Entries. One can overlay the simulated patterns for all 17 PDF entries to establish four groups of similar patterns. Note: Several entries with no space group or polymorphic form information have been reasonably assigned to one of these four groups.
      • 2 entries
          • 2 reported as α-form
          • 1 reported SG as R-3
      • 9 entries
          • 7 reported as β-form
          • 5 reported SG as P2₁/n
          • 1 reported SG as P2₁/c
      • 5 entries
          • 1 reported as γ-form
          • 1 misreported as α-form
          • 3 reported SG as P-1
          • 1 reported SG as P1
      • 1 entry
          • 1 reported as Form IV
          • 1 reported SG as C2/c
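As a cross-check, the counts on the grouping slide can be tallied against the earlier search summary (17 hits, 12 with space group information, 5 distinct space group numbers). The Python sketch below transcribes the slide's counts; the group keys are labels inferred from the reported forms, and the space group numbers are the standard International Tables numbers, with P2₁/n and P2₁/c both being settings of No. 14.

```python
# Counts transcribed from the grouping slide; keys are inferred group labels.
groups = {
    "alpha": {"entries": 2, "space_groups": {"R-3": 1}},
    "beta": {"entries": 9, "space_groups": {"P21/n": 5, "P21/c": 1}},
    "gamma": {"entries": 5, "space_groups": {"P-1": 3, "P1": 1}},
    "Form IV": {"entries": 1, "space_groups": {"C2/c": 1}},
}

# International Tables space group numbers; P21/n and P21/c share No. 14.
SPACE_GROUP_NUMBER = {
    "P1": 1, "P-1": 2, "P21/n": 14, "P21/c": 14, "C2/c": 15, "R-3": 148,
}

total_entries = sum(g["entries"] for g in groups.values())
entries_with_space_group = sum(
    sum(g["space_groups"].values()) for g in groups.values()
)
distinct_numbers = sorted(
    {SPACE_GROUP_NUMBER[sg] for g in groups.values() for sg in g["space_groups"]}
)

print(total_entries)             # 17
print(entries_with_space_group)  # 12
print(distinct_numbers)          # [1, 2, 14, 15, 148]
```

The six reported symbols collapse to five space group numbers because the two monoclinic settings share one number, which is consistent with the five designations listed in the search results.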
  20. Carbamazepine: Cell vs. Space Group. 17 PDF Entries (12 Experimental, 5 Calculated from Single Crystal Data). [Plot group labels: Beta, Gamma, Form IV, Alpha] Revisiting the ‘Cell vs. Space Group’ graph, all 17 entries can be assigned to one of these four groupings, based on the similarity of their XRPD patterns.
  21. Carbamazepine Polymorphs
      • Crystallographic and powder diffraction data suggest 4 known polymorphic forms
      • Data mining and display capabilities of DDView+ give users the ability to categorize the database entries
      • The user’s own pattern can be compared with the known polymorph patterns to ascertain polymorphic form
  22. Thank you for viewing our tutorial. Additional tutorials are available at the ICDD web site (www.icdd.com). International Centre for Diffraction Data, 12 Campus Boulevard, Newtown Square, PA 19073. Phone: 610.325.9814. Fax: 610.325.9823.