Robust Data Synchronization with IBM Tivoli Directory Integrator (SG24-6164)
Front cover

Robust Data Synchronization with IBM Tivoli Directory Integrator

Complete coverage of architecture and components
Helpful solution and operational design guide
Extensive hands-on scenarios

Axel Buecker, Franc Cervan, Christian Chateauvieux, David Druker, Eddie Hartman, Rana Katikitala, Elizabeth Melvin, Todd Trimble, Johan Varno
International Technical Support Organization

Robust Data Synchronization with IBM Tivoli Directory Integrator

May 2006

SG24-6164-00
Note: Before using this information and the product it supports, read the information in "Notices" on page ix.

First Edition (May 2006)

This edition applies to Version 6.0.0 (with Fixpak 3: TIV-ITDI-FP0003) of IBM Tivoli Directory Integrator.

© Copyright International Business Machines Corporation 2006. All rights reserved.
Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
Contents

Notices  ix
Trademarks  x

Preface  xi
The team that wrote this redbook  xi
Become a published author  xiv
Comments welcome  xiv

Part 1. Architecture and design  1

Chapter 1. Business context for evolutionary integration  3
1.1 A close look at the challenge  4
1.2 Benefits of synchronization  6
1.3 Directory Integrator in non-synchronizing scenarios  7
1.4 Synchronization patterns and approaches  8
1.4.1 How and when synchronization can be invoked  8
1.4.2 Data flow patterns  9
1.5 Business and technical scenarios  11
1.5.1 Multiple existing directories and security concern  11
1.5.2 Existing directory cannot be modified  12
1.5.3 Single sign-on into multiple directories with Access Manager  13
1.5.4 Data is located in several places  13
1.5.5 Use of virtual directory - access data in place  13
1.6 Conclusion  15

Chapter 2. Architecting an enterprise data synchronization solution  17
2.1 Typical business requirements  18
2.2 Detailed data identification  20
2.2.1 Data location  20
2.2.2 Data owner  21
2.2.3 Data access  21
2.2.4 Initial data format  21
2.2.5 Unique data  22
2.3 Plan the data flows  22
2.3.1 Authoritative attributes  23
2.3.2 Unique link criteria  23
2.3.3 Special conditions or requirements  23
2.3.4 Final data format  24
2.3.5 Data cleanup  24
2.3.6 Phased approach  24
2.3.7 Frequency  25
2.4 Review results  25
2.5 Instrument and test a solution  26
2.5.1 Create workable units  27
2.5.2 Naming conventions  27
2.5.3 High availability and failover  28
2.5.4 System administration  30
2.5.5 Security  33
2.5.6 Password synchronization  34
2.6 Who are the players in the solution  35
2.6.1 Common roles and responsibilities  36
2.7 Conclusion  39

Chapter 3. Directory Integrator component structure  41
3.1 Concept of integration  43
3.1.1 Data sources  43
3.1.2 Data flows  44
3.1.3 Events  44
3.2 Base components  45
3.2.1 AssemblyLines  46
3.2.2 Connectors  50
3.2.3 Parsers  60
3.2.4 EventHandlers  61
3.2.5 Hooks  62
3.2.6 Scripts  62
3.2.7 Function components  63
3.2.8 Attribute Map components  64
3.2.9 Branch components  64
3.2.10 Loop components  65
3.2.11 Password synchronization  65
3.3 Security capability  67
3.4 Physical architecture  67
3.4.1 Combination with an enterprise directory  68
3.4.2 Base topologies  70
3.4.3 Multiple servers  73
3.5 Availability and scalability  75
3.6 Logging  80
3.7 Administration and monitoring  84
3.8 Conclusion  87

Part 2. Customer scenarios  89

Chapter 4. Penguin Financial Incorporated  91
4.1 Business requirements  92
4.1.1 Current architecture  92
4.2 Functional requirements  93
4.3 Solution design  96
4.3.1 Architectural decisions for phase 1  100
4.3.2 Architectural decisions for phase 2  101
4.4 Phase 1: User integration  106
4.4.1 Detailed data identification  107
4.4.2 Data flows  111
4.4.3 Instrument and test a solution  115
4.5 Phase 2: Password synchronization  163
4.5.1 Components  163
4.5.2 Architecture  183
4.5.3 Detailed data identification  186
4.5.4 Plan the data flows  190
4.5.5 Review results  196
4.5.6 Instrument and test a solution  200

Chapter 5. Blue Glue Enterprises  267
5.1 Company profile  268
5.2 Blue Glue business requirements  268
5.3 Blue Glue functional requirements  269
5.4 Solution design  273
5.5 Phase 1: Human resources data feed  275
5.5.1 Detailed data identification, data flows and review  275
5.5.2 Instrument and test solution  279
5.6 Phase 2: Store management application  300
5.6.1 Detailed data identification  300
5.6.2 Data flows  303
5.6.3 Review results  307
5.6.4 Instrument and test solution  313

Part 3. Appendixes  413

Appendix A. Tricky connections  415
Introduction to JDBC drivers  416
Database connectivity to Oracle  416
Obtaining the drivers  418
Installing the drivers  419
Driver configuration  420
Database connectivity to DB2  422
Obtaining the drivers  424
Installing the drivers  424
Driver configuration  424
Database connectivity to SQL Server  426
Obtaining the drivers  427
Installing the drivers  427
Driver configuration  427
Connectivity to Domino Server  434
Identity Manager Notes Agent configuration  436

Appendix B. Directory Integrator's view of JavaScript  439
The script engine  440
Scripts and configuration files  440
Scripting tools  441
Scripts: Where  442
Scripting JavaScript and Java  443
Core JavaScript  443
Regular expressions (regex)  444
Java through JavaScript  446
Java to JavaScript and back  447
Common tasks  451
Creating arrays and Java utility objects  451
Managing dates  452
Working with entries and attributes  453
Conclusion  453

Appendix C. Handling exceptions and errors  455
Reading the error dump  456
Errors = exceptions  459
The error object  462
Exception handling in script  463
Error Hooks  463
Mandatory  464
Connection Failure  466
Mode-specific On Error  467
Default On Error  467
Logging  467

Appendix D. Additional material  471
Locating the Web material  471
Using the Web material  471
How to use the Web material  472

Glossary  473

Related publications  477
IBM Redbooks  477
Other publications  477
Online resources  478
How to get IBM Redbooks  478
Help from IBM  479

Index  481
Notices

This information was developed for products and services offered in the U.S.A.

IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to:

IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A.

The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions; therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.

IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.

COPYRIGHT LICENSE:

This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. You may copy, modify, and distribute these sample programs in any form without payment to IBM for the purposes of developing, using, marketing, or distributing application programs conforming to IBM's application programming interfaces.

© Copyright IBM Corp. 2006. All rights reserved.
Trademarks

The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both:

AIX®, Cloudscape™, DB2®, Distributed Relational Database Architecture™, Domino®, DRDA®, Everyplace®, HACMP™, IBM®, Informix®, iNotes™, Lotus Notes®, Lotus®, Metamerge®, Netfinity Manager™, Netfinity®, Notes®, OS/2®, RACF®, RDN™, Redbooks™, Redbooks (logo)™, Tivoli®, Update Connector™, WebSphere®

The following terms are trademarks of other companies:

iPlanet, Java, Javadoc, JavaScript, JDBC, JDK, JMX, JVM, J2EE, Solaris, Sun, Sun Java, Sun ONE, and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.

Microsoft, Windows NT, Windows, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.

Intel, Intel logo, Intel Inside logo, and Intel Centrino logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States, other countries, or both.

UNIX is a registered trademark of The Open Group in the United States and other countries.

Linux is a trademark of Linus Torvalds in the United States, other countries, or both.

Other company, product, or service names may be trademarks or service marks of others.
Preface

Don't be fooled by the name; IBM® Tivoli® Directory Integrator integrates anything, and it is not in any way limited to directories. It is a truly generic data integration tool that is suitable for a wide range of problems that usually require custom coding and significantly more resources to address with traditional integration tools.

This IBM Redbook shows you how Directory Integrator can be used for a wide range of applications utilizing its unique architecture and unparalleled flexibility. Some of the following examples may resonate with business needs in your infrastructure, while others can provide insight into the breadth of Directory Integrator's capabilities:

- Continuously maintaining records in one or more databases based on information in other data sources such as files, directories, and databases.
- Migrating data from one system to another, or synchronizing legacy (or existing) data where systems cannot be replaced or shut down.
- Automatically transforming files from one format to another.
- Adding supplementary identity data to LDAP directories when deploying white pages, provisioning, and access control solutions.
- Reacting to changes to data (such as modifications, additions, and deletions) in the infrastructure and driving this information to systems that need to know about it.
- Integrating geographically dispersed systems with multiple choices of protocols and mechanisms, such as MQ, HTTP, secure e-mail, and Web Services.
- Extending the capabilities and reach of existing systems and applications, giving them access to the rich communications and transformation capabilities of Directory Integrator.

This book is a valuable resource for security administrators and architects who want to understand and implement a directory synchronization project.

The team that wrote this redbook

This redbook was produced by a team of specialists from around the world working at the International Technical Support Organization, Austin Center.
The team that wrote this book is shown in the picture above. They are, from top left to right: Rana, Todd, and Franc; and bottom left to right: David, Axel, and Beth.

Axel Buecker is a Certified Consulting Software IT Specialist at the International Technical Support Organization, Austin Center. He writes extensively and teaches IBM classes worldwide in the areas of software security architecture and network computing technologies. He holds a degree in Computer Science from the University of Bremen, Germany. He has 19 years of experience in a variety of areas related to workstation and systems management, network computing, and e-business solutions. Before joining the ITSO in March 2000, Axel worked for IBM in Germany as a Senior IT Specialist in Software Security Architecture.

Franc Cervan is an IT Specialist working in Technical Presales for the IBM Software Group, Slovenia. He holds a diploma in Industrial Electronics from the University of Ljubljana and has 10 years of experience in security and systems management solutions. Since joining IBM in 2003, his areas of expertise are Tivoli Security and Automation products.

Christian Chateauvieux is a Consulting IT Specialist helping and mentoring the IBM Tivoli Software Technical Sales Teams across the EMEA geography. He is a technical advocate of Tivoli Security solutions, promoting and supporting the sales and marketing initiatives associated with the Tivoli Directory portfolio and the rest of the IBM Tivoli Security portfolio, including Tivoli Identity Manager and Tivoli Access Manager in EMEA. He is an expert in Tivoli Directory products and joined IBM in 2002. Prior to this he spent two years in Metamerge® professional services and support. Christian holds a master's degree in Computer Sciences from the National Institute of Applied Sciences (INSA) in France and is ITIL certified.

David Druker is a Consulting IT Specialist for Tivoli Security products. He currently works in the IBM Channel Technical Sales organization and is a recognized authority on IBM Tivoli Directory Integrator solutions. David holds a Ph.D. in Speech and Hearing Science from the University of Iowa. He joined IBM in 2002. Prior to that, he wrote code, built scientific apparatus, and managed a variety of systems in both business and scientific enterprises.

Eddie Hartman is part of the Tivoli Directory Integrator development team, working with design, documentation, and storytelling. Eddie studied Computer Science at SFASU in Nacogdoches, Texas, and at the University of Oslo in Norway.

Rana Katikitala is an Advisory Software Specialist for Tivoli Security in the IBM Software Labs, India. He has eight years of experience in the IT industry in the areas of development, support, and test of operating systems, systems management software, and e-business solutions. He holds a master's degree in Structural Engineering from Regional Engineering College (REC) Warangal, India. His areas of expertise include IBM OS/2®, Windows® 2000, Netfinity® Manager™, IBM Director, healthcare domain solutions for HIPAA (Health Insurance Portability and Accountability Act) and HCN (Healthcare Collaborative Network), and Tivoli Security solutions.

Elizabeth Melvin is a Certified Consulting IT Specialist in Austin, Texas, working for the IBM TechWorks Americas Group as a subject matter expert supporting software sales. She has 16 years of experience in a variety of areas including systems security, identity/data management, and architecture as well as network computing. She holds a degree in Management of Information Systems from the University of Texas in Austin. Her areas of expertise include security infrastructure and data synchronization software.

Todd Trimble is a Certified IT Product Specialist. He is ITIL certified and has 25 years of experience in the security and systems management solutions area. Todd joined IBM in 1998 and has been working with the Tivoli Security products on major customer engagements. He is responsible for providing a validated technical solution that resolves the identified business requirements and eliminates the technical issues and concerns prior to the sale of the IBM Tivoli Security portfolio.
Johan Varno is the Lead Architect for Tivoli Directory Integrator at the IBM Oslo Development Lab in Norway. He holds a degree in Computer Science from the University of Oslo and an MBA from the Norwegian School of Management. He has 24 years of experience in a variety of areas relating to network technologies, software development, and business development. Prior to working at IBM, Johan was cofounder and CTO of Metamerge.

Thanks to the following people for their contributions to this project:

Keith Sams, Jay Leiserson, Bob Hodges, Ralf Willert, Rudy Sutijiato, Cameron MacLean, Kraicho Kraichev, Lanness Robinson, Jason Todoroff
IBM US

Yogendra Soni
IBM India

David Moore
IBM Australia

Gabrielle Velez
International Technical Support Organization

Become a published author

Join us for a two- to six-week residency program! Help write an IBM Redbook dealing with specific products or solutions, while getting hands-on experience with leading-edge technologies. You'll team with IBM technical professionals, Business Partners, and/or customers. Your efforts will help increase product acceptance and customer satisfaction.

As a bonus, you'll develop a network of contacts in IBM development labs, and increase your productivity and marketability.

Find out more about the residency program, browse the residency index, and apply online at:

Comments welcome

Your comments are important to us! We want our Redbooks™ to be as helpful as possible. Send us your comments about this or other Redbooks in one of the following ways:
- Use the online Contact us review redbook form found at:
- Send your comments in an e-mail to:
- Mail your comments to:
  IBM Corporation, International Technical Support Organization
  Dept. OSJB Building 905
  11501 Burnet Road
  Austin, Texas 78758-3493
Part 1. Architecture and design

In this part, we introduce the general components of the IBM Tivoli Directory Integrator V6 and what it has to offer in the directory synchronization field of the overall security architecture. After talking about business context, architectures, and design, Part 2, “Customer scenarios” on page 89 provides solution-oriented scenarios with technical hands-on details.

© Copyright IBM Corp. 2006. All rights reserved.
Chapter 1. Business context for evolutionary integration

The deployment of a new IT system, such as an enterprise portal or a single sign-on service, usually requires integration with existing data in the organization. Sometimes the new system can plug directly into what exists, but very often, and for different reasons that will be described later, this is not the case. The usual approach to the problem is some combination of copying, merging, modifying, or synchronizing data between two or more systems, such as files, databases, directories, enterprise applications, or other repositories.

When choosing an integration approach there are a number of issues to be considered, such as technical consequences and limitations, availability, security, and governance; as well as selecting a solution that balances costs, maintainability, and future flexibility.

As a general purpose integration toolkit, IBM Tivoli Directory Integrator (TDI) represents an easy-to-use, rapidly installed, incremental, re-usable framework, well suited for maintainability, and offering great flexibility in terms of alternate approaches to solving almost any integration challenge. We will look at some examples and scenarios to illustrate this flexibility in this chapter. The remainder of this book describes the architecture and design of Directory Integrator and looks in depth at how two different business cases can be addressed with Tivoli Directory Integrator.
1.1 A close look at the challenge

Nobody wants to shake the infrastructure too hard. It's holding up the house. Furthermore, it has grown to fit, the result of evolution: natural selection; survival of the highest switching cost. And yet, businesses still undergo the expense and trauma of infecting their infrastructure with new software. And they usually do it for the same reason: to increase value produced by the organization while decreasing the cost involved in its production. The goal is to improve organizational efficiency, quality, traceability, agility, or all of the above. But when companies tamper with the underpinnings of the enterprise, they tread softly; sometimes so softly that initial goals evaporate down to just getting new software deployed and running.

This task would be less formidable were it not for the riddle of shared data. Applications need data—annoyingly often the same data. Since most of these products are engineered independently of each other, they probably don't see eye-to-eye on how data is handled. This includes home-grown solutions as well as commercial products, even many built by the same vendor. Some use standards, while others maintain their switching costs with proprietary approaches. And even if two systems agree on a common data store, they probably do not concur on its structure. So you end up with multiple data sources carrying bits and pieces of the same information. Disparate pockets of data, with dependent systems in a tight orbit around them.

Experience shows that this sort of data fragmentation is the rule rather than the exception. It is the result of the evolutionary, periodically explosive growth of a company's machine and software infrastructure, and sustained by the constant fear of breaking something important. Terms like golden directory are born of this inhibiting, but justifiable fear. And when enough data sources are golden, the infrastructure becomes very heavy.
It solidifies and loses agility, making the ordeal of adding new systems and services even more painful. Nobody plans for this to happen. It is the natural result of unresolved governance. Intrinsically, applications presume ownership of their own data—a presumption likely shared by their principal users in the organization. This works fine for some types of information, but fails dramatically for others; for example (but not limited to) identity data.

Let us rephrase that. Nowhere is this more true than for identity data. Organizations often discover that their identity information data and structure is, more often than not, owned by everybody, and yet by nobody in the organization.
This apparently contradictory statement refers to the fact that information about people in the organization is typically managed in multiple places, yet not coordinated in terms of governance or data structure. This is not a big problem when applications and user data live in isolation, for example information about employees residing solely in the HR system and users in the LAN directory1. This indiscretion is often tolerated until the risks involved become too great (or sometimes, until they simply become obvious).

The proliferation of user registries and the ensuing security exposure make the argument for directory integration particularly compelling: an employee may be terminated, but there's no guarantee that there won't be access rights left in some subset of directories, invisibly providing unwarranted access privileges. Sanctioned users are burdened with a multitude of user names and passwords spread all over the place, each of which they must remember and maintain separately, and which they probably write down somewhere. This in itself represents a security risk, in addition to the productivity loss caused by inconsistent provisioning. Not to mention increasingly tougher audit requirements (for example, the Sarbanes-Oxley Act2) forcing people to get serious about traceability and security.

Moreover, identity data fragmentation becomes a serious roadblock as organizations increasingly implement large-scale, cross-organization solutions that require consistent data, managed in a 24x7 environment, scalable for growing usage and demands, and possibly including customers and partners. Deploying enterprise portals and services (like simplified or single sign-on) without an enterprise view of identities is practically impossible.
Success, for both tactical deployments and continued strategic growth, hinges on tying the chaos of existing user registries into a holistic model.

Although the utopian proposition is to condense disparate registries down to a single physical directory, the multitude of identity stores won't be going away as long as applications depend on them in their own specific ways. As a result, the common approach to addressing data fragmentation is with integration tools that allow silos to stay in place, but give the appearance of unified access. Ideally, with tools for building integration through careful evolution, rather than revolution. This means that deployment is broken into measured steps, bringing new systems and repositories into the picture over time. If the process is planned correctly, ROI can begin as soon as the first sub-step is complete.

This document is not about implementing a single enterprise-wide directory that becomes the master for all others, although such can certainly be implemented with Tivoli Directory Integrator. However, it is about the options available with Tivoli Directory Integrator to deal with the wide spectrum of integration challenges encountered when deploying identity-based applications in the enterprise.

1 Even though integration at this stage also makes sense from a security and data integrity perspective.
2 More information about the Sarbanes-Oxley Act can be found at:

Chapter 1. Business context for evolutionary integration 5

1.2 Benefits of synchronization

When implementing a synchronization solution, the result is an environment where shared data looks the same for all consuming applications. This is because changes are propagated throughout the synchronized network of systems, molded in transit to fit the needs of each consumer. Each data source is kept up-to-date, maintaining the illusion of a single, common repository. Each application accesses its data in an optimal manner, utilizing the repository to its full potential without creating problems for the other applications.

Synchronization strategies are increasingly the choice for deploying new IT systems. For identity management, this is usually a centralized or metadirectory style synchronization, where a high speed store (like a directory) is used to publish the enterprise view of its data. This approach has a number of advantages:

- Security requirements vary from system to system, and they can change over time. A good repository (like a directory) provides fine-grained control over how each piece of data is secured. Some provide group management features as well. These tools enable you to sculpt the enterprise security profile as required.
- Each new IT deployment can be made on an optimal platform instead of shoe-horned between existing systems into an uninviting infrastructure. Applications get to live in individually suited environments bridged by metadirectory synchronization services.
- If the availability and performance requirements are not met by some system (legacy or existing, or new), it can be left in place and simply synchronize its contents to a new repository with the required profile; or multiple repositories to scale. A metadirectory uncouples the availability of your data from that of its underlying data sources. It cuts the cord, making it easier to maintain up-time on enterprise data.
- Disruption of IT operations and services must be managed and minimized. Fortunately, the metadirectory's network of synchronized systems evolves over time in managed steps. Branches are added or pruned as required. Tivoli Directory Integrator is designed for infrastructure gardening.
A good metadirectory provides features for on-demand synchronization as well3. Sure, joining data dynamically can be prohibitively expensive in terms of system and network load; but sometimes it's the optimal solution.

1.3 Directory Integrator in non-synchronizing scenarios

While Tivoli Directory Integrator is a powerful tool to deal with a large number of synchronization scenarios, its core is a general purpose integration engine that can be used by other systems in real-time, providing these systems with very interesting capabilities. Below are some examples of deployed solutions to illustrate such usage:

- A mainframe application sends MQ messages that Tivoli Directory Integrator picks up, then accesses other data systems in the enterprise, performs some operations and transformations on the data set, and responds back through MQ to the mainframe.
- The Tivoli Access Manager SSO (single sign-on) service calls Tivoli Directory Integrator during user login in order to authenticate their credentials against one or multiple systems not supported out-of-the-box by Tivoli Access Manager. Automatic provisioning of new users is done as required.
- Tivoli Directory Integrator monitors the operational status of an LDAP directory and sends SNMP traps to enterprise monitoring systems.
- A SOA-based application calls Tivoli Directory Integrator through Web services, and Tivoli Directory Integrator writes data to specially formatted log files and updates databases.
- Tivoli Directory Integrator intercepts LDAP traffic to transparently make multiple directories look like one to an LDAP client application.

As in all Tivoli Directory Integrator solutions, any number of Tivoli Directory Integrator connectors, transformation, and scripting can be brought to bear on the data flow. As seen from the above deployments, Tivoli Directory Integrator isn't limited to synchronizing data. The next sections provide additional scenarios and examples that illustrate how Tivoli Directory Integrator is inserted into a data flow, enabling real-time operations to be executed that otherwise would have required complex and custom code.

3 In addition to change-driven, schedule-driven and event-driven.
1.4 Synchronization patterns and approaches

This section takes a look at synchronization from a conceptual perspective. First, we look at how and when, meaning how Tivoli Directory Integrator is invoked to perform its work. Then we look at some of the typical data flow patterns that are encountered.

1.4.1 How and when synchronization can be invoked

Tivoli Directory Integrator-based synchronization solutions are typically deployed in one of the three following manners, although combinations are also frequently used to enable the various data flows that the entire solution requires:

- Batch - In this mode Tivoli Directory Integrator is invoked in some manner (through its built-in timer, command line, or the Tivoli Directory Integrator API), and expected to perform some small or large job before either terminating or going back to listening for timer events or incoming API calls. This is often used when synchronizing data sources where the latency between change and propagation is not required to be near real-time.
- Event - Tivoli Directory Integrator can accept events and incoming traffic from a number of systems, including directory change notification, JMX™, HTTP, SNMP, and others. This mode is typically used when Tivoli Directory Integrator needs to deal with a single, or a small number of, data objects.
- Call-reply - This is a variation of the event mode, but the difference is that the originator of the event expects an answer back. IBM products use the Tivoli Directory Integrator API to call Tivoli Directory Integrator, and solutions in the field often use HTTP, MQ/JMS, and Web services to invoke a Tivoli Directory Integrator rule and get a reply back.

There is no single answer to the question of when to choose between batch or event-driven integration. For example, enterprises have varying requirements regarding the propagation of identity data. Delays can be acceptable in the seconds, minutes, and even in the hours range. It must also be determined whether the data sources can provide a data change history (LDAP directories often have changelogs) or notification mechanisms when data changes. Tivoli Directory Integrator can be utilized both as a batch system, checking for changes every so often, as well as a notified system, reacting only when the source system sends a data change notification. Also keep in mind that the above modes are not exclusive of each other; all of them can be utilized in the same Tivoli Directory Integrator deployment.
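To make the three invocation styles concrete, the sketch below wraps the same core flow logic in each of them. This is illustrative Python, not the Tivoli Directory Integrator API; every function and field name here is a hypothetical stand-in.

```python
# Hedged sketch: the same core flow exposed in the three invocation styles
# described above. None of this is the TDI API; names are illustrative only.

def transform(entry):
    """Core flow logic shared by every mode: normalize one source record."""
    return {"uid": entry["uid"].lower(), "mail": entry.get("mail", "").lower()}

def run_batch(source_entries):
    """Batch: triggered by a timer or the command line, processes a whole job."""
    return [transform(e) for e in source_entries]

def on_event(entry, target):
    """Event: reacts to a single change notification; no answer is returned."""
    target[entry["uid"].lower()] = transform(entry)

def call_reply(request):
    """Call-reply: like event mode, but the originator expects a reply back."""
    return {"status": "ok", "result": transform(request)}
```

All three wrappers can coexist around one shared transformation, mirroring the point above that the modes are not exclusive of each other within a single deployment.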
1.4.2 Data flow patterns

Tivoli Directory Integrator is often used to implement not just one, but a number of data flows. Data can flow from one system to another, but also from many systems to one. As a system becomes the source of data from many systems, it often evolves to the next stage, where it becomes the source for updates into many others. It is important to understand and then map the intended flow of data.

Although the current infrastructure does not yet look like the picture in Figure 1-1, it does illustrate that the enterprise applications are being rolled out with increasing speed in large organizations. These systems often do not share identity repositories (although the same directory may host several instances), simply because the applications have diverging requirements on data format, as well as the system owners have different perspectives on how to manage and access the identity data. A well-crafted integration solution will let each business owner have full control of their data system, while ensuring that common data is kept in harmony across the entire infrastructure.

Figure 1-1 IT infrastructure example (the figure shows enterprise applications such as single sign-on, provisioning, the LAN, a portal, personal profile, personalization, white pages, and content management, each with its own identity data)

A commonly underestimated part of synchronization projects is the planning of data flows. Successful deployments document the flow of attributes at an early stage and therefore identify the number and type of data flows required. A project might look very complicated at first glance, but once the flows are identified, the project can be approached in incremental steps.
Although the project could at first glance look like a very complex many-to-many data flow scenario, it might after inspection reveal itself to be a number of simple one-to-one, many-to-one, or one-to-many data flows. Next, we take a look at these simple data flow patterns that a project typically consists of.

One-to-one data flow

The simplest data flow is the copying or synchronizing of data from a single source to a single target. However, just because the flow is simple, there can be any kind of transformation performed on the data, either in content, syntax, format, or protocol. Here are some examples of such data flows:

- Updating a database with data from a file that was made available as a report from another system.
- Generating a file that contains changes made in a database.
- Keeping a directory synchronized with another, transferring only changes as they occur on the source directory.
- Reading an XML file and writing a CSV formatted file with a selected subset of the XML file.

Even though the flows above are conceptually simple, transformation of the data might be required that introduces complexity. For example, when dealing with identity data, there could be a requirement to join a number of groups into a single one in the target directory. This join could have further restrictions based on other data in the source system, such as address, department, or job function.

Many-to-one data flow

As previously discussed, data ends up in multiple repositories for a number of good reasons. (The accompanying figure shows several sources, such as an e-mail system, a directory, a database, and a file, feeding a single directory through TDI.) As this happens, additional context is built into the systems as well. Both explicit and implicit relationships between the data are established, which are lost when just copying the data to a new system. Furthermore, the existing systems continue to be updated and managed as before, so copying data quickly loses its relevance.
Sometimes a federated approach can be used to access this data set in real-time, but often this is not acceptable because of performance or availability requirements. Therefore, a synchronization data flow must involve multiple source systems in the process of maintaining a target system with the re-contextualized data. A many-to-one data flow uses the source systems for purposes such as verifying information, making decisions in the data flow, and merging (joining) additional attributes to the initial data set that is intended for the target system.
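The merging (joining) step of a many-to-one flow can be sketched minimally as follows. This is illustrative Python, not TDI's connector or AssemblyLine join machinery, and the `uid` key and attribute names are assumptions rather than a real schema.

```python
# Hedged sketch of a many-to-one merge: extra attributes from lookup sources
# are joined onto each primary record by a shared key. Illustrative only;
# a real TDI flow would do this with connectors and join logic.

def join_sources(primary, lookups, key="uid"):
    """Merge attributes from each lookup source into the primary records."""
    indexed = [{rec[key]: rec for rec in src} for src in lookups]
    merged = []
    for rec in primary:
        out = dict(rec)                          # start from the primary data
        for idx in indexed:
            for attr, value in idx.get(rec[key], {}).items():
                out.setdefault(attr, value)      # primary data wins on conflict
        merged.append(out)
    return merged
```

Using `setdefault` keeps the primary source authoritative when the same attribute appears in several places; a flow that trusts a lookup source more would overwrite instead, which is exactly the kind of decision-making in the data flow described above.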
One-to-many data flow

The illustration does not fully describe the combinations that are possible in one-to-many scenarios. (The accompanying figure shows TDI propagating data from one source out to e-mail, directory, database, and file targets.) The main point is that data needs to be updated, maintained, or created in several places. For example, as e-mail addresses are added in the e-mail directory, Tivoli Directory Integrator ensures that this is updated in the single sign-on directory for authentication purposes. However, the ERP system also likes to subscribe to this information as it is used in automated ERP-based messages to employees. So in this example, Tivoli Directory Integrator would update both the SSO directory as well as the ERP system as part of a data flow. Another example is propagating password changes in a directory to a number of other directories.

In one-to-many data flows it is important to consider what could happen if a flow was interrupted and data not updated in all systems as was expected. In transactional systems, roll-back is used to reset the involved systems to the state they had before the data flow started. However, in most identity synchronization projects, this is not much of a problem since the entire data flow can be repeated—it is not like transferring the same amount of money twice to another bank account. However, roll-back or compensating logic can be added to a Tivoli Directory Integrator solution should this be required.

1.5 Business and technical scenarios

The previous section looked at synchronization concepts in general. Also, some of the benefits of synchronization were discussed in another section. Now we investigate some real-life scenarios to illustrate the business context. The examples below are intended to bring them to life so that the reader can more readily recognize and identify synchronization opportunities when faced with a new business or technical deployment challenge. The fictional company PingCo is used to illustrate the scenarios.
Let us now look at a few identity use cases to illustrate the issues that throw wrenches into the machinery that organizations have spent years building.

1.5.1 Multiple existing directories and security concern

PingCo is building a portal that will be used by both employees and external customers. PingCo has already implemented separate employee and business partner directories, but the employee directory is on the corporate intranet and will not be made accessible to non-VPN external users. The portal will be placed
in the DMZ, with no access into the internal network. One solution is to use Tivoli Directory Integrator to synchronize the employee and the business partner directory into a new directory placed in the DMZ. Only the necessary information about the employees is transferred into the DMZ directory to reduce security exposure. PingCo can choose whether or not to securely synchronize the employee passwords into the external directory, or create new passwords (but the same user name) for employees that access the external portal.

The above scenario could be modified to include organizations with many internal directories, possibly managed by separate business units or other organizational entities that challenge coordination of efforts. Synchronizing the content (with possible filtering of data) from the directories lets them keep ownership of data, yet enables common applications to be deployed on the joint set of identity data on a new directory that reduces the dependence on each sub-directory with minimum performance impact.

1.5.2 Existing directory cannot be modified

PingCo intends to deploy an enterprise single sign-on (SSO) service and has a directory with all employees. However, for some reason PingCo cannot let the SSO service use the existing directory directly. Sometimes directories are only accessed in read-only mode, but sometimes applications that use directories need to store data in them as well. That can become a hurdle for reasons such as:

- Technical. The existing applications that use the directories cannot deal with this change.
- Availability. The business owners of the existing directory are not able to meet the availability requirements of an enterprise (and possibly cross-enterprise) SSO service.
- Governance. Existing business owners of the directory don't want others to modify a system that they own and manage.
- Performance. The added performance impact of the SSO service could extend beyond what the directory platform can provide.
- Security. Although the user names are already there, the SSO service adds new data that might be considered even more sensitive.

The solution in this case is a simple synchronization to a new directory. It could even be a separate logical directory tree on the same machine or an entirely different directory implementation on a more scalable and secure physical machine. PingCo would have the choice of where passwords are managed and changed. Any change to one directory would immediately be made on the other as well.
With IBM's SSO (single sign-on) offering, Tivoli Access Manager, there is an additional option available as described in the following section. That scenario works with a single directory for Tivoli Access Manager authentication, but keeps all other data in a separate and secure directory.

1.5.3 Single sign-on into multiple directories with Access Manager

PingCo intends to implement a single sign-on service with Tivoli Access Manager, and users are defined in multiple directories. Tivoli Directory Integrator integrates with Tivoli Access Manager Version 5.1 and later through its EAI (External Authentication Interface) so that Tivoli Directory Integrator can authenticate users across any number of back-end sources that Tivoli Directory Integrator supports. For example, when a user provides credentials to Tivoli Access Manager, Tivoli Directory Integrator is invoked and then attempts to authenticate into a number of directories with custom filters and modifications to the base credentials. Tivoli Directory Integrator can also look at the supplied credentials and do direct authentication to a target directory rather than trying all of them if such information is available.

1.5.4 Data is located in several places

PingCo intends to deploy a portal-based application that requires information about employees, their work location, as well as who their manager is. This information does exist in the infrastructure, but not in a single location. There are directories that contain both unique and overlapping information about employees. The HR system knows about work location and the managers of the employees. To make things even more complicated for the solution architect, the HR group is not willing to provide direct access to their system, but is willing to provide a weekly report with the required information.
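Detecting what changed between two deliveries of such a weekly report is a small, self-contained piece of this kind of solution. The sketch below shows one way to classify records as added, modified, or deleted; it is illustrative Python with an assumed `uid` key field, not TDI's own delta or changelog machinery.

```python
# Hedged sketch: compute the delta between two snapshots of a feed (for
# example, last week's and this week's HR report). The "uid" key field is
# an assumption; a real feed would use whatever unique key it carries.

def diff_feeds(previous, current, key="uid"):
    """Classify records as added, modified, or deleted between two feeds."""
    prev = {rec[key]: rec for rec in previous}
    curr = {rec[key]: rec for rec in current}
    added = [curr[k] for k in curr.keys() - prev.keys()]        # new keys
    deleted = [prev[k] for k in prev.keys() - curr.keys()]      # vanished keys
    modified = [curr[k] for k in curr.keys() & prev.keys()      # shared keys
                if curr[k] != prev[k]]                          # whose data changed
    return added, modified, deleted
```

Comparing full snapshots like this is the batch-style fallback when a source offers no changelog or notification mechanism, which is exactly the situation a weekly report creates.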
This is a classic example of where Tivoli Directory Integrator can bring order to the chaos by connecting to all of the directories, identifying the unique set of users, and merging that data with the weekly feed from HR. The end result is a directory where all information is collected and users have work location and manager information added in from the HR system. Once the initial job has been completed, Tivoli Directory Integrator continues to monitor the sources for changes, including the weekly report from HR, and identifies the records that have been added, modified, and deleted.

1.5.5 Use of virtual directory - access data in place

PingCo needs to authenticate users against one or more directories that cannot be synchronized, possibly because they belong to somebody else who does not allow this to be done. If PingCo uses Tivoli Federated Identity Manager or Tivoli
Access Manager, then there are authentication plug-ins available (using the External Authentication Interface) to Tivoli Directory Integrator. However, in other situations, Tivoli Directory Integrator can intercept LDAP messages and forward them to one or more LDAP directories using round-robin, chaining, or other custom logic on behalf of the client. This scenario is often described as a virtual directory approach since the client does not need to know that it's actually communicating with a number of directories in real-time.

This approach has some apparent benefits (and sometimes offers the only practical option), such as leaving data in place, removing the requirement for synchronization. However, there are both short-term and long-term issues that should be considered:

- Availability - Some attribute relationships cannot be reliably resolved in real-time due to unstable systems, scheduled maintenance, broken links, latency, firewalls, and so forth; or because some relationships are too complex to resolve quickly. Synchronization can spend the time it takes to map their data.
- Performance - A virtual directory imposes itself into every data access operation. A separate synchronized directory maximizes performance while it maintains the enterprise view via change-based synchronization. Performance requirements are often underestimated as the use of new enterprise applications often grows past what was initially assumed. This is especially true for enterprise portals and single sign-on projects, where a successful deployment creates major benefits, but increases resource consumption.
- Reliability - The virtual directory is dependent on all connected systems being available and online. The owners of those systems might not be willing to provide that level of service to the rest of the enterprise. A synchronized solution will always be available, and there is no impact of an off-line subsystem. Also, if the synchronization engine (not the synchronized directory itself) is offline, data gets out-of-date. This is amended as soon as the synchronization is restarted. If the virtual directory is down, all dependent applications are down as well.
- Agility - New enterprise data means new data relationships, so with both approaches the integration solution must be updated to include these. However, the out-of-band nature of synchronized solutions significantly facilitates maintenance and upgrade since data flows and integration flows can be added without impacting the operational availability of the directories.
- Scalability - Virtual directories can't scale the way real directories can. Even with caching, they will always be limited by the scalability of the systems with the source data. Furthermore, a good enterprise directory can be massively scaled in multi-master-slave configurations for high performance.
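The round-robin forwarding mentioned above can be reduced to a very small dispatching idea. The sketch below is illustrative Python only, not an LDAP proxy or anything from TDI; `backends` stands in for whatever callable endpoints a real deployment would wrap.

```python
# Hedged sketch of round-robin forwarding: each incoming request is handed
# to the next backend in turn. Purely illustrative; a real virtual directory
# would also handle LDAP protocol details, failures, and result merging.

import itertools

def make_round_robin_router(backends):
    """Return a function that forwards each request to the next backend."""
    cycle = itertools.cycle(backends)      # endless rotation over the backends
    def route(request):
        backend = next(cycle)
        return backend(request)
    return route
```

Even this toy version shows the reliability point above: every request passes through the router and depends on the chosen backend being up, which is the coupling a synchronized directory avoids.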
1.6 Conclusion

Synchronization introduces a number of benefits to the architectural design of new enterprise solutions. Rather than trying to craft an optimal situation, synchronization can provide a pragmatic approach that is less costly to build and maintain, while adding operational benefits such as performance, availability, and agility. These benefits certainly do not apply to all scenarios, but on the other hand they are often not evaluated: the architectural 20-20 vision prevails where the pragmatic mind would have provided quicker time to value, as well as a more future-proof solution, since changes are often less predictable than we would like.
Chapter 2. Architecting an enterprise data synchronization solution

How do you eat an elephant? The answer is one bite at a time. The Tivoli Directory Integrator getting started guide describes this as the best practice for solving large data synchronization problems as well. The key to success is to reduce complexity by breaking the problem up into smaller, manageable pieces. This means starting with a portion of the overall solution, preferably one that can be completed in a week or two. Ideally, this is a piece that can be independently put into production. That way, it is already providing return on investment while you tackle the rest of the solution.

This is also the best practice approach for gathering the necessary information to craft a successful enterprise data synchronization solution. This chapter outlines a series of questions that need to be answered prior to the installation of the product, or the creation of a single AssemblyLine. The goal is to collect the necessary information that will allow you to easily build, deploy, and manage a successful Tivoli Directory Integrator solution. Simply consider this a necessary step before you get to enjoy using the product. At a minimum, you must be able to answer the following questions:
What typical business requirement is Tivoli Directory Integrator trying to solve?
What data stores are required to solve the problem?
How can you instrument and test the solution?
Who is responsible for what activity?

2.1 Typical business requirements

Tivoli Directory Integrator is a truly generic data integration tool that is suitable for a wide range of problems that would usually require custom coding and significantly more resources to address with traditional integration tools. It is designed to move, transform, harmonize, propagate, and synchronize data across otherwise incompatible systems. However, before the tool can be used, it is necessary to understand what has brought about the data synchronization requirement. For example, is it the result of a company's acquisition of another firm, in which case the acquired company's users need to be integrated and kept in synch with the parent company's data stores, thereby providing a common data source to be used in the development of a new enterprise application? A secondary goal may be the synchronization of user passwords. Tivoli Directory Integrator can also be used in conjunction with the deployment of the IBM Tivoli Identity Manager product to provide a feed from multiple HR systems, as well as to function as a custom Identity Manager adapter. Both of these scenarios are expanded upon later in this book.

Regardless of the scenario, it is essential to gain a full understanding of the environment. This allows you to document the solution. Typically this is accomplished by developing a series of use cases that are designed to clarify the business needs and refine the solution through an iterative process that ultimately provides you with a complete list of documented and agreed-upon customer business requirements.
For example, is the data synchronization solution viewed as business critical, and will it need to be instrumented as a high availability solution? Or is a guaranteed response time a business requirement that has to be addressed? It is important to point out that in most cases you are manipulating user identity data. As such, the appropriate security safeguards for privacy and regulatory compliance requirements need to be addressed during the requirements gathering phase.
The ultimate goal is to determine how the information will need to flow through the enterprise to solve the stated business requirements. This is the essential first step in breaking down the complex problem of enterprise data synchronization into manageable pieces. At a minimum, the solution architect will need to provide:

An agreed-upon definition of the business requirements and the translation of the business objectives into concrete data and directory integration definitions.

A concise understanding of the various data stores that are part of the solution, under what circumstances the information needs to flow through the organization, and the authoritative source for each data element that will be managed.

The diagram in Figure 2-1 depicts the various steps required to instrument an enterprise data synchronization solution.

Figure 2-1 Solution architecture process flow

(The figure shows a cycle of disciplines feeding the data synchronization solution built with Tivoli Directory Integrator: business requirements (business scope, business benefits); detailed data identification (location/data source, owner, access, initial format, unique data); plan data flows (authoritative attributes, unique link criteria, special business requirements, final data format, data cleanup, phased approach, frequency); review results (enables initial design documentation and communication); and instrument and test (workable units, naming conventions, availability/failover, system administration, security, password synchronization).)

It is important to note that some of the elements in the process flow described in the figure are outside of the Tivoli Directory Integrator product sphere, indicated by not being placed completely inside the grayed-in area. Those found entirely inside of the grayed-in area are wholly a part of the solution. Let us take a closer look at each of the different disciplines in order to clarify what we mean.
2.2 Detailed data identification

This section discusses the best practice for identifying the nature of the data required to solve the defined business problem. Once the business requirements and corresponding use cases have been clearly stated and agreed upon, the next step in architecting a data synchronization solution is to identify the nature of the data that will be utilized. At a minimum, the solution architect will need to be able to:

Identify as much as possible about the data.
Provide a document that describes the data flow.
Describe how the results of the first two steps will be reviewed.

By following this best practice technique of identifying, planning, and reviewing the nature of the data, the solution architect will be able to craft technical solution requirements and a design that match the driving business needs. To continue with the best practice of simplifying a complex problem, the systematic definition of the required data will further simplify the task of creating a successful project. Detailed data identification starts with the understanding that this is the time when the business-based use cases are used to add more clarity to what is to be accomplished. At a minimum the solution architect must identify the following:

Data location
Data owner
Data access
Initial data format
Uniqueness of data

2.2.1 Data location

The location of the data is typically the primary factor in determining the ultimate solution design and architecture. The solution architect will be required to identify both the physical and logical location of the data to be used to satisfy the use case. Some examples of physical location considerations: the data exists in a specific regional location, resides on a particularly slow or fast hardware platform, or happens to be limited in accessibility due to distance or network speed. These factors are used when planning data flows and designing the physical architecture of the data synchronization solution.
The logical location of the data translates very specifically to IBM Tivoli Directory Integrator components that are mentioned in the following chapter. By
determining the data sources in the use case, the solution architect can then determine the type of connection to be used along with the underlying technology to be utilized. An example of identifying the logical location of data might be a use case that involves synchronizing data located within a directory server. The logical location of the directory server's data would be described by the server name and/or IP address. The underlying technology used to connect to a directory server would typically be the LDAP protocol, or possibly an LDIF file. Similarly, if the use case incorporated the use of a database, the data source would be identified as possibly relational in format and accessible via a JDBC technology connection.

2.2.2 Data owner

Determining the owner of the data helps the architect identify any possible requirements introduced to the solution due to privacy or compliance concerns. Does the data have a requirement to be handled in a special way, or is it even possible to use the data within the desired use case given its current location and form? Regulatory and corporate policies should be reviewed with the data owner at this time as well.

2.2.3 Data access

The data owner is often the same organization or person who provides the data access, but this is not always the case. Data access involves determining what level of access can be granted to the data store or source so that the required attributes can be synchronized. An example of this is a business use case that requires the solution to synchronize to an LDAP server. A best practice would be for the owner of the LDAP server to provide an individual login account with special privileges just for Tivoli Directory Integrator to use. This allows the server owner to track the activity generated by the synchronization solution as well as effectively maintain any security policies the organization may have in place for that server.
If the solution only requires access to a specific container on that LDAP server, the login account could be limited to read and write privileges within that specified container. This is an example of where the solution architect would specify what access privileges are required to each data source in the use case.

2.2.4 Initial data format

Identifying the initial data format involves determining all the possible values each attribute could have when initially connecting to the data source. The
reason for this is that data values tend to show up in one of four states: null, blank, out-of-range, and valid. As such, the best practice is to determine how the solution will account for all four possible states, as well as how to handle any special conditions that could be encountered. For example, how does the solution resolve duplicate or multiple values?

Tip: A common pitfall many solutions encounter is the issue of converting integer value data to strings. This happens most often when synchronizing from a database if you are not careful to take note of the format of the field values. For example, many database fields designed to handle a numeric entry, such as employee number, use an integer format. Sometimes your data synchronization solution requires you to parse or otherwise process these values as though they were a string within IBM Tivoli Directory Integrator.

2.2.5 Unique data

The identification of unique data is typically accomplished at the same time that the initial data format is determined. Often the data values or attributes to be used are in a specific format that needs to be accounted for within the data synchronization solution.

Tip: For the advanced user, Tivoli Directory Integrator can help identify some of the specifics of the data through its data and schema discovery functions.

2.3 Plan the data flows

The second step of designing a solution deals with planning the data flows. Many times this occurs simultaneously with the data identification phase. At a minimum, the solution architect needs to identify the following details:

Authoritative attributes
Unique link criteria
Special conditions or business requirements
Final data format
Data cleanup
Phased approach
Frequency
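Before looking at each of these details, the four data states described in 2.2.4 can be made concrete with a short sketch in JavaScript (the language used for scripting within Directory Integrator). The function name, the employee-number attribute, and the valid range shown here are invented for illustration only; they are not part of the product or of any particular use case.

```javascript
// Classify an incoming attribute value into one of the four states
// discussed in 2.2.4: null, blank, out-of-range, or valid.
// The range check (1 to 999999 for an employee number) is an invented example.
function classifyValue(value) {
  if (value === null || value === undefined) {
    return "null";
  }
  // Coerce to string first: database fields such as employee number are
  // often integers, and string operations on them fail otherwise (the
  // pitfall mentioned in the Tip above).
  var text = String(value).trim();
  if (text.length === 0) {
    return "blank";
  }
  var employeeNumber = parseInt(text, 10);
  if (isNaN(employeeNumber) || employeeNumber < 1 || employeeNumber > 999999) {
    return "out-of-range";
  }
  return "valid";
}
```

A data flow would typically branch on the returned state, for example skipping blank values and logging out-of-range ones for later cleanup.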
2.3.1 Authoritative attributes

When planning the flow of data, identifying which attributes are authoritative in which data source is paramount. For example, an enterprise may determine that the human resources application is authoritative for all attributes describing an employee except for the employee's e-mail address; the e-mail server is considered the authoritative data source for the e-mail address attribute. Ideally, only one data store within the enterprise is identified as authoritative for each attribute. It is possible to have multiple data stores authoritative for the same synchronized attribute, the most common such attribute being the user password, but it is best not to let any attribute have more than one authoritative data source.

Tip: This is where the best practice mentioned earlier in the data access section, of having separate logins for each connection, comes in handy: you know who is changing which attribute in its authoritative data store.

2.3.2 Unique link criteria

When synchronizing data within an enterprise, it is a technical requirement to identify some way to link the data sources. Simply put, how do you identify the same user across multiple data stores? A common way to link multiple data stores is via a user's unique identification number. For employees, it tends to be their unique employee number; in some cases it is the e-mail address, and in others it is some combination of attribute values. If there is no pre-existing unique identifier between the data sources to be synchronized, one must be generated using some combination of attribute values or by applying the best available logic to the business case. Fortunately, Tivoli Directory Integrator provides a simple way to link data sources on very simple or detailed linking criteria.

2.3.3 Special conditions or requirements

In many cases, special conditions or requirements exist within the use cases.
This is often more obvious after the solution architect completes the detailed data identification process. A simple example of a special condition would be when the origination data source only contains the values of first name and last name for a user, and the requirement is to synchronize the full name into a new attribute in the destination data source. This is where the solution architect would note the condition required to concatenate the user's first name and last name to generate the full name.
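A minimal sketch of this condition in JavaScript follows. The attribute names (givenName, sn) follow common LDAP usage and are assumptions for illustration, not something mandated by the use case.

```javascript
// Sketch of the special condition described above: the source system only
// carries first and last name, while the target expects a full name.
// Attribute names follow common LDAP usage (givenName, sn) and are assumptions.
function buildFullName(entry) {
  var first = (entry.givenName || "").trim();
  var last = (entry.sn || "").trim();
  // Concatenate, tolerating a missing component rather than emitting
  // a stray leading or trailing space.
  return [first, last].filter(function (part) {
    return part.length > 0;
  }).join(" ");
}
```

Even in a transform this small, note how the null and blank data states from 2.2.4 are handled explicitly rather than assumed away.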
Another example of a special requirement might be that only users in certain departments have their e-mail address synchronized.

2.3.4 Final data format

When planning the flow of data for each use case, identifying the expected format of the data in the target systems is critical. The solution architect needs to resolve two concerns. The first is the identification of attributes that might have special or unique formatting of their data values. In some cases, this can create a requirement that alters the expected flow of data. A common example occurs when the use case requires the attribute for a user's manager to be synchronized into an LDAP data store. Since the solution architect previously identified the nature of the LDAP data store, they can determine whether the LDAP server requires the manager attribute to be in the data format of a fully qualified distinguished name. The second concern regarding the final data format involves what has been mentioned in 2.2.4, “Initial data format” on page 21: the solution must allow for handling any of the four possible data states (null, blank, out-of-range, and valid) in the expected output. This is less of an issue here; it occurs most often when the destination data store is being altered by many sources.

2.3.5 Data cleanup

At this stage of planning, it has most likely become apparent whether a separate or additional data flow is required to handle data that needs to be cleaned up or that has no matching attributes between the source and destination data stores. These two conditions are the most common and are often referred to as handling dirty data and creating unique link criteria. If it becomes apparent that this task is rather large, it is often necessary to plan a completely separate initial phase of the project to clean the data.
The on-going data synchronization will then focus on accommodating the initial and final data formats mentioned in previous sections and will have solved the unique link criteria requirements.

2.3.6 Phased approach

Often it is necessary to utilize a phased approach when planning your data flows. The need for a phased approach typically occurs when either there is a large amount of data cleanup required or the use case plans, over time, on
changing the data source for specific authoritative attributes. Some common phases in an enterprise data synchronization project are the following:

Phase 1 - Initial data cleanup and load.
Phase 2 - Synchronization of data from multiple sources to one data store, such as a directory server.
Phase 3 - The directory server is now the authoritative source for some attributes, and the synchronized data flow changes direction.

2.3.7 Frequency

Determining how often and when the data is to be synchronized for each use case is essential to planning the flow of data, and it also has an impact upon any guaranteed response times. For example, if the source data is only available or updated once a day, this will determine the configuration of the data flow. Frequency also ties in closely with the format and technology connection for the data. For example, if the use case requires the source data to come from a message queue, the data flow would be planned to frequently check the queue to process incoming requests. Determining the events that trigger the data flows helps to identify frequency.

2.4 Review results

The excerpt shown in Example 2-1 is a sample document that can be used to build the foundation for documenting a solution. Once completed, the documentation becomes a source for reference, approvals, and communication within the project.

Note: Be sure to include time in your project for documentation of your solution. At a minimum, plan on writing a functional specification and a test plan. With documentation, you will have a smooth transition into production and increased maintainability, and you can prevent possible project pitfalls should the data not be as expected. You will also find it vital for maintaining and enhancing your work.

Example 2-1 Human Resources to Corporate Directory data flow document sample

This document contains multiple data sources.
Let us take a look at data source one:

Data Flow: Human Resources database to Corporate Directory
Data source: Human Resources (DB2)
Connector type: JDBC
Parser: None
Connector mode: Iterator
Attributes: username, full name, employee ID, address
Multivalued attributes: None
Link criteria:
Special conditions: Make username in UID format using username and employee ID; make cn and sn out of full name
Security concerns: Use SSL

Here is data source two:

Data Flow: Human Resources database to Corporate Directory
Data source: Corporate Directory (IBM Tivoli Directory Server)
Connector type: LDAP
Parser: None
Connector mode: Update
Attributes: uid, cn, sn, givenname, objectclass
Multivalued attributes: objectclass
Link criteria: uid=username
Special conditions: Create multi-valued objectclass attribute
Security concerns: Use SSL

2.5 Instrument and test a solution

In this section we discuss some of the areas on which to focus once you have identified the data to be synchronized for your business use case, planned the corresponding data flows, and reviewed the results of your effort. Often it helps to keep these items in mind throughout the data identification process. You will most certainly want to address some or all of these topics as you move into the design of the enterprise data synchronization solution.
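Before moving on, the special conditions from the sample document in 2.4 can be sketched as a simple transformation. This is an illustrative guess at the rules: the exact UID format and the name-splitting logic would come from the agreed business requirements, and the last-word surname rule shown here breaks down for multi-word surnames.

```javascript
// Illustrative mapping of a Human Resources record to the Corporate
// Directory attributes listed in the sample data flow document:
// build a uid from username and employee ID, and derive cn, sn, and
// givenname from the full name. The concrete rules are assumptions.
function mapHrRecord(hr) {
  var parts = hr.fullName.trim().split(/\s+/);
  return {
    uid: hr.username + hr.employeeId,  // invented UID rule, e.g. jdoe + 1234
    cn: hr.fullName,                   // common name: the full name as-is
    sn: parts[parts.length - 1],       // surname: last word of the full name
    givenname: parts[0]                // given name: first word
  };
}
```

A review of such a sketch with the data owners is a cheap way to confirm that the documented special conditions actually match expectations before any AssemblyLine is built.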
2.5.1 Create workable units

As mentioned at the start of this chapter, the key to success is to reduce complexity by breaking the problem up into smaller, manageable pieces. Ideally, you identified a portion of the overall solution prior to this point in the process. Creating smaller workable units is an important part of being able to rapidly integrate and enhance your data synchronization solution; so much so that you will notice the theme of simplifying and solving is evident even in the architecture and component structure of Tivoli Directory Integrator.

Tip: When implementing your Tivoli Directory Integrator solution, a good practice is to keep the purpose of each AssemblyLine (data flow) as small as feasible while consolidating like functions. This facilitates development and troubleshooting and increases flexibility in implementation.

Up to this point, we have walked you through the key integration steps from which to build your data synchronization solution. You have identified the systems involved in the communications, the data flows between these systems, and the events or frequency of data that trigger the data flows. A common mistake occurs when there is an attempt to integrate too many data stores initially. While you begin to realize the power and flexibility of Tivoli Directory Integrator, keep in mind to instrument smaller units of work on which you can build.

2.5.2 Naming conventions

It is important to establish some naming conventions for your data synchronization solution. Start with creating a consistent way to identify the location of your data. When instrumenting the solution, this can translate into the Tivoli Directory Integrator connector names. For example, if the location of your data is on a directory server, you might place a suffix on your connector names such as LdapConn. A connector that updates the directory server might be called UpdateLdapConn.
Some choose to identify the data locations based on the name of the software, such as Tivoli Directory Server (TDS); in that case you might choose UpdateTDSConn. The point is to begin establishing naming conventions both for identifying the location of the data (the connectors) and for the data flows. It is a good idea to name your data flows to include a verb that helps identify what the flow does. This translates into the Tivoli Directory Integrator AssemblyLine component that is covered in the following chapter.
Tip: The use of special characters and spaces in naming AssemblyLines or other Tivoli Directory Integrator components is not a good idea, as it might cause problems later when you want to start Tivoli Directory Integrator from a command prompt to run your solution.

2.5.3 High availability and failover

When planning the data flows, it occurs to most solution architects that there will be requirements for their data synchronization solution to include some level of high availability and/or failover capability. While the following chapter and the solution scenarios highlight the capabilities and related components of Tivoli Directory Integrator, it is important at this point to identify your solution requirements as they relate to high availability and failover.

High availability typically translates to a data access probability greater than ninety-nine percent of the desired uptime and includes rapid recovery. Uptime, for most enterprises, is represented by 24x7 around-the-clock operation. This puts a strong emphasis on the availability of the applications, servers, and interfaces that an enterprise uses to deliver data to its users: applications such as Web servers, directory servers, and databases. Given this definition, it becomes apparent that in order to determine the high availability requirements for your data synchronization solution, you must also get an idea of the corresponding requirements for the connected systems involved in your solution. For example, if a connected system is only available to receive updates once a day, your synchronization solution would typically have reduced or low requirements for availability of data. The availability requirements of the data synchronization solution will help to determine the Tivoli Directory Integrator components and architecture to instrument.
Chapter 3, “Directory Integrator component structure” on page 41 provides more detail on the components and architecture with regard to availability, covering topics such as automatic connection reconnect and checkpoint/restart. When addressing availability, the topic of failover is often raised. The degree to which to plan for failover relates directly to the data synchronization solution's availability requirements. The goal of failover planning is to answer the question of what to do if some piece of the solution fails. The following outline provides questions and categories of things to consider when addressing availability and failover capabilities for your solution.

1. Determine the availability requirements for your solution. Most solutions can be categorized as high, medium, or low availability.
This list of questions can help identify availability requirements for your solution:

a. What are the business requirements for the data synchronization solution?
b. How do the business requirements translate to availability? There are some fundamental business/availability rules:
   - Desired availability, cost, and complexity are directly related.
   - Cost and complexity tend to dictate availability choices.
   - Every enterprise is different, based on its business values.
c. What is the availability of the data or connected systems to be synchronized?
d. Are there any special data conditions? For example, password synchronization requires high availability, while many data feeds from human resources applications occur only once daily.

2. Identify which types of failures need to be considered in order to provide adequate failover capability. The availability requirements will determine if your solution needs to address any or all of these types of possible failures. There are two main categories of failures for which to plan.

The first category relates to the overall data synchronization infrastructure. The main aspect to focus on is what happens if any or all of the systems your solution connects to go down. Identify what the solution must do when the following occurs:
- Connected systems fail.
- Power failure.
- Network failure.

The second category relates to the application environment for your solution, specifically the Tivoli Directory Integrator application. The focus is to answer the question of what happens if any piece or part of the Tivoli Directory Integrator solution fails.

Note: The highest exposure or risk to your solution is if your data synchronization solution requires high availability and fails while the connected systems remain intact.

Identify what the solution is to do when the following fails:
- The Tivoli Directory Integrator application goes down.
  This includes items like power, hard disk, and/or operating system failures.
- The data flows (Directory Integrator AssemblyLines) fail.
- The Directory Integrator server loses connectivity to one or more systems. This includes items like loss of network connection, data source, or authorization/access.

2.5.4 System administration

There are several items to consider when it comes to managing and maintaining your enterprise data synchronization solution. System administration tends to cover a broad range of topics. Some of the topics to be considered when architecting your solution include maintainability, configuration management, archiving and backup, logging and auditing, monitoring, and security of the solution.

Maintainability and configuration management

Maintainability and configuration management have to do with ensuring you account for items such as archiving and backup, version control, and determining whether you will be working with multiple configuration environments for your solution.

Note: You can greatly increase the ease of maintainability for your solution by ensuring your solution is properly documented at all stages of its lifecycle.

Archiving and backup

When addressing archiving and backup needs for your solution, it is important to identify the solution components that contain information that must be maintained. A Tivoli Directory Integrator solution typically consists of an XML-formatted configuration file and a text-formatted external properties file. Depending on the nature of the solution, the built-in state store is utilized as well; this occurs more often than not. The state store that is typically used is the built-in Cloudscape™ database that comes with IBM Tivoli Directory Integrator, although it could also be configured as an external database that you choose to set up separately. The state store is most commonly used to hold persistent data such as change numbers used when connecting to directory server changelogs, or delta information about a particular connection.
Note: A recommended way to back up your IBM Tivoli Directory Integrator Cloudscape databases can be found in “Backing up CloudScape databases” on page 42 of the IBM Tivoli Directory Integrator 6.0: Administrator Guide, SC32-1716.
One of the simplest ways to administer the archiving and backup of the Tivoli Directory Integrator XML configuration files is to use file naming conventions that increment, so the status of a configuration can be determined from its file name. It is important to ensure you have at least one backup of the external properties file associated with your solution configuration; this is oftentimes easily overlooked.

As you instrument and test your solution, the list of solution components that you choose to back up and archive may grow. An example of this could be if your solution utilizes any special drivers, such as database drivers, or any custom application interfaces that are required to connect to specialized data sources. It is ideal to establish an archiving and backup plan that meets your organization's requirements prior to deploying your solution.

Version control

Version control can encompass several areas. Most often it involves making considerations for both the software and hardware configurations and versions. In the case of an enterprise data synchronization solution, this can also involve the versions of the connected system sources and targets as well as the version of the IBM Tivoli Directory Integrator software. It is a good idea to identify which versions of the software components are being utilized.
In the case of IBM Tivoli Directory Integrator, this can include identifying which version of JavaScript is utilized with your version of Tivoli Directory Integrator.

Version control of the IBM Tivoli Directory Integrator XML configuration files typically occurs in the same manner as mentioned for archiving and backup: creating incremental file name descriptions is typically the easiest and most effective way to manage version control for this component of your solution.

Multiple configuration environments

When architecting an enterprise data synchronization solution, it is ideal to plan for more than one configuration environment. Typically, you will deploy a minimum of two environments, consisting of a test environment and a production environment. Ideally, there is also a staging environment that provides for a transition between the test and production environments.

Having multiple environments raises several items to consider with your solution. A main item is ensuring that migration between the environments is easily maintained. Migration of your IBM Tivoli Directory Integrator configurations between environments is relatively simple, and there are a few ways to maintain it. A common way is to replicate the configuration files from one environment to another while keeping separate installations of the server software in each environment. Plan on having a separate external properties file to handle the connection configuration differences between environments.
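The environment split can be illustrated with a small sketch that selects connection settings by environment name. All host names and property names here are invented placeholders; in a real deployment these values would live in the external properties file of each environment, not in code.

```javascript
// Conceptual sketch of per-environment connection settings: the solution
// configuration stays identical, only the properties differ between the
// test and production environments. All values are invented placeholders.
var environments = {
  test: {
    ldapUrl: "ldap://test-dir.example.com:389",
    dbUrl: "jdbc:db2://test-db.example.com/HR"
  },
  production: {
    ldapUrl: "ldap://dir.example.com:389",
    dbUrl: "jdbc:db2://db.example.com/HR"
  }
};

function connectionSettings(envName) {
  var env = environments[envName];
  if (!env) {
    throw new Error("Unknown environment: " + envName);
  }
  return env;
}
```

Keeping only these values environment-specific is what makes promoting a configuration from test to production a copy operation rather than an edit.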
Monitoring

System administration of your solution involves identifying which parts of your solution you have requirements to monitor, and how frequently. Monitoring includes real-time monitoring as well as logging and auditing.

Real-time monitoring

Monitoring your solution in real time is a common requirement. Determining the frequency of the data flows, as outlined in previous sections, helps to determine your requirements for real-time monitoring. Knowing whether the data synchronization solution is up and running is a minimum requirement. If your solution's requirement is to synchronize data infrequently, then real-time monitoring becomes less critical. IBM Tivoli Directory Integrator provides an Administration and Monitor Console (AMC), which allows for real-time monitoring of your solutions as well as the ability to check logging results.

Monitoring requirements have a few levels of access control. It is important to identify which organizational role will be performing which types of monitoring. For example, your solution requirements may state that an operator must be able to see if the systems are running and restart them, but not be able to make configuration changes. The IBM Tivoli Directory Integrator Administration and Monitor Console provides access levels for monitoring your solution.

Logging and auditing

Logging and auditing for enterprise solutions can oftentimes involve corporate standards for centralized logging or auditing. An example of this is when there is an enterprise standard for tracking system failures via a common management system that might watch and track Simple Network Management Protocol (SNMP) messages. IBM Tivoli Directory Integrator provides several mechanisms to either utilize a currently installed enterprise standard or provide its own capabilities should there be no corporate direction. Some of the built-in logging options include logging to a rolling file, the console, a file, syslog, the NT Event Log, or the system log.
When an enterprise has a management environment that utilizes technology such as SNMP traps or a database with a reporting application associated with it, IBM Tivoli Directory Integrator can be configured to utilize these options as well.

When architecting your solution, it is important to identify whether there are any enterprise standards for logging and auditing and what they may be. This is especially important when considering any auditing requirements. Auditing tends to encompass compliance. Since each enterprise has unique compliance requirements, it is important to identify as soon as possible whether there are any auditing rules for your data integration solution. Data auditing requirements
can dictate your data flow and can quite easily expand your solution requirements in all areas.

2.5.5 Security

The security requirements for your data synchronization solution can be broken down into two main categories. The first involves the security of the data being synchronized, and the second covers the security of the server, configurations, and system administration interfaces.

Data synchronization security
It is important to identify the security requirements of the data you will be synchronizing. Most of the requirements become apparent as you identify the nature of your data and plan your data flows. The following two questions can be asked to further identify these requirements.

1. Does the entire data transmission between sources have to be secure for all data?
Solutions for securing the data transmission involve utilizing technology such as SSL and HTTPS. Both technologies are provided with Directory Integrator.

2. Are there specific data attributes that must be encrypted?
Many times this involves the password attribute. Directory Integrator provides several encryption methods and the ability to encrypt any attribute. It is not limited to just the password attribute.

Server, configuration, and system administration security
The following questions help to identify the requirements your solution may have relating to the security involved in administering your solution.

1. Does the server and configuration software need to be secure?
The answer to this question is typically yes. Consideration needs to be made for the location and security of where you place the server software and how you maintain access to that environment. Directory Integrator provides password-level access control to its configurations and encryption.

2. Do you need to have the access control values used for access to remote systems protected?
Once again, the answer to this question is typically yes.
The values used to access the data sources to be synchronized are usually very sensitive and powerful pieces of enterprise information. Directory Integrator provides encryption for these values by providing a way to encrypt its external properties file.
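As a sketch of this approach, individual entries in a Directory Integrator external property file can be flagged for encryption by the server. The property names below are hypothetical, and the exact marker syntax should be verified against the Administrator Guide for your version:

```properties
# Hypothetical external property file entries. A value whose key carries
# the {protect}- prefix is stored encrypted by the Directory Integrator
# server; unprotected values remain in clear text.
ldap.url:ldap://ldap.example.com:389
{protect}-ldap.password:changeit
```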
Note: It is best to place all the values used for accessing the data sources to be synchronized into an external properties file so it can be encrypted. By encrypting all data source information you substantially contribute to the protection of sensitive enterprise data.

3. Does the remote administration of your solution need to be secure?
Answering yes to this question means you have identified that your solution requires remote administration and secure access control to prevent unauthorized users from gaining access. Directory Integrator provides secure connectivity to its Administration and Monitor Console. Secure remote administration is a typical requirement for data synchronization solutions.

2.5.6 Password synchronization

Password synchronization is specifically mentioned when architecting a data synchronization solution since it tends to have its own set of data and implementation requirements. High availability, failover, and security are at the top of the list. It is important to incorporate the additional solution requirements that are introduced by password synchronization. The specific components of Tivoli Directory Integrator's password synchronization capabilities are covered in 3.2.11, "Password synchronization" on page 65.

When implementing password synchronization, it is ideal to have the passwords flow in only one direction. If your business requirements absolutely require bi-directional password synchronization, it is ideal to keep the number of repositories to be synchronized to a minimum. Bi-directional password synchronization introduces architecture issues such as loop and race conditions. This is covered further in our first customer scenario in Chapter 4, "Penguin Financial Incorporated" on page 91.

Below is a list of things to consider when password synchronization is part of your solution:
– Identify the applications that will require passwords to be intercepted.
– Determine the application with the most restrictive default password rules.
For example, RACF® has a requirement that passwords be eight characters in length and alphanumeric.
– Design for additional requirements if the password synchronization is multi-directional.
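The second point above can be made concrete with a small validation routine run before attempting to propagate a password. A minimal JavaScript sketch; the rule set (exactly eight alphanumeric characters, modeled loosely on the RACF example) is illustrative only, not a faithful RACF implementation:

```javascript
// Check a candidate password against the most restrictive target
// repository's rules before propagating it. The rule here (exactly
// eight alphanumeric characters) is an illustrative stand-in.
function meetsMostRestrictiveRules(password) {
  return /^[A-Za-z0-9]{8}$/.test(password);
}

console.log(meetsMostRestrictiveRules("AB12CD34")); // true
console.log(meetsMostRestrictiveRules("short1"));   // false
```

Rejecting the change up front, at interception time, avoids the partially synchronized state that results when one target repository refuses a password the others have already accepted.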
2.6 Who are the players in the solution

Just as no two organizations are the same, neither are two different synchronization projects. What is common to both, though, is a subset of responsibilities that historically are found in every Tivoli Directory Integrator production deployment. Recognizing these responsibilities may lead to further training and planning activities, and helps clear up confusion over who owns what. The assignment of these responsibilities to individuals within an organization is a key part of the success of a production deployment, as is the training of those individuals to a standard where they can comfortably fulfill their duties.

For the purposes of outlining these responsibilities, we consider four standard departments that typically exist in most companies with a significant IT infrastructure. The final identified group for this exercise is the vendor. This breakdown will not match every organization exactly; however, it is relatively easy to map this model to the operations of an individual environment.

1. IT Infrastructure Group
This group is commonly responsible for:
– The enterprise directory infrastructure, mapping schemas, and supporting applications.
– Evaluating and introducing new technologies into a company.
– Being the internal advocate for the components in the software infrastructure.
– Providing troubleshooting and internal training services beyond normal operations capabilities.
– Providing the interface to vendors when product faults or advanced questions arise.

2. System Administrators / Operations
This group is commonly responsible for:
– Managing the day-to-day requirements of operating systems and process monitoring.
– Backup, restore, and disaster recovery.
– First line of troubleshooting.

3. Data Management / Security
This group is commonly responsible for:
– Determining and implementing identity data management policy for applications.
– Determining and implementing security policy for applications.
– Developing and implementing user and group administration tasks.
– Understanding, implementing, and executing security audit procedures.

4. Application owners
This group is commonly responsible for:
– Implementing and managing business applications that rely on the synchronized data infrastructure.
– Providing application-level troubleshooting.

5. Software Vendors
This group is commonly responsible for:
– Providing software components of the infrastructure.
– Providing planning and (sometimes) implementation services.
– Providing detailed technical support.
– Providing information about lifecycles of the software components for customer planning input (for example, release and end-of-service timeframes).

2.6.1 Common roles and responsibilities

The following charts outline the typical IBM Tivoli Directory Integrator administration roles and responsibilities, as well as the groups that typically own and participate in those roles. First let us take a look at the systems operations responsibilities.

Table 2-1 Systems operations

Task/Responsibility: Define goal of the integration. This usually includes the definition of the business objective and the translation of the business objective into concrete directory integration definitions.
Owner/Implementer: IT Infrastructure Group
Other Contributors: Each organization should provide a representative to provide input for this task.

Task/Responsibility: Define the data that must flow and the authoritative source for each data element that will be managed.
Owner/Implementer: IT Infrastructure Group
Other Contributors: Each organization should provide a representative to provide input for this task.
Task/Responsibility: Define IBM Tivoli Directory Integrator AssemblyLine to accomplish specified task.
Owner/Implementer: Data Management / Security
Other Contributors: IT Infrastructure Group and Software Vendor to specify/provide procedures. Data Management / Security to provide requirements input. Application owners to assist with application integration requirements.

Task/Responsibility: Build prototype IBM Tivoli Directory Integrator AssemblyLine to accomplish specified task.
Owner/Implementer: Data Management / Security
Other Contributors: IT Infrastructure Group and Software Vendor to specify/provide procedures. Data Management / Security and Application owners to specify/provide procedures. System administration / operations personnel to provide operational input.

Task/Responsibility: Test prototype IBM Tivoli Directory Integrator AssemblyLine to accomplish specified task.
Owner/Implementer: Data Management / Security
Other Contributors: IT Infrastructure Group to specify/provide procedures. System administration / operations to provide test specification input.

Task/Responsibility: Deploy IBM Tivoli Directory Integrator AssemblyLine to accomplish specific task.
Owner/Implementer: System Administration / Operations
Other Contributors: IT Infrastructure Group and Software Vendors to specify/provide procedures. Application owners to assist with application integration.

Task/Responsibility: Monitor deployed IBM Tivoli Directory Integrator AssemblyLine to ensure proper operation and to monitor for any error conditions.
Owner/Implementer: System Administration / Operations
Other Contributors: IT Infrastructure Group and Data Management / Security to provide information about monitoring and alerts requirements.

Task/Responsibility: Correct any detected IBM Tivoli Directory Integrator AssemblyLine error conditions that occur.
Owner/Implementer: System Administration / Operations
Other Contributors: IT Infrastructure Group and Data Management / Security to provide error recovery procedures. Application owners to provide troubleshooting assistance with application integration.

Task/Responsibility: Audit running integrated directory infrastructure to ensure compliance to business rules.
Owner/Implementer: Data Management / Security
Other Contributors: System administration / operations to assist with audits
and control review.

Task/Responsibility: Monitor and maintain IBM Tivoli Directory Integrator server health.
Owner/Implementer: System Administration / Operations
Other Contributors: Software Vendors to provide best practice information.
Task/Responsibility: Perform software upgrades and software defect resolution.
Owner/Implementer: System Administration / Operations
Other Contributors: Software Vendors to provide best practice information.

Task/Responsibility: Perform data backup and restore for disaster recovery.
Owner/Implementer: System Administration / Operations
Other Contributors: Software Vendors to provide best practice information.

Next we take a look at the end-to-end troubleshooting responsibilities.

Table 2-2 End-to-end troubleshooting

Role/Responsibility: Provide initial troubleshooting investigation to determine component error (this is after helpdesk efforts).
Owner/Implementer: System Administration / Operations
Other Contributors: IT Infrastructure Group to provide internal training.

Role/Responsibility: Determine if security policy is adversely affecting user experience.
Owner/Implementer: Data Management / Security
Other Contributors: IT Infrastructure Group.

Role/Responsibility: Determine if application is faulty.
Owner/Implementer: Application Owners
Other Contributors: IT Infrastructure Group.

Role/Responsibility: Provide detailed troubleshooting when existing procedures fail.
Owner/Implementer: IT Infrastructure Group
Other Contributors: Software Vendors.

Next we take a look at support operations responsibilities.

Table 2-3 Support operations

Role/Responsibility: Own and maintain one or more test systems for pre-production testing of new applications and regression testing.
Owner/Implementer: All groups are involved
Other Contributors: It is imperative that all parties are involved in both test and production environments.

Next we take a look at test and design responsibilities.

Table 2-4 Future testing

Role/Responsibility: Maintain currency with IBM Tivoli Directory Integrator versions via aggressive planning and regression strategy.
Owner/Implementer: IT Infrastructure Group
Other Contributors: Data Management / Security.
2.7 Conclusion

Once again it is important to point out that no two organizations are the same. It is probable that the information stated above will not map universally to all organizations. The goal is still the same: to reduce the complexity of the problem by assigning responsibilities, thereby clearing up confusion over who owns what.
Chapter 3. Directory Integrator component structure

In Chapter 1, we discussed the business drivers for adopting a consistent identity infrastructure across an enterprise. We pointed out that in many circumstances companies prefer (or are obliged) to maintain more than one user repository. This is because it is hard to consolidate all user accounts into only one directory. In fact, the traditional approaches to directory infrastructures might no longer handle the growing volume of users, organizations, and resources in an enterprise. Companies are deploying department-specific applications, each with its own application-specific user repository, resulting in many individual repositories. These repositories can be LDAP directories, relational database (Oracle, DB2, and so on) tables, flat files in different formats (CSV, XML, and so on), operating systems, and others.

Companies that decide to maintain more than one user repository and to leverage existing data and tools in order to build a consistent identity and data infrastructure have to integrate them by implementing an identity and data management solution. IBM Tivoli Directory Integrator is designed to fit this requirement.

Directory Integrator provides an authoritative, enterprise-spanning identity and data infrastructure critical for security and for provisioning applications, such as portals. It enables integration of a broad set of information into the identity and resource infrastructure. There is virtually no limitation on the type of data or
system with which Directory Integrator is able to work. It has a number of built-in connectors to directories, databases, formats, and protocols, as well as an open-architecture Java™ development environment to extend existing connectors or create new ones, and tools to configure connectors and apply logic to data as it is processed.

In addition to integrating data between applications or directories, IBM Tivoli Directory Integrator can be helpful for other reasons, such as:

– Elimination of the need for an inflexible centralized database.
– Capability for distributed data management.
– Supply of a non-intrusive integration. Business and security rules can be introduced to manage flow, ownership, and structure of information between different systems.
– Supply of a modular, flexible, and scalable solution. This is possible because any integration task is divided into simple pieces, which are then clicked together. This approach enables introduction of Directory Integrator starting with a portion of the overall solution and then expanding to the whole enterprise. Easy and rapid modifications of the designed solution are always possible.
– Capability of both timed and real-time integration. With the event-driven engine, data flow can be triggered by many types of events such as database or directory change, e-mail arrival, file creation or modification, or HTTP calls.
– Capability to intercept password changes and to propagate the new password to multiple accounts.
– Rapid development, testing, deployment, and maintenance with the graphical interface.
– Support of most standard protocols, transports, APIs, and formats, such as JDBC, LDAP, JMS, JNDI, and XML.
– Support of JavaScript for scripting.
– Easy integration with other IBM products such as the WebSphere® family and other Tivoli security products such as Access Manager and Identity Manager.
– Wide platform support.
It can run on UNIX® (AIX®, HP-UX, Solaris™), Windows, and Linux® (Red Hat, SuSE, and United Linux on Intel®, IBM p-series, and s/390 platforms). Refer to the IBM Tivoli Directory Integrator 6.0: Administrator Guide, SC32-1716, and the IBM Tivoli Directory Integrator 6.0: Release Notes for more information about the supported platforms, versions, and requirements.

Figure 3-1 shows a general example of an enterprise architecture using IBM Tivoli Directory Integrator. In the following section, we introduce the Directory
Integrator concept and show how information is synchronized and exchanged between the various systems.

Figure 3-1 A general data integration environment

3.1 Concept of integration

The IBM approach is to simplify a large integration project by breaking it into individual small components, then solving it one piece at a time. Integration problems typically can be broken down into three basic parts:

– The systems and devices that have to communicate with each other.
– The flows of data among these systems.
– The events that trigger when the data flows occur.

These constituent elements of a communications scenario can be described as follows.

3.1.1 Data sources

These are the data repositories, systems, and devices that talk to each other, such as the Human Resources (HR) database, an enterprise directory, the
enterprise resource planning (ERP) system, a customer relationship management (CRM) application, the office phone system, a messaging system with its own address book, or maybe a Microsoft® Access database with a list of company equipment and to whom the equipment has been issued.

Data sources represent a wide variety of systems and repositories, such as databases (for example, IBM DB2, Oracle, Microsoft SQL Server), directories (such as Sun™ Java™ System Directory Server, IBM Tivoli Directory Server, Lotus® Domino®, Novell eDirectory, and Microsoft Active Directory), directory services (Microsoft Exchange), files (for example, Extensible Markup Language (XML), LDAP Data Interchange Format (LDIF), or Simple Object Access Protocol (SOAP) documents), specially formatted e-mail, or any number of interfacing mechanisms that internal systems and external business partners use to communicate with information assets and services.

3.1.2 Data flows

These are the threads of communication and their content, and are usually drawn as arrows that point in the direction of data movement. Each data flow represents a dialogue between two or more systems. However, for a conversation to be meaningful to all participants, everyone involved must understand what is being communicated. But data sources likely represent their data content in different ways. One system might represent a telephone number as textual information, including the dashes and parentheses used to make the number easier to read. Another system might store it as numerical data. If these two systems are to communicate about this data, the information must be translated during the conversation. Furthermore, the information in one source might not be complete and might have to be augmented with attributes from other data sources. In addition, only parts of the data in the flow might be relevant to receiving systems.
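The telephone-number translation just described is a typical in-flow transformation. A minimal standalone JavaScript sketch, not tied to any particular Directory Integrator component:

```javascript
// Translate a human-readable phone number, e.g. "(512) 555-0187",
// into the purely numerical form another system expects.
function normalizePhone(formatted) {
  // Strip every character that is not a digit.
  return formatted.replace(/[^0-9]/g, "");
}

console.log(normalizePhone("(512) 555-0187")); // "5125550187"
```

In a real data flow this kind of transformation would sit between the input source and the destination, as part of the mapping and filtering discussed next.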
Therefore, a data flow must also include the mapping, filtering, and transformation of information, shifting its context from input sources to that of the destination systems.

3.1.3 Events

Events can be described as the circumstances that dictate when one set of data sources communicates with another. One example is whenever an employee is added to, updated within, or deleted from the HR system.
An event can also be based on a calendar or a clock-based timer (for example, starting communications every 10 minutes or at 12:00 midnight on Sundays). It can also be a manually initiated one-off event, such as populating a directory or washing the data in a system. Events are usually tied to a data source and are related to the data flows that are triggered when the specified set of circumstances arises.

In the following section we show how each of these elements is handled by IBM Tivoli Directory Integrator using its base components.

3.2 Base components

IBM Tivoli Directory Integrator consists of two applications:

– Toolkit Integrated Development Environment (IDE)
This program provides a graphical interface to create, test, and debug the integration solutions. The Toolkit IDE is used to create a configuration file (called a config), which is stored as a highly structured XML document and is executed by the run-time engine. The Toolkit IDE executable is called ibmditk. In 3.7, "Administration and monitoring" on page 84 we describe some features of this interface.

– Run-time Server
Using a configuration file you created with the Toolkit IDE, the Run-time Server powers the integration solution. This application is called ibmdisrv, and you can deploy your solution using as many or as few server instances as you want. There are no technical limitations.

From a logical point of view the Directory Integrator architecture is divided into two parts:

– The core system, where most of the system's functionality is provided. The core handles log files, error detection, dispatching, and data flow execution parameters. This is also where customized configuration and business logic is maintained. The Administration and Monitor Console (AMC) is the interface for working with these core functionalities. Because AMC is a Web console, administration can be done remotely using a Web browser, without the need to physically log on to the Directory Integrator server.
AMC is described in more detail in 3.7, "Administration and monitoring" on page 84.

– The components, which serve to provide an abstraction layer for the technical details of the data systems and formats that you want to work with. The main types of components are AssemblyLines, Connectors, Parsers, Function Components, and EventHandlers, and because each is wrapped by
core functionality that handles things such as integration flow control and customization, the components themselves can remain small and lightweight. For example, if you want to implement your own Parser, you only have to provide two functions: one for interpreting the structure of an incoming bytestream, and one for adding structure to an outgoing one.

This core/component design allows easy extensibility. It also means that you can rapidly build the framework of your solutions by selecting the relevant components and clicking them into place. Components are interchangeable and can be swapped out without affecting the customized logic and configured behavior of your data flows. This means that you can build integration solutions that are quickly augmented and extended while keeping them less vulnerable to changes in the underlying infrastructure.

The key elements of the integration solution are the AssemblyLines. The arrows drawn in Figure 3-1 on page 43 can each represent an AssemblyLine. Each AssemblyLine implements a single uni-directional data flow. A bi-directional synchronization between two or more data sources is implemented by separate AssemblyLines, one for each direction.

3.2.1 AssemblyLines

Real-world industrial AssemblyLines are made up of a number of specialized machines that differ in both function and construction, but have one significant attribute in common: They can be linked to form a continuous path from input sources to output. An AssemblyLine generally has one or more input units designed to accept whatever raw materials are needed for production (fish fillets, cola syrup, car parts). These ingredients are processed and merged. Sometimes by-products are extracted from the line along the way. At the end of the production line, the finished goods are delivered to waiting output units. If a production crew gets the order to produce something else, they break the line down, keeping the machines that are still relevant to the new order.
New units are connected in the right places, the line is adjusted, and production starts again.

IBM Tivoli Directory Integrator AssemblyLines work similarly to real-world industrial AssemblyLines. The general philosophy of an AssemblyLine is that it processes data (for example, entries, records, items, objects) from one data source, transforms and combines it with data from other sources, and finally outputs it to one or more targets. Figure 3-2 shows an example of a Directory Integrator AssemblyLine.
Figure 3-2 AssemblyLine

Let us take a closer look at what goes on inside an AssemblyLine.

As shown in Figure 3-3 on page 48, an AssemblyLine may consist of many components. The generic part of the component, called the AssemblyLine component, provides kernel functionality like Attribute Maps, Link Criteria, Hooks, and so on. The data-source specific part of the component, called the component interface, is connected to some system or device, and has the intelligence to work with a particular API or protocol. These component interfaces are interchangeable.

This AssemblyLine wrapper makes components work in a similar and predictable fashion. It enables AssemblyLine components to be linked together, as well as providing built-in behaviors and control points for customization.
Figure 3-3 AssemblyLine components

How data is organized can differ greatly from system to system. For example, databases typically store information in records with a fixed number of fields. Directories, on the other hand, work with variable objects called entries, and other systems use messages or key-value pairs. As shown in Figure 3-4 on page 49, Directory Integrator simplifies this issue by collecting and storing all types of information in a powerful and flexible Java data container called a work Entry. In turn, the data values themselves are kept in objects called attributes that the entry holds and manages.

The work Entry is passed between AssemblyLine components, which in turn perform work on the information it contains, for example, joining in additional data, verifying content, computing new attributes and values, as well as changing existing ones, until the data is ready for delivery to one or more target systems. Additional Scripts can also be added to perform these operations. As a result, attribute mapping, business rules, and transformation logic do not have to deal with type conflicts.
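As a simplified illustration of this processing model, the following plain JavaScript uses an ordinary object to stand in for the work Entry and its attributes. Directory Integrator's real Entry class is a Java container with its own API, so this is only a conceptual sketch, and the attribute names are invented:

```javascript
// Stand-in for Directory Integrator's work Entry: a container of
// attributes that successive AssemblyLine components enrich and modify.
var work = { givenName: "Anna", sn: "Larsen" };

// One component computes a new attribute from existing ones...
work.cn = work.givenName + " " + work.sn;

// ...and a later one changes an existing attribute's value.
work.sn = work.sn.toUpperCase();

console.log(work.cn); // "Anna Larsen"
```

Each component sees the same shared container, which is what lets mapping and transformation logic compose without the components knowing about each other.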
Figure 3-4 Entry objects and Attributes

In addition to the work Entry object used by the AssemblyLine to move data down the flow, Figure 3-4 also shows an additional Java bucket nestled in each of the Connectors. These local storage objects are used to cache data during read and write operations. A Connector's local Entry object is called its conn object, and exists only within the context of the Connector. When a Connector reads in information, it converts the data to Java objects and stores it in the local conn object. During output, the Connector takes the contents of its conn, converts this data to native types, and sends it to the target system.

However, since each conn object is only accessible by its Connector, an additional mechanism is needed to move data from these localized caches to the shared work Entry object after Connector input, and in the other direction for output Connectors. Figure 3-4 shows an arcing arrow that illustrates this movement of Attributes between the Connectors' local conn Entries and the AssemblyLine's work Entry object. This process is called Attribute Mapping and is described in more detail in 3.2.8, "Attribute Map components" on page 64. Suffice it to say that Attribute Maps are your instructions to a Connector on which Attributes are brought into the AssemblyLine during input, or included in output operations.

An AssemblyLine is designed and optimized for working with one item at a time, such as one data record, one directory entry, or one registry key. However, if you want to do multiple updates or multiple deletes (for example, processing more than a single item at a time) then you must write AssemblyLine scripts to do
this. If necessary, this kind of processing can be implemented using JavaScript, Java libraries, and standard IBM Tivoli Directory Integrator functionality (such as pooling the data to a sorted datastore, for example with the JDBC Connector, and then reading it back and processing it with a second AssemblyLine).

AssemblyLines should contain as few Connectors as possible (for example, one per data source participating in the flow), while at the same time including enough components and script logic to make them as autonomous as possible. The reasoning behind this is to make the AssemblyLine easy to understand and maintain. It also results in simpler, faster, and more scalable solutions. Another benefit of this can be the reusability of AssemblyLines.

3.2.2 Connectors

Connectors are like puzzle pieces that click together, while at the same time linking to a specific data source. There are basically two categories of Connectors:

– The first category is where both the transport and the structure of the data content are known to the Connector (that is, the schema of the data source can be queried or detected using a well-known API such as JDBC or LDAP).
– The second category is where the transport mechanism is known, but not the content structuring. This category requires a Parser (see 3.2.3, "Parsers" on page 60) to interpret or generate the content structure in order for the AssemblyLine to function properly.

Each Connector is characterized by two properties, type and mode. The type is related to the data sources that the Connector links to the AssemblyLine. The mode identifies the role of the Connector in the data flow, and controls how the automated behavior of the AssemblyLine drives the component. Connectors can be in one of the following eight modes:

– Iterator
– Lookup
– AddOnly
– Update
– Delete
– CallReply
– Server
– Delta

Each Connector mode determines the behavior of a specific Connector, and not all Connectors support all modes of operation.
For example, the File System Connector supports only a single output mode, AddOnly, and not Update, Delete, or CallReply. When you use a Connector you must first consult the
documentation for this component for a list of supported modes. Connectors in Iterator or Server mode are automatically placed in the Feed section of the AssemblyLine Detail window; Connectors in other modes end up in the Flow section. Each of the Connector modes is explained in detail in the next section.

You can change both the type and mode of a Connector whenever you want in order to meet changes in your infrastructure or in the goals of your solution. If you planned for this eventuality, the rest of the AssemblyLine, including data transformations and filtering, will not be affected. That is why it is important to treat each Connector as a black box that either delivers data into the mix or extracts some of it to send to a data source. The more independent each Connector is, the easier your solution will be to augment and maintain.

Best practice: By making your Connectors as autonomous as possible, you can readily transfer them to your Connector Library and reuse them to create new solutions faster, even sharing them with others. Using the library feature also makes maintaining and enhancing your Connectors easier, because all you have to do is update the Connector template in your library, and all AssemblyLines derived from this template inherit these enhancements. When you are ready to put your solution to serious work, you can reconfigure your library Connectors to connect to the production data sources instead of those in your test environment, and move your solution from lab to live deployment in minutes.

Whenever you need to include new data in the flow, simply add the relevant Connector to the AssemblyLine. In the example of Figure 3-5 on page 52 we see three Connectors: two input Connectors to an RDBMS and an LDAP directory, and one output Connector to an XML document.

Let us examine the different Connector modes.
Figure 3-5 AssemblyLine with connectors, parsers, and data sources

Connector modes

This section describes, in detail, each of the Connector modes.

Iterator mode

Connectors in Iterator mode are used to scan a data source and extract its data. The Iterator Connector actually iterates through the data source entries, reads their attribute values, and delivers each work Entry to the AssemblyLine and its non-Iterator Connectors. A Connector in Iterator mode is referred to as an Iterator.

Note: It does not matter exactly what the data source is (database, LDAP directory, XML document, and so forth) or how its data is actually stored. Each Connector presents an abstraction layer over the particular data source, and you access and process data through instances of the work Entry and Attribute classes.

AssemblyLines (except in the special case when called with an initial work Entry) typically contain at least one Connector in Iterator mode. Iterators (Connectors in Iterator mode) supply the AssemblyLine with data. If an AssemblyLine has no Iterator, it is often useless unless it gets data from another source (for example, the script or process that started the AssemblyLine, or data created in a Prolog script).

AssemblyLine Connectors that are set to any mode except Iterator are powered in order, starting at the top of the Connector list. Iterators, on the other hand, are always run first, before other non-Iterator Connectors, regardless of their placement in the AssemblyLine. Additionally, if you are using multiple Iterators in a single AssemblyLine, the Iterators are used one at a time, in their order in the Connectors list.

Multiple Iterators in an AssemblyLine: If you have more than one Connector in Iterator mode, these Connectors are stacked in the order in which they appear in the Config (and in the Connector List in the Config Editor, in the Feeds section) and are processed one at a time. So, if you are using two Iterators, the first one reads from its data source, passing the resulting work Entry to the first non-Iterator, until it reaches the end of its data set. When the first Iterator has exhausted its input source, the second Iterator starts reading in data.

An initial work Entry is treated as coming from an invisible Iterator processed before any other Iterators. This means an initial work Entry is passed to the first non-Iterator in the AssemblyLine, skipping all Iterators during the first cycle. This behavior is shown on the AssemblyLine Flow page in Appendix B, "AssemblyLine and Connector mode flowcharts", of the IBM Tivoli Directory Integrator 6.0: Reference Guide, SC32-1720.

Assume you have an AssemblyLine with two Iterators, ItA preceding ItB. The first Iterator, ItA, is used (the AssemblyLine ignoring ItB) until ItA returns no more entries. Then the AssemblyLine switches to ItB (ignoring ItA). If an initial work Entry is passed to this AssemblyLine, then both Iterators are ignored for the first cycle, after which the AssemblyLine starts calling ItA.

Sometimes the initial work Entry is used to pass configuration parameters into an AssemblyLine, but not data. However, the presence of an initial work Entry causes Iterators in the AssemblyLine to be skipped during the first cycle. If you do not want this to happen, you must empty out the work Entry object by calling the task.setWork(null) function in a Prolog script.
This causes the first Iterator to operate normally.

Lookup mode

Lookup mode enables you to join data from different data sources using the relationship between attributes in these systems. A Connector in Lookup mode is often referred to as a Lookup Connector. In order to set up a Lookup Connector, you must tell the Connector how you define a match between data already in the AssemblyLine and that found in the connected system. This is called the Connector's Link Criteria, and each Lookup Connector has an associated Link Criteria tab where you define the rules for finding matching entries.

AddOnly mode

Connectors in AddOnly mode (AddOnly Connectors) are used for adding new data entries to a data target. This Connector mode requires almost no configuration: set the connection parameters and then select the attributes to write from the work Entry.

Update mode

Connectors in Update mode (Update Connectors) are used for adding and modifying data in a data target. For each work Entry passed from the AssemblyLine, the Update Connector tries to locate a matching entry in the data target to modify with the attribute values received in the work Entry. As with Lookup Connectors, you must tell the Connector how you define a match between data already in the AssemblyLine and that found in the connected system. This is called the Connector's Link Criteria, and each Update Connector has an associated Link Criteria tab where you define the rules for finding matching entries.

If no such entry is found, a new entry is added to the data target. However, if a matching entry is found, it is modified. If more than one entry matches the Link Criteria, the On Multiple Entries Hook is called.

Furthermore, the Output Map can be configured to specify which attributes are to be used during an Add or Modify operation. When doing a Modify operation, only those attributes that are marked as Modify (Mod) in the Output Map are changed in the data target. If the entry passed from the AssemblyLine does not have a value for an attribute, the Null Behavior for that attribute becomes significant. If it is set to Delete, the attribute does not exist in the modifying entry, and thus the attribute cannot be changed in the data target. If it is set to NULL, the attribute exists in the modifying entry, but with a null value, which means that the attribute is deleted in the data target.

An important feature that Update Connectors offer is the Compute Changes option. When turned on, the Connector first checks the new values against the old ones and updates only if and where needed. Thus you can skip unnecessary updates, which can be really valuable if the update operation is a heavy one for the particular data target you are updating.
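The Compute Changes idea can be illustrated with a short, self-contained JavaScript sketch. This is plain illustrative code, not the Directory Integrator API; the function name computeChanges and the sample attributes are our own:

```javascript
// Illustrative sketch of Compute Changes: given the entry already stored
// in the data target and the incoming work entry, return only the
// attributes whose values actually differ, so the update can be skipped
// entirely when nothing changed.
function computeChanges(targetEntry, workEntry) {
  var changes = {};
  for (var attr in workEntry) {
    if (targetEntry[attr] !== workEntry[attr]) {
      changes[attr] = workEntry[attr];
    }
  }
  return changes;
}

var stored = { cn: "John Doe", mail: "jdoe@example.com", title: "Engineer" };
var incoming = { cn: "John Doe", mail: "jdoe@example.org", title: "Engineer" };

var delta = computeChanges(stored, incoming);
// Only "mail" differs, so only that attribute would be written back
console.log(delta); // { mail: 'jdoe@example.org' }
console.log(Object.keys(delta).length === 0 ? "skip update" : "perform update");
```

If the incoming entry is identical to the stored one, the computed change set is empty and the potentially expensive update against the data target never happens.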
Delete mode

Connectors in Delete mode (Delete Connectors) are used for deleting data from a data source. For each work Entry passed to the Delete Connector, it tries to locate matching data in the connected system. If a single matching entry is found, it is deleted; otherwise the On No Match Hook is called if none were found, or the On Multiple Entries Hook if more than a single match was found. As with Lookup and Update modes, Delete mode requires you to define rules for finding the matching entry for deletion. This is configured in the Connector's Link Criteria tab.
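The Link Criteria evaluation shared by Lookup, Update, and Delete modes can be sketched in a few lines of plain JavaScript. This is not the Directory Integrator API; the function and rule names are illustrative only:

```javascript
// Illustrative sketch of Link Criteria matching: each rule pairs a
// work-entry attribute with a data-target attribute. The number of
// matches determines whether the connector proceeds or calls the
// On No Match / On Multiple Entries hook.
function findMatches(dataset, linkCriteria, workEntry) {
  return dataset.filter(function (entry) {
    return linkCriteria.every(function (rule) {
      return entry[rule.targetAttr] === workEntry[rule.workAttr];
    });
  });
}

var target = [
  { uid: "jdoe", mail: "jdoe@example.com" },
  { uid: "asmith", mail: "asmith@example.com" }
];
var criteria = [{ targetAttr: "uid", workAttr: "uid" }];

var matches = findMatches(target, criteria, { uid: "jdoe" });
if (matches.length === 1) {
  console.log("single match: proceed with delete/modify");
} else if (matches.length === 0) {
  console.log("no match: call On No Match hook");
} else {
  console.log("multiple matches: call On Multiple Entries hook");
}
```

The same matching step underlies all three modes; only what happens to the matched entry (read, modify, or delete) differs.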
CallReply mode

CallReply mode is used to make requests to data source services (such as Web services) that require you to send input parameters and receive a reply with return values. Unlike the other modes, CallReply gives access to both Input and Output Attribute Maps.

Server mode

The Server mode, available in a select number of Connectors, is meant to provide functionality previously handled by EventHandlers that needed to send back a reply message to the system originating the event. You can find more information about the EventHandler in 3.2.4, "EventHandlers" on page 61.

Server mode is configured using parameters similar to those found in the corresponding EventHandler from previous versions. These components behave in a similar fashion to their EventHandler counterparts, connecting to target systems and either polling or subscribing to event notification services.

On event detection, the Server mode Connector either proceeds with the Flow section of its AssemblyLine or, if an AssemblyLine Pool has been configured for this AssemblyLine, it contacts the Pool Manager process to request an available AssemblyLine instance to handle this event.

Once the Server mode Connector has been assigned the AssemblyLine instance it needs to continue, it spawns an instance of itself in Iterator mode, tied to the channel/session/connection that will deliver the event data. This Iterator worker object then operates as any normal Iterator does, including following the standard Iterator Hook flow, reading the event entries one at a time and passing them to the other Flow components for processing until there is no more data to read. At this time, the worker Iterator is cleared away and, if necessary, the Pool Manager is informed that this AssemblyLine instance is now available again.

When an AssemblyLine with a Server mode connector uses the AssemblyLine Pool, the AssemblyLine Pool will execute AssemblyLine instances from beginning to end. Before the AssemblyLine instance in the AssemblyLine Pool closes the Flow connectors, the AssemblyLine Pool retrieves those connectors into a pooled connector set that will be reused in the next AssemblyLine instance created by the AssemblyLine Pool (the AssemblyLine Pool uses the tcb.setRuntimeConnector method).

There are two system properties that govern the behavior of connector pooling:

1. The first property defines the timeout in seconds before a pooled connector set is released.
2. The second property defines the connector types that are excluded from pooling. If a connector's class name appears in this comma-separated list, it is not included in the connector pool set.

When a new AssemblyLine instance is created by the AssemblyLine Pool, it will look for an available pooled connector set, which, if present, is provided to the new AssemblyLine instance as runtime-provided connectors. This ensures proper flow of the AssemblyLine in general in terms of hook execution and so on. Note that connectors are never shared; they are only assigned to a single AssemblyLine instance when used.

Delta mode

The Delta mode is designed to simplify the application of delta information (make the actual changes) in a number of ways. It provides more optimal handling of delta information generated by either the Iterator Delta Store feature (Delta tab for Iterators) or Change Detection Connectors, like the IDS/LDAP/AD/Exchange Changelog Connectors or the ones for RDBMS and Lotus/Domino changes.

Note: A Connector in Delta mode needs to be paired with another Connector that provides delta information; otherwise the Delta mode has no delta information to work with.

The Delta features in Tivoli Directory Integrator are designed to facilitate synchronization solutions. You can look at the system's Delta capabilities as divided into two sections: Delta Detection and Delta Application.

Delta Detection: Tivoli Directory Integrator provides a number of change (delta) detection mechanisms and tools:

Delta Store: This is a feature available to Connectors in Iterator mode. If enabled from the Iterator's Delta tab, the Delta Store feature uses the System Store to take a snapshot of data being iterated. Then, on successive runs, each Entry iterated is compared with the snapshot database to see what has changed.

Change Detection: These components leverage information in the connected system to detect changes, and are used in either Iterator or Server mode, depending on the Connector. For example, Iterator mode is used for many of the Change Detection Connectors, like those for LDAP, Exchange, and Active Directory Changelog, as well as the RDBMS and Domino/Notes Change Connectors.

We now discuss a few features of Change Detection connectors.
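The snapshot comparison that the Delta Store performs can be sketched as a self-contained JavaScript function. This is illustrative code only, not the Directory Integrator implementation; the operation codes and sample entries are our own:

```javascript
// Illustrative sketch of Delta Store detection: compare the current data
// set against the snapshot taken on the previous run and tag each entry
// with a delta operation code (add, modify, or delete).
function detectDeltas(snapshot, current) {
  var deltas = [];
  for (var key in current) {
    if (!(key in snapshot)) {
      deltas.push({ key: key, op: "add", entry: current[key] });
    } else if (JSON.stringify(snapshot[key]) !== JSON.stringify(current[key])) {
      deltas.push({ key: key, op: "modify", entry: current[key] });
    }
  }
  for (var old in snapshot) {
    if (!(old in current)) {
      deltas.push({ key: old, op: "delete", entry: snapshot[old] });
    }
  }
  return deltas;
}

var previousRun = { jdoe: { mail: "jdoe@example.com" }, asmith: { mail: "a@example.com" } };
var thisRun = { jdoe: { mail: "jdoe@example.org" }, bnew: { mail: "b@example.com" } };

detectDeltas(previousRun, thisRun).forEach(function (d) {
  console.log(d.op + " " + d.key);
});
```

A Connector in Delta mode would then consume a stream of tagged entries like these and apply each operation to the target system.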
Iterator State Store feature

This feature uses the System Store to keep track of the starting point for a Change Detection Connector (for example, the changenumber of a directory changelog). It keeps track of the next change to be processed, even between runs of the AssemblyLine. The value of the Iterator State Store parameter must be globally unique, so that if you have multiple AssemblyLines that use Change Detection Connectors, they will each have their own Iterator state data.

The content of the Iterator State Store works in combination with Connector configuration settings provided for selecting the next change to process: the Start at... parameter(s). For example, in the IBMDirectoryServer Changelog Connector this is the Start at changenumber parameter, where you can enter the changelog number where processing is to begin. This parameter can be set to a specific value (for example, 42), to the first change (for example, 1), or to EOD (End of Data). The EOD setting places the cursor at the end of the change list in order to only process new deltas.

As long as no Iterator State Store is specified, the Change Detection Connector continues to use the Start at... setting each time the Connector performs its selectEntries() operation; for example, when the Iterator is initialized at AssemblyLine startup, or in a Loop. The same happens if there is no value stored for the specified Iterator State Store.

So, the very first time you run the AssemblyLine with the Change Detection Connector there will be no Iterator State Store value yet, so the Start at... parameter(s) will be used. On subsequent executions, the Start at... settings will be ignored and the Iterator State Store value applied instead.

Change notification feature

Where supported, a Change Detection Connector registers with the data source for change notifications, receiving a signal whenever a change is made. If this parameter is set to false, the Connector will poll for new changes. If this parameter is set to true, then after processing all unprocessed changes the Connector will block through the Server Search Notification Control and get notified by the data source when a change occurs. The Connector will not sleep and time out when the notification mechanism is used. Other Connectors have to poll the connected system periodically looking for new changes. Those that rely on polling also provide a Sleep interval option to define how often polling occurs.

Batch retrieval feature

Where supported, the batch retrieval feature specifies how searches are performed in the changelog. When set to false, the Connector will perform incremental lookup (backward-compatible mode). When set to true, a query of type changenumber>=some_value will be executed for batch retrieval of all modified entries, with optional retrieval in pages.

The System Store based Delta Store feature reports specific changes all the way down to the individual values of attributes. This fine degree of change detection is also available when parsing LDIF files. Other components are limited to simply reporting whether an entire Entry has been added, modified, or deleted. This delta information is stored in the work Entry object and, depending on the Change Detection component/feature used, may be stored as an Entry-level operation code, at the Attribute level, or even at the Attribute Value level.

Delta Application (Connector Delta mode): The Delta mode is designed to simplify the application of delta information in a number of ways. Firstly, Delta mode handles all types of deltas: adds, modifies, and deletes. This reduces most data synchronization AssemblyLines to two Connectors: one Delta Detection Connector in the Feeds section to pick up the changes, and a second one in Delta mode to apply these changes to a target system.

Furthermore, Delta mode will apply the delta information at the lowest level supported by the target system itself. This is done by first checking the Connector interface to see what level of incremental modification is supported by the data source. If you are working with an LDAP directory, then Delta mode performs Attribute value adds and deletes. In the context of a traditional RDBMS (JDBC), doing a delete and then an add of a column value does not make sense, so this is handled as a value replacement for that Attribute.

Note: The only Connector that currently supports incremental modification is the LDAP Connector, since LDAP directories provide this functionality. This is dealt with automatically by the Delta mode for those data sources that support this functionality.
If the data source offers optimized calls to handle incremental modifications, and these are supported by the Connector interface, then Delta mode will use them. On the other hand, if the connected system does not offer intelligent delta update mechanisms, Delta mode will simulate them as much as possible, performing pre-update lookups (like Update mode), change computations, and subsequent application of the detected changes.

Connector states

The state of a Connector determines its level of participation in the operation of the AssemblyLine. In general terms, an AssemblyLine performs two levels of Connector operation:

1. Powering up the Connector at the start of AssemblyLine operation and closing its connection when the AssemblyLine completes.

2. Driving the Connector during AssemblyLine operation according to the Connector mode.

Enabled state

Enabled is the normal Connector state. In enabled state, a Connector is powered up and closed, as well as being processed during AssemblyLine operation.

Passive state

Passive Connectors (Connectors in passive state) are powered up and closed just like enabled Connectors. However, they are not driven by the AssemblyLine's automated behavior. Instead, Connectors in passive state can be invoked by script code from any of the control points for scripting provided by IBM Tivoli Directory Integrator. For example, if you have a passive Connector in your AssemblyLine called myErrorConnector, then you could invoke its add() operation with the following script code:

var err = system.newEntry(); // Create new Entry object
err.merge(work); // Merge in attributes from the work Entry
// This next line sets an attribute called Error
err.setAttribute("Error", "Operation failed");
myErrorConnector.add(err); // Add new err Entry

Disabled state

In disabled state, the Connector is neither initialized (and closed) nor operated during normal AssemblyLine activation. If you want to use it in your scripts, then you must initialize it yourself.

The name of a disabled Connector is registered but points at null, so you can write conditional code like the following example to handle the situation where you plan on setting myConnector to disabled state:

if (myConnector != null)
    myConnector.connector.aMethod();

This state is often used during troubleshooting in order to simplify the solution while debugging, helping to localize any problems.

Directory Integrator provides a library of Connectors to choose from, such as Lightweight Directory Access Protocol (LDAP), JDBC, Microsoft Windows NT4 Domain, Lotus Notes®, and POP/IMAP. If you cannot find the one you need, you can extend an existing Connector by overriding any or all of its functions using JavaScript. You can also create your own, either with a scripting language inside the Script Connector wrapper, or develop it in Java or C/C++.
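Conceptually, a custom Iterator-style connector hides its data source behind a small select/get-next interface. The following is a simplified, self-contained sketch of that idea in plain JavaScript; it is not the Script Connector API, and all names here are illustrative:

```javascript
// Simplified sketch of an Iterator-style connector: the data source is
// hidden behind selectEntries()/getNextEntry(), and the consuming loop
// (standing in for the AssemblyLine) pulls entries until none are left.
function makeArrayConnector(rows) {
  var cursor = 0;
  return {
    selectEntries: function () { cursor = 0; },  // prepare the result set
    getNextEntry: function () {                  // deliver one entry per call
      return cursor < rows.length ? rows[cursor++] : null;
    }
  };
}

var connector = makeArrayConnector([
  { uid: "jdoe", mail: "jdoe@example.com" },
  { uid: "asmith", mail: "asmith@example.com" }
]);

connector.selectEntries();
var entry, processed = [];
while ((entry = connector.getNextEntry()) !== null) {
  processed.push(entry.uid); // stand-in for the Flow section's work
}
console.log(processed.join(",")); // jdoe,asmith
```

Whatever the real backing store is (a file, a directory, a database), the consuming loop stays the same, which is the abstraction the Connector model provides.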
Furthermore, Directory Integrator supports most transport protocols and mechanisms, such as TCP/IP, FTP, HTTP, and Java Message Service (JMS)/message queuing (MQ). It also supports secure connections and encryption mechanisms, as shown in 3.3, "Security capability" on page 67.

Table 3-1 summarizes the more relevant built-in connectors. However, this list can change with the product version. For more information about available connectors, scripting languages, and how to create your own, see the IBM Tivoli Directory Integrator 6.0: Reference Guide, SC32-1720.

Table 3-1 Main available connectors

Applications: PeopleSoft, SAP, Siebel ERP, IBM Tivoli Access Manager (through database access, scripting, or API calls)
Databases (using ODBC, JDBC): Oracle, Microsoft Access and SQL Server, IBM DB2 and Informix®
Directories (using LDAP): CA eTrust, Critical Path, IBM Tivoli Directory Server, iPlanet™, Microsoft Active Directory and Exchange, Nexor, Novell, OpenLDAP, Oracle, Siemens
Files, Streams and Internet Protocols: CSV, XML, DSML, HTTP, LDIF, SOAP, DNS, POP, IMAP, SMTP, SNMP
Specific Technologies and APIs: Microsoft ADSI, CDO, and other COM; Microsoft NT domains; Lotus Domino directory and databases; Java APIs; system commands
Messaging Services: IBM MQ
Changes & Deltas: LDAP Changelog, Active Directory changes, NT/AD Password sync, TCP connections, HTTP gets and posts

3.2.3 Parsers

Even unstructured data, such as text files and bytestreams coming over an IP port, is handled quickly and simply by passing the bytestream through one or more Parsers. The system is shipped with a variety of Parsers, including LDIF, Directory Services Markup Language (DSML), XML, comma-separated values (CSV), SOAP, and fixed-length field. As with Connectors, you can extend and modify these, as well as create your own.

In the example in Figure 3-5 on page 52, a Parser is used to interpret and translate information from an LDIF file.
The extracted information is converted to a Java object with a canonical data format so that the LDIF Connector can work with this object and dispatch it along the AssemblyLine.

3.2.4 EventHandlers

EventHandlers provide functionality for building real-time integration solutions. Figure 3-6 depicts a typical AssemblyLine with an EventHandler.

Figure 3-6 AssemblyLine with EventHandler

As with Connectors, EventHandlers can have data source intelligence that enables them to connect to a system or service and wait for an event notification. Examples are the Mailbox EventHandler, which can detect when new messages arrive in a POP3 or IMAP mailbox, and the LDAP EventHandler, which can catch changes made to a directory. When an event occurs, the EventHandler stores the specifics of the event and then performs logic and starts AssemblyLines according to the condition or action rules that you set up.

Sometimes Connectors can also be used to capture events, as is the case with the JMS (MQ) Connector or the LDAP Changelog Connector, both of which can be configured to wait until new data appears and then retrieve it. However, because EventHandlers operate in their own thread, they can be used to dispatch events to multiple AssemblyLines. This provides a cleaner and more straightforward method of filtering and handling multiple types of events from the same source (such as SOAP or Web services calls). EventHandlers can also be configured for auto start, meaning that if you start up a server, these EventHandlers will be activated immediately.
Figure 3-6 on page 61 shows that a system event can trigger the AssemblyLine.

Important: With the availability of Directory Integrator 6.0, the functionality of EventHandlers will more and more be fulfilled by regular Connectors in Server mode. When developing new AssemblyLines you should utilize Connectors in Server mode wherever possible. More information can be found in the Connector section in "Server mode" on page 55.

Now that we have introduced the main components of an AssemblyLine, we show how to customize the AssemblyLine in order to add business rules and logic.

3.2.5 Hooks

Hooks enable developers to describe certain actions to be executed under specific circumstances or at any desired points in the execution of an AssemblyLine. For example, hooks can be placed before or after a Connector, or in consequence of a specific event such as an update failure or a read success. IBM Tivoli Directory Integrator automatically calls these user-defined functions as the AssemblyLine runs.

The majority of the scripting in IBM Tivoli Directory Integrator takes place in the Hooks. For example, hooks can be used to build custom logic, to handle global variables, and to set up specific error-handling processes. A complete list of all hooks can be found in "Chapter 2, IBM Tivoli Directory Integrator concepts, Hooks, List of Hooks" on page 60 of the IBM Tivoli Directory Integrator 6.0: Users Guide, SC32-1718.

3.2.6 Scripts

A key capability of IBM Tivoli Directory Integrator is the ability to extend virtually all of its integration components, functions, and attributes through scripts or Java. Scripting can be used anywhere in the system to add to or modify the components of an AssemblyLine. Connectors, Parsers, EventHandlers, and Hooks can be customized in order to perform requested tasks. Scripts are commonly used to map attributes, transform data, access libraries (for example, to call Java classes), handle errors, control data flow, and in general to add business logic.
Directory Integrator supports JavaScript as its plug-in scripting language, along with extensive script libraries.
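As a flavor of the kind of attribute mapping such scripts typically perform, here is a self-contained JavaScript sketch. It uses plain objects rather than the Directory Integrator work Entry API, and the mail-address convention is an assumption for illustration:

```javascript
// Illustrative attribute-mapping script: derive output attributes from an
// input entry the way a map/transform script in a hook typically would.
function mapEntry(input) {
  return {
    uid: input.uid.toLowerCase(),                     // normalize the key
    cn: input.givenName + " " + input.sn,             // build the common name
    mail: input.uid.toLowerCase() + "@example.com"    // assumed mail convention
  };
}

var source = { uid: "JDoe", givenName: "John", sn: "Doe" };
var mapped = mapEntry(source);
console.log(mapped.cn);   // John Doe
console.log(mapped.mail); // jdoe@example.com
```

In a real solution the same transformation logic would live in an Attribute Map or a hook script, operating on the work Entry instead of a plain object.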
3.2.7 Function components

A Function component is an AssemblyLine wrapper around some function or discrete operation, allowing it to be dropped into an AssemblyLine as well as instantiated and invoked from a script. The idea behind Function components is to allow complex components (for example, the Web Services EventHandler) to be split into smaller logical units and then strung together as needed, as well as to provide more visual helper objects where custom scripting was necessary before. Function components also offer the functionality previously provided by EventHandler Actions (for example, launching AssemblyLines, invoking Parsers, and so on). As with all Tivoli Directory Integrator components, users can easily create their own scripted Function components, turning custom logic into a library of reusable AssemblyLine components.

Function components are similar to Connectors in CallReply mode in that they have both Input and Output Maps. The Output Map is used to pass parameters to the Function component, while the Input Map lets you retrieve and manipulate return data.

myFunction.callreply( work )

The above example invokes the AssemblyLine Function called myFunction. Note that calling the AssemblyLine Function method callreply() will cause Attribute Maps and the normal Function component Hook flow to be executed.

Like the other components, Function components have a library folder in the Config Browser where you can configure and manage your Function component library. These can then be dragged into AssemblyLines or chosen from the selection drop-down that appears when you press the Add component button under the AssemblyLine Connector List.

Also like other components, Function components have an interface part (like the Connector interface or Parser interface; in the case of Function components it is called the Function interface) that implements the function logic.
When a Function component is dropped into an AssemblyLine, it is wrapped in an AssemblyLine Function object that provides the generic functionality necessary for the AssemblyLine to manage and execute it.

Also like Connectors, Function components have a state that can be set to active, passive, or disabled. State behavior is identical to that of Connectors. Since Function components are registered as script variables (beans) when the AssemblyLine starts up, you can access them directly from your script using the name given to them in the AssemblyLine.
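The call/reply pattern used by Function components (Output Map supplies call parameters, Input Map merges the reply back into the work entry) can be sketched in plain JavaScript. This is not the Directory Integrator API; the function, maps, and lookup table below are all hypothetical:

```javascript
// Illustrative sketch of the Function component call/reply pattern:
// the output map selects which work-entry attributes become call
// parameters, and the input map merges the reply back into the work entry.
function callReply(fn, outputMap, inputMap, work) {
  var params = {};
  outputMap.forEach(function (a) { params[a] = work[a]; }); // Output Map
  var reply = fn(params);                                   // invoke function
  inputMap.forEach(function (a) { work[a] = reply[a]; });   // Input Map
  return work;
}

// A hypothetical function: resolve a department name from a department id
function resolveDept(params) {
  var table = { 42: "Engineering", 7: "Sales" };
  return { deptName: table[params.deptId] || "Unknown" };
}

var work = { uid: "jdoe", deptId: 42 };
callReply(resolveDept, ["deptId"], ["deptName"], work);
console.log(work.deptName); // Engineering
```

The point of the pattern is that the function itself stays ignorant of the AssemblyLine; the maps decide what goes in and what comes back.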
3.2.8 Attribute Map components

This component lets you define Attribute transformations as freestanding Attribute maps that can be stored in your component Library and dropped into your AssemblyLine. Adding new Attributes to the work Entry and other data manipulation can be quickly performed using the Attribute Map component, which defines a mapping from the work Entry to itself, allowing you to create new Attributes as well as change existing ones. All Attributes defined in Attribute Map components are displayed in the work Entry list as well, easing maintenance and support for the Config.

3.2.9 Branch components

Analogous to the old EventHandler Conditions, Branches allow the user to define alternate routes in an AssemblyLine. This means that AssemblyLines no longer need to be simple, unidirectional data flows. Branches mean that a single AssemblyLine can handle solutions that previously required a collection of AssemblyLines.

The Branch provides an interface that allows you to define Simple Conditions based on Attributes in the work Entry object. Multiple Conditions are ANDed or ORed, depending on the Match Any checkbox setting. After Simple Conditions are processed, there is a script editor window at the bottom of the Branch details page where you can create your own Condition in JavaScript. The syntax here is the same as it was for EventHandler Conditions, in that you must populate ret.value with either a true or false value in order to control the outcome of Condition evaluation. Scripted Conditions can be combined with Simple ones, or used exclusively. If a Condition evaluates to true, then all components attached to the Branch are executed.

Note: Once Branch component execution is complete, control is passed to the first component appearing in the AssemblyLine Component List after the Branch.
Since Branches only implement simple IF logic, should you need an IF-ELSE construct you must use two Branches: one with your IF test, and the other with a complementary set of Conditions (for example, IF NOT...).
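The AND/OR semantics controlled by the Match Any checkbox can be illustrated with a short, self-contained JavaScript sketch (not the Directory Integrator API; the condition format is our own):

```javascript
// Illustrative sketch of Branch condition evaluation: Simple Conditions
// over work-entry attributes are ANDed or ORed depending on the
// "Match Any" setting.
function evaluateBranch(conditions, matchAny, work) {
  var results = conditions.map(function (c) {
    return work[c.attr] === c.value;
  });
  return matchAny
    ? results.some(function (r) { return r; })   // Match Any: OR semantics
    : results.every(function (r) { return r; }); // otherwise: AND semantics
}

var work = { status: "active", country: "US" };
var conds = [
  { attr: "status", value: "active" },
  { attr: "country", value: "DE" }
];

console.log(evaluateBranch(conds, false, work)); // false (ANDed)
console.log(evaluateBranch(conds, true, work));  // true  (ORed)
```

A scripted Condition would simply replace one of the comparison functions with arbitrary JavaScript whose result feeds the same true/false decision.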
3.2.10 Loop components

The Loop component provides functionality for adding cyclic logic within an AssemblyLine. Loops can be configured for three modes of operation:

1. Conditional

Here you can define Simple and/or Scripted Conditions that control looping. The details window for this type of Loop construct is the same as for the Branch component described in the previous section.

2. Connector

This method lets you set up a Connector in Iterator or Lookup mode, and will cycle through your Loop flow for each Entry returned. This is the preferred way of dealing with multiple entries found by a Lookup. The Details pane of this type of Loop contains the Connector tabs necessary to configure it, connect and discover attributes, and set up the Input Map. Note that you have a parameter called Init Options where you can instruct the AssemblyLine to either:

– Do nothing, which means that the Connector will not be prepared in any way between AssemblyLine cycles.
– Initialize and Select/Lookup, causing the Connector to be re-initialized for each AssemblyLine cycle.
– Select/Lookup only, keeping the Connector initialized, but redoing either the Iterator select or the Lookup, depending on the Mode setting.

Note also that there is a Connector Parameters tab that functions similarly to an Output Map, in that you can select which Connector parameters are to be set from work Attribute values.

3. Attribute Value

By selecting any Attribute available in the work Entry, the Loop flow will be executed for each of its values. Each value is passed into the Loop in a new work Entry attribute named in the second parameter. This option allows you to easily work with multi-valued attributes, like group membership lists or e-mail addresses.

3.2.11 Password synchronization

The password synchronization feature, which is more a module than a component, can be very useful when designing an AssemblyLine whose goal is to synchronize passwords.
Password synchronization can be accomplished by treating passwords as any other attributes and using Connectors as shown in the previous sections. However, this module provides enhanced security for this critical data. The Chapter 3. Directory Integrator component structure 65
  • 83. password intercept module is available only for certain platforms, such as Microsoft Active Directory, IBM Lotus Domino, and RACF. When a user attempts to change a password using the traditional tools, this module intercepts password changes before they are completed. While the password change to the target repository is completed with the native methods, the intercepted new password is temporarily stored in a repository such as an LDAP server or an MQ queue. Then Directory Integrator uses an EventHandler to propagate the new password to other repositories that contain user accounts. Because the password is intercepted before it is actually changed, error handling is possible. Figure 3-7 shows what happens when a user changes the Windows Domain password. The password synchronization module hooks an exit provided by the Windows Operating System to intercept and validate password changes. The module stores the two-way-encrypted new password in the LDAP directory in the ibmDIKey attribute for the user’s entry. If no entry for the user exists in the container, one will be created. The LDAP Changelog Event Handler listens to the Directory Server Changelog and starts an AssemblyLine when a change notification is received. password catch Active Active Directory modify Directory password process password store LDAP password store Target Target Target Directory File Database LDAP Directory EventHandler Integrator AssemblyLine Figure 3-7 Password interception with Active Directory Security is a strong point of password synchronization modules: The password interceptor encrypts the new password with a two-way algorithm before sending it to the data store. Furthermore, SSL can be added to this communication. In66 Robust Data Synchronization with IBM Tivoli Directory Integrator
In general, IBM Tivoli Directory Integrator provides high security in this module and in all of its parts. In IBM Tivoli Directory Integrator, multiple password synchronization plug-ins can share the same MQ queues, simplifying setup and maintenance of multi-domain password synchronization solutions.

3.3 Security capability

Directory Integrator supports distributed environments through a wide range of communication modes, including TCP/IP, HTTP, LDAP, JDBC, and Java Message Service (JMS)/message queuing (MQ). SSL and other encryption mechanisms can be added to any of these methods to secure the information flow. Additionally, the graphical interfaces (IDE and AMC) can be configured to be accessed over SSL. SSLv3 encrypts communications on the wire. The Java Cryptography Extension (JCE) opens a wide range of security capabilities, such as encrypting information in communications and storage, X.509 certificates, and key management to integrate with PKI efforts in the enterprise. The AMC supports client certificate authentication, and access rights to the IBM Tivoli Directory Integrator configuration can be defined per user. The configuration file can optionally be encrypted by the IBM Tivoli Directory Integrator server using a server certificate. The Configuration Editor accesses such configurations in remote mode.

In the previous sections we introduced the base components and showed that a wide range of data sources are supported. We just saw that communication between different systems can be encrypted. With these elements, hundreds of different solutions can be set up to fit different requirements.
In the following section we show some general architectural concepts and some examples.

3.4 Physical architecture

IBM Tivoli Directory Integrator can be presented through a number of use cases that illustrate the technical capabilities and some of the solutions that can be architected, but we cannot show all possible architectures with all of the different data sources and data flows. So we introduce some general considerations about the use of an enterprise directory and some basic structures of data flow, not as a comprehensive list, but as frameworks and mental structures for further development by the creative mind.
3.4.1 Combination with an enterprise directory

There are two major metadirectory models, or approaches to integrating existing enterprise data stores and building an authoritative source for identity information:

- Metaview, which introduces one main central directory store where all data is aggregated, and then synchronizes and publishes data from there back to all other authoritative repositories.
- Point-to-point synchronization, which avoids the central repository and configures event-driven automatic data flows and reconciliation between the repositories, based on business rules and technical requirements.

Metadirectories are often used to accomplish the following goals:

- Create a single enterprise view of users from attributes stored in network services.
- Enforce business rules that define the authoritative source for attribute values.
- Handle naming and schema discrepancies.
- Provide data synchronization services between information sources.
- Enable network and security administrators to manage large, complex networks.
- Simplify the management of user access to corporate resources.

As the foundation for a metadirectory solution, IBM Tivoli Directory Integrator supports both approaches and provides a means of managing information that is stored in multiple directories. It provides Connectors for collecting information from many operating system and application specific sources and services, as well as for integrating the data into a unified namespace. It can provide a central enterprise directory, as well as integrate distributed directories directly. By design, IBM Tivoli Directory Integrator seems especially suited for the second approach. As a metadirectory, it extends the directory with services for managing information that is stored in multiple directories. It acts as the hub for making changes between the disparate systems, and it has a number of facilities that enable it to act as the agent for change on these disparate systems.
A scenario based on this architecture is shown in Figure 3-1 on page 43. The important design decision concerns the authoritative data repository; after that, it is a matter of defining the data flows for each AssemblyLine. There are two possibilities for the implementation of a centralized enterprise directory. The architecture can have one directory with different authoritative data sources for different identity information, as shown in Figure 3-8 on page 69, or you can define your central directory as the authoritative data source.
In this case, all of the data flows have to be configured in a way such that the central directory server is the prime source for all identity information within the integrated environment. For our scenario depicted in Figure 3-8 we would have to change the arrows to allow data flows only from the enterprise directory to the other repositories. This means that data is essentially managed only on one directory server, and then IBM Tivoli Directory Integrator propagates any changes to the other repositories.

Figure 3-8 Scenario with an enterprise directory

The choice between the solutions depends on the company requirements and structures. There are no technical issues that favor one or the other approach. Mainly it is a matter of choosing the authoritative source for your identity information and considering management, security, privacy, economic, and risk issues.

Regardless of the choice you make, the basic element for identity data integration is data flow. To architect an integrated and reliable identity infrastructure, several data flows must be implemented. Therefore, in a typical solution design you must determine:

- How does information flow between systems?
- When does information flow between systems?
- What data and schema transformations are required?

In the next section we discuss different topologies available for data flows.
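Before looking at topologies, the per-attribute flow rules implied by these questions can be sketched in JavaScript. This is purely illustrative and not a Directory Integrator API: the repository objects and attribute names loosely echo Figure 3-8, and the rule table (target attribute, authoritative source, optional transformation) is an assumption about how such rules might be modeled.

```javascript
// Illustrative model: each enterprise-directory attribute is taken only
// from its authoritative source, with an optional transformation applied.
// Repository layouts are assumptions echoing Figure 3-8, not product data.

const hrDatabase = { lastName: 'Kent', firstName: 'Clark', title: 'Reporter', empNo: '1234' };
const notesNab   = { notesId: 'Clark Kent/Metropolis', tel: '555-5555', city: 'Metropolis' };
const ntDomain   = { userId: 'clarkk', password: 'jf!9' };

// Flow rules: target attribute -> { authoritative source, attribute, transform? }
const flowRules = {
  sn:              { source: hrDatabase, attr: 'lastName' },
  givenName:       { source: hrDatabase, attr: 'firstName' },
  title:           { source: hrDatabase, attr: 'title' },
  telephoneNumber: { source: notesNab,   attr: 'tel' },
  uid:             { source: ntDomain,   attr: 'userId', transform: (v) => v.toLowerCase() },
};

// Build the enterprise-directory entry by applying every flow rule.
function buildEnterpriseEntry(rules) {
  const entry = {};
  for (const [target, rule] of Object.entries(rules)) {
    const value = rule.source[rule.attr];
    entry[target] = rule.transform ? rule.transform(value) : value;
  }
  return entry;
}

const enterpriseEntry = buildEnterpriseEntry(flowRules);
console.log(enterpriseEntry.sn, enterpriseEntry.uid); // Kent clarkk
```

The point of the table is the design decision it makes explicit: every attribute has exactly one authoritative source, which answers the "how" question above for each piece of identity information.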
3.4.2 Base topologies

In this section we present some topologies that can be used to architect more complex solutions. For every topology, we identify a data source, a flow, and a destination. In the following examples, each element is drawn in a separate box. This is just a logical separation. From the physical point of view some of these elements might reside on the same machine. For instance, it is quite common to place a Directory Integrator server on the same machine as its data source. The decision of whether to use different servers depends only on performance and availability.

One-to-one

We begin with the easiest scenario, shown in Figure 3-9. Data exists in a file that must be synchronized, transformed, and maintained in a directory. This file could be updated regularly by an HR application or other enterprise systems.

Figure 3-9 One-to-one integration

A wide range of file formats can be accommodated for the input file. The selection of the file format is defined in the input Connector, mostly configured in Iterator mode. Different ways are available to manipulate and filter the input data stream, such as using the Parser or different scripting methods. A separate output Connector is established to the directory. Directory Integrator discovers the attributes in the file and enables mapping to attributes in the directory, as well as applying transformation rules to modify the content of the incoming data. The file can be read at regular intervals, or read whenever Directory Integrator discovers that it is available. The outside application may also trigger Directory Integrator to read the file at its own leisure.

Many-to-one

The second scenario is shown in Figure 3-10 on page 71. Data exists in multiple related systems that have to be synchronized, transformed, and maintained in a directory.
Different attributes of data must be joined before an update to the directory can take place.
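The many-to-one join just described can be sketched as follows. The record layouts and the empNo join key are illustrative assumptions, not taken from the product; the sketch only shows the principle of joining attributes from several sources into one complete directory entry before the update.

```javascript
// Illustrative many-to-one join (assumed layouts, not product code):
// records from three sources are joined on a shared employee number,
// and only complete entries are passed on to the directory update.

const hrRecords   = [{ empNo: '1234', name: 'Clark Kent' }];
const mailRecords = [{ empNo: '1234', mail: 'ckent@dailyplanet.example' }];
const fileRecords = [{ empNo: '1234', dept: 'Editorial' }];

// Index a record list by the join key for constant-time lookup.
const byKey = (rows) => new Map(rows.map((r) => [r.empNo, r]));

function joinSources(hr, mail, file) {
  const mailIdx = byKey(mail), fileIdx = byKey(file);
  const joined = [];
  for (const person of hr) {
    const m = mailIdx.get(person.empNo), f = fileIdx.get(person.empNo);
    if (m && f) {
      joined.push({ ...person, mail: m.mail, dept: f.dept }); // complete entries only
    }
  }
  return joined;
}

const entries = joinSources(hrRecords, mailRecords, fileRecords);
console.log(entries[0].mail); // ckent@dailyplanet.example
```

Treating one source (here HR) as the driver and the others as lookup targets mirrors the statement below that data from one system may be used to look up information in another.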
Figure 3-10 Many-to-one integration

Connections are established to each data source using input Connectors. Schemas in databases are automatically detected. Rules may be created that describe how attributes from one source are used with attributes from other systems to create the desired results. Information from the data sources can be combined in any way and mapped to the directory. Administrators can select the authoritative source for each piece of information. Data from one system may be used to look up information in another.

IBM Tivoli Directory Integrator can detect changes in real time within certain directories, allowing immediate update of other connected systems. Connections may be configured to look up only data that has been modified within a certain time frame, or data sets that conform to specific search criteria.

One-to-many

A one-to-many scenario is the opposite of that described in the previous example. Information updated in one source is propagated to many destinations. Directory Integrator can perform exactly the same write, update, delete, and create modifications on all connected systems as it does for directories. The rules are simply adapted for the context. Now all systems can share the common authoritative data set.

In this third scenario, presented in Figure 3-11 on page 72, we introduce bidirectional flows. Bidirectional flows can be configured such that there is either only one authoritative data source for each piece of information or concurrent authoritative sources for the same data. In the second case the data in the directory is provisioned from multiple connected systems as well as from possible modifications done by applications connected to the directory. The connected systems could have great interest in this data, especially when Directory Integrator ensures that they always operate on the correct information by updating them whenever the authoritative data changes.
By configuring the Connectors, using Hooks and scripting, administrators can apply rules to define and monitor the flows. However, we recommend being careful with multiple data sources for the same piece of information. A good idea is to have only one point where specific data can be modified. This is not a technical issue, because Directory Integrator easily allows multiple data sources. It is a matter of implementing clear processes and data flows. On the other hand, it is common and often advisable to have sources for specific data on different systems. For example, in Figure 3-11, users could modify their e-mail address or preferences only in the e-mail database, while they could change their password only with an application that directly interacts with the directory.

Figure 3-11 One-to-many integration

Other data resources

There are many reasons why data flows through channels such as message queuing, HTTP, e-mail, FTP, and Web Services. Data might need to pass through firewalls that block protocols like LDAP and database access. Security, high availability, transaction control, and the desire for asynchronous or synchronous data transfer are other reasons. It is important to understand that Directory Integrator can both send and receive with these mechanisms. This creates a wide scope of solution opportunities, too wide to describe in simple use cases. Some examples are illustrated in Figure 3-12 on page 73.
Figure 3-12 Other data sources integration

3.4.3 Multiple servers

In the scenarios shown so far, there is only one IBM Tivoli Directory Integrator server. In this section we present some topologies with multiple server instances.

Distributed

In a distributed architecture, a single point of integration is often undesirable, for reasons such as distance, cost, security, availability, or governance. All of the mechanisms described previously, such as IP, HTTP, Web Services, e-mail, MQ, and others, can be used to communicate between instances of IBM Tivoli Directory Integrator. In Figure 3-13 on page 74 the arrows indicate the use of such communication mechanisms in two examples. In the first example the input stream is too fast compared to the business rules that IBM Tivoli Directory Integrator has to execute, so multiple instances can operate on a queue. In the second example a two-way architecture propagates updates in the directory to the rest of the enterprise and consolidates local modifications back to the central directory.
Figure 3-13 Distributed integration

Federated

While similar to the distributed scenario, federated implies that control and management are not entirely centralized. This could be business units or entities that cooperate, but want to retain local control over how and what information is shared with others. By sharing certain parts of the Directory Integrator configuration, Directory Integrator servers have access to shared transports, formats, and business rules. An example is the scenario shown in Figure 3-14 on page 75, where different business units retain local control over information shared with others. Local configuration allows administrators to set restrictions on the data sets that are exposed, the attributes that are sent and received, as well as any local transformation rules that need to be applied to the data going to or coming from the other participants. If a company is spread across multiple sites, it could be beneficial to have an IBM Tivoli Directory Integrator server in each location and then to have data flows only between these servers.
Figure 3-14 Federated integration

The main message in this section is that IBM Tivoli Directory Integrator enables you to use any topology and different transport mechanisms to integrate data stored in various formats on multiple disparate systems. In the following section we introduce another level of complexity by using multiple servers to implement high availability and increase performance.

3.5 Availability and scalability

High availability means that the IT service is continuously available to the customer, with little or no downtime and rapid service recovery. The achieved availability can be measured with availability metrics. The availability of the service depends on:

- Complexity of the infrastructure architecture
- Reliability of the components
- Ability to respond quickly and effectively to faults

There are several high availability mechanisms inside IBM Tivoli Directory Integrator on various levels, from Connectors and AssemblyLines to the Server itself. Let us take a brief look at some of them.
Automatic connection reconnect

AssemblyLines need to access remote servers. Ideally, those remote servers should be online and available for the entire time the AssemblyLine is running. In the real world, however, server and network failures are common. IBM Tivoli Directory Integrator has an automatic reconnect feature. This is sufficient for short-term outages, where the AssemblyLine can just try to reconnect until it succeeds. You can configure this in the Connector's Reconnect sub-tab as shown in Figure 3-15.

Figure 3-15 Automatic connection reconnect

The parameters you need to provide are:

- Auto reconnect enable - The master switch for the reconnect functionality for this Connector. Check to enable.
- Number of retries - The number of times the Connector tries to re-establish the connection once it fails. The default is 1. When the number of retries is exceeded, an exception is thrown.
- Delay between retries - The number of seconds to wait between successive retry attempts. The default is 10 seconds.

This also means that AssemblyLine Connectors have a new reconnect() method that can be called from your script as needed. If a connection is lost, control passes to the On Connection Failure Hook if enabled. This Hook is available in all Connector modes. Once the Hook completes (or is skipped if not enabled), the system then checks whether Auto Reconnect has been enabled for this Connector. If it is, then this feature is invoked; otherwise control is passed to the Error Hooks as normal.
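The retry semantics of these two parameters can be sketched as follows. This is an illustrative model, not product source code; the sleep function is injected (and defaults to a no-op) purely so the sketch stays testable without real delays.

```javascript
// Illustrative model of the reconnect semantics described above:
// retry the connection up to "Number of retries" times, waiting
// "Delay between retries" between attempts; when the retries are
// exceeded, an exception is thrown.

function reconnect(connectFn, numberOfRetries = 1, delaySeconds = 10, sleep = () => {}) {
  for (let attempt = 1; attempt <= numberOfRetries; attempt++) {
    try {
      return connectFn();                   // retry the connection
    } catch (err) {
      if (attempt === numberOfRetries) {
        throw new Error('reconnect retries exceeded');
      }
      sleep(delaySeconds);                  // wait between retry attempts
    }
  }
}

// Usage: a data source that is down for two attempts, then recovers.
let attempts = 0;
const flakySource = () => {
  if (++attempts < 3) throw new Error('connection refused');
  return 'connected';
};
console.log(reconnect(flakySource, 5, 10)); // connected
```

With the default of a single retry, one failed attempt already raises the exception, which is why raising Number of retries matters for anything but the briefest outages.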
Typical use of the On Connection Failure Hook is to write some message to the log, or even change Connector parameters, for example, pointing it to some backup data source. However, since reconnect may not be implemented for a Connector you are using, you can simulate reconnect yourself in the On Connection Failure Hook by terminating and then re-initializing the Connector with script code.

Note: If you do not want the Connector to Auto Reconnect after invoking the On Connection Failure Hook, you must either disable Auto Reconnect, or redirect flow by throwing an exception (with calls like system.retryEntry() or system.skipEntry()) or by stopping the AssemblyLine itself with system.abortAssemblyLine( message ).

Directory Integrator enables the user to checkpoint the operation of AssemblyLines and restart them from the point where they were interrupted by either a controlled or uncontrolled shutdown through the Checkpoint/Restart framework.

Checkpoint/Restart

Checkpoint/Restart is not supported in AssemblyLines containing a Connector in Server mode, an Iterator mode Connector with Delta enabled, an AssemblyLine using the Sandbox facility, or a conditional component like a Branch or Loop. The server aborts the AssemblyLine when this is discovered.

The Checkpoint/Restart framework stores state information and other parameters at various points during AssemblyLine execution, enabling the server to reinstate the running environment of the AssemblyLine so that it can be restarted in a controlled way. This can be on the original server, but potentially also on a different machine. The ability to restart an AssemblyLine is one of the building blocks for failover functionality.

Note: IBM Tivoli Directory Integrator is not a system that provides general failover functionality straight out-of-the-box.
Rather, it has a framework that provides generic building blocks for this kind of functionality, and can in this way reduce the amount of hand-coding that might otherwise be required. Be aware, though, that the framework does not implement full checkpoint and restart functionality at the click of a mouse. Some thought as to how it is applied to the business problem at hand is essential.

See the IBM Tivoli Directory Integrator 6.0: Users Guide, SC32-1718 for more Checkpoint/Restart details.
Failover Services (FoS)

Failover Services is an error management mechanism for IBM Tivoli Directory Integrator components. It enables the monitoring of AssemblyLine execution and allows the Administration and Monitor Console administrator (learn more about the AMC in 3.7, "Administration and monitoring" on page 84) to set up alternate actions to be performed on the detection of component failure. You can see an example setup window in Figure 3-16.

Figure 3-16 FoS setup

For more FoS details see the IBM Tivoli Directory Integrator 6.0: Administrator Guide, SC32-1716.

Automatic high availability

The basic concept of high availability is to have at least two servers capable of performing the same job and a failover mechanism to switch from one server to the other if one of the servers fails. IBM Tivoli Directory Integrator does not provide such a failover mechanism out-of-the-box. Therefore, one way to provide automatic high availability is to implement an architecture as shown in Figure 3-17 on page 79, where one IBM Tivoli Directory Integrator Server instance is configured to watch the other just in case and can take over if the second one fails to respond.
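A minimal sketch of such a just-in-case watcher follows, assuming a heartbeat the standby instance can poll. The heartbeat transport (a simple boolean here) and the missed-beat threshold are assumptions for illustration, not product features; in practice the check could go over HTTP, MQ, or a shared database row.

```javascript
// Hypothetical just-in-case watcher (assumed design, not product code):
// the standby server polls the primary's heartbeat and takes over the
// data flows after a configurable number of missed beats.

function makeStandby(missedBeatsAllowed) {
  let missed = 0;
  let active = false;       // false = standby, true = has taken over
  return {
    checkHeartbeat(primaryAlive) {
      if (primaryAlive) {
        missed = 0;          // primary is healthy, stay on standby
      } else if (++missed >= missedBeatsAllowed) {
        active = true;       // take over: start the same Config's AssemblyLines
      }
      return active;
    },
  };
}

const standby = makeStandby(3);
standby.checkHeartbeat(true);   // primary responding
standby.checkHeartbeat(false);  // one missed beat
standby.checkHeartbeat(false);  // two missed beats
console.log(standby.checkHeartbeat(false)); // true: standby takes over
```

Requiring several consecutive missed beats before failing over is a common way to avoid taking over during a transient network glitch rather than a real server failure.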
Figure 3-17 Just-in-case high availability

The other possible way to achieve a highly available automatic failover mechanism is to install the server in a cluster environment such as HACMP™ for AIX, as shown in Figure 3-18.

Figure 3-18 Clustering

However, remember that all AssemblyLine definitions and configurations are stored within one highly structured XML file called the Config. Therefore, if one server fails, it is sufficient to start a separate server with the same Config file in order to continue the service. IBM Tivoli Directory Integrator's main goal is to perform data integration, not real-time services. This means that a short period of unavailability (for example, for maintenance reasons) can be tolerated in most cases.

A failover mechanism must be configured between the two servers, depending on the functional requirements of the data integration environment.

Scalability is a strong feature of IBM Tivoli Directory Integrator. There is virtually no limit to the number of servers that can be added. As already shown in Figure 3-13 on page 74, different servers can work on different data flows or on different data of the same data flow.
Considering the AssemblyLine mechanisms, no additional effort is required to integrate multiple servers. Each AssemblyLine is designed to work on different data. Different AssemblyLines integrate different data sources regardless of whether these AssemblyLines reside on the same server or on multiple servers.

AssemblyLine Pool

With the AssemblyLine Pool you can build high performance solutions that do not incur connection costs to the target systems for each processed event. Also, the AssemblyLine Pool automatically enables an AssemblyLine to service a number of simultaneous requests instead of executing the requests serially. You can configure Pool options from the Show Dialog button next to Define ALPool Options on the Config tab of an AssemblyLine, as shown in Figure 3-19.

Figure 3-19 AL Pool

The parameters you need to provide are:

- Number of prepared instances - How many instances of the Flow part of this AssemblyLine to instantiate, power up, and then keep in the Pool, ready for use.
- Maximum concurrent instances - The maximum number of concurrent Flow instances that you want created at any one time.

Note: Pooling is only available if you have a Server mode Connector in the Feeds section of your AssemblyLine.

See the IBM Tivoli Directory Integrator 6.0: Users Guide, SC32-1718 for more ALPool details.

3.6 Logging

IBM Tivoli Directory Integrator enables you to customize and size logs and outputs. It relies on log4j as its logging engine. Log4j is a very flexible framework
that lets you send your log output to a variety of different destinations, such as files, the Windows EventLog, UNIX Syslog, or a combination of these. It is highly configurable, supports many different types of log appenders, and can be tuned to suit most needs. It can be a great help when you want to troubleshoot or debug your solution. In addition to built-in logging, script code can be added in AssemblyLines to log almost any kind of information. If the logging functionality will not suffice, there are additional tracing facilities.

The log scheme for the server (ibmdisrv) is described by the file log4j.properties and elements of the Config file, while the console window you get when running from the Config Editor (ibmditk) is governed by parameters set in a separate properties file. Logging for the Config Editor program itself is configured in its own properties file.

Note: Any of the aforementioned properties files can be located in the Solutions Directory, in which case the properties listed in these files override the values in the corresponding file in the installation directory.

You can create your own appenders to be used by the log4j logging engine by defining them in the properties file. Additional log4j compliant drivers are available on the Internet, for example, drivers that can log using JMS or JDBC. In order to use those, they need to be installed into the IBM Tivoli Directory Integrator installation jars directory, after which appenders can be defined using those additional drivers.

Logging of IBM Tivoli Directory Integrator is done globally, using the properties files and/or External Properties, or specifically, using the ibmditk tool, for each AssemblyLine, EventHandler, or Config File as a whole. Logging for individual AssemblyLines and EventHandlers is applied in addition to any specification done at the Config level. To provide this level of flexibility and customization, the Java log4j API is used.

All log configuration windows operate in the same way: For each one you can set up one or more log schemes.
These are active at the same time, in addition to whatever defaults are set in the global properties files. In Figure 3-20 on page 82 you can see an example of the Syslog scheme, which enables IBM Tivoli Directory Integrator to log to the UNIX Syslog.
Figure 3-20 Syslog scheme

See the IBM Tivoli Directory Integrator 6.0: Administrator Guide, SC32-1716 for details on scheme configuration.

Key data is logged from the Directory Integrator engine, from its components (Connectors, Parsers, and so on), as well as from user scripts. Almost every Connector has a debug parameter called Detailed Log, with which you can turn the Connector's output to the log file on and off. Seven log levels, ranging from ALL to OFF, size the output. ALL logs everything. DEBUG, INFO, WARN, ERROR, and FATAL have increasing levels of message filtration. Nothing is logged on OFF.

Note: IBM Tivoli Directory Integrator logmsg() calls log on INFO level by default. This means that setting the log level to WARN or a more restrictive level silences your logmsg calls as well as all Detailed Log settings. However, the logmsg() call also has a level parameter that can be used to override the log level for individual logmsg() calls.

In order to augment the IBM Tivoli Directory Integrator built-in logging, you can create your own log messages by adding script code in your AssemblyLine. Different information can be dumped, such as the content of an object or attribute, the state of a Connector, or any desired text. This means that you can report to the log file or to the console any state of the custom logic of your AssemblyLines. See the IBM Tivoli Directory Integrator 6.0: Users Guide, SC32-1718, for logging details and examples.
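The level-filtering rule described above can be sketched in a few lines of JavaScript. This is an illustration of the filtering principle, not the log4j implementation: messages pass only when their level is at least as severe as the configured level, a logmsg-style call defaults to INFO, and OFF silences everything.

```javascript
// Illustrative sketch of log level filtering (not the log4j implementation).
// Higher numbers are more severe; a message is kept only if its level is at
// least as severe as the configured level.

const LEVELS = { ALL: 0, DEBUG: 1, INFO: 2, WARN: 3, ERROR: 4, FATAL: 5, OFF: 6 };

function makeLogger(configuredLevel) {
  const lines = [];
  return {
    lines,
    // Mirrors the Note above: logmsg defaults to INFO; the optional level
    // parameter overrides it for an individual call.
    logmsg(message, level = 'INFO') {
      if (configuredLevel !== 'OFF' && LEVELS[level] >= LEVELS[configuredLevel]) {
        lines.push(`${level}: ${message}`);
      }
    },
  };
}

const log = makeLogger('WARN');
log.logmsg('starting AssemblyLine');    // INFO, filtered out at WARN
log.logmsg('lookup failed', 'ERROR');   // severe enough, passes
console.log(log.lines.length); // 1
```

This is why a default-level logmsg goes silent at WARN: its INFO severity falls below the configured threshold, while an explicit level parameter of ERROR or FATAL still gets through.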
Note: Errors from Attribute Map Components do not show the name of the Attribute Map Component, only the name of the AssemblyLine and often (depending on the error) the name of the attribute being mapped. The message will often contain the name of the attribute that is mapped, which should give you a hint as to which Attribute Map it is that fails.

Debugging

In addition, IBM Tivoli Directory Integrator offers you a Flow Debugger (not to be confused with a script debugger). The Flow Debugger lets you step through your AssemblyLines and examine and change variables and/or run script directly. An example of Flow Debugger usage is shown in Figure 3-21.

Figure 3-21 Flow Debugger

The debugger is started from the Config Editor by clicking the Run Debugger button in the AssemblyLine or EventHandler detail window. Once the selected task is started, the debugger pauses processing at specified breakpoints. Whenever execution is paused, you can use the Evaluate button to display information or run script. There is also an Edit watch list button that offers you the same option; however, the resulting watch list is remembered and evaluated at
each breakpoint. One example of a variable you might want to watch is work (the work Entry object). By entering work in the Evaluate dialog, or adding it to your watch list, you can see work serialized to the Output pane of the debugger.

Note: If you evaluate (or watch) the script task.dumpEntry(work), then the work Entry is dumped to the log output pane instead, just as though you had this code in your solution.

Tracing

In addition to the user-configurable logging functionality described in the previous section, IBM Tivoli Directory Integrator is instrumented throughout its code with tracing statements, using the JLOG framework, a logging library similar to log4j, but which is used inside Directory Integrator specifically for tracing and First Failure Data Capture (FFDC). To what extent this becomes visible to you, the end user, depends on a number of configuration options in the global configuration file and the Server command line option -T. Tracing is done using JLOG's PDLogger object. PDLogger, or the Problem Determination Logger, logs messages in Logxml format (a Tivoli standard), which IBM Support understands and for which they have processing tools.

Note: Normally, you should be able to troubleshoot, debug, and support your solution using the logging options. However, when you contact IBM Support for whatever reason, they may ask you to change some parameters related to the tracing functionality described here to aid the support process.

See the IBM Tivoli Directory Integrator 6.0: Administrator Guide, SC32-1716 for tracing details, configuration, and parameters.

3.7 Administration and monitoring

The Config Editor is a program that gives you a graphical interface to create, test, and debug AssemblyLines with all of their components and optional scripting.
It is an Integrated Development Environment (IDE), introduced in 3.2, "Base components" on page 45, used to create a configuration file that describes your solution; the solution itself is powered by the runtime Server. This configuration is called a Config, hence the name Config Editor. The Config Editor is started by initiating the ibmditk batch file or script, which sets up the Java VM environment parameters and then starts the Config Editor. It enables you to work with multiple Configs at the same time. Configs are stored as highly structured XML documents and can be encrypted. When you start the Config Editor, either from your system's launch interface or from the command line with the ibmditk
command, you will see the Main Panel. In the default Cards layout, the left navigation pane provides a tree view of the current configuration, as well as all the current AssemblyLines, EventHandlers, Connectors, and so forth, as shown in Figure 3-22. AssemblyLines can be created easily by selecting components. The attribute definitions of the connected elements are automatically discovered, and mapping can be done simply by dragging or renaming attributes.

Figure 3-22 Config Editor main panel

See the IBM Tivoli Directory Integrator 6.0: Users Guide, SC32-1718, for details on the Config Editor.

When the AssemblyLines are ready and the integration solution is deployed, administration and monitoring can be performed. Once the integration solution is in maintenance mode, operators need to be able to run AssemblyLines manually. One option is to give operators access to the Config Editor. However, since operators should not modify AssemblyLines, this option violates the principle of least privilege. Another possibility is to let
operators run AssemblyLines from the command line. However, unless they need shell access for a different reason, this also violates the principle of least privilege. In addition, having to remember the commands is not user friendly.

The Administration and Monitor Console (AMC2) is an application for the remote administration and monitoring of IBM Tivoli Directory Integrator servers. It allows operators to perform only the actions they are authorized for, and to do so from a user-friendly Web browser environment.

Note: The principle of least privilege states that users should only be given those permissions they need to do their jobs. For example, operators who do not need to change IBM Tivoli Directory Integrator AssemblyLines should not be allowed to do so.

AMC2 uses the Remote Server API, JavaServer Pages, and Apache Struts. In addition to AssemblyLine monitoring, SSL support, trusted computing base (TCB) awareness, log file cleanup, console user management, and configuration changes, you can also set up connections to multiple IBM Tivoli Directory Integrator server instances and the configuration files running on them. AMC2 communicates with IBM Tivoli Directory Integrator servers over SSL using the Java Security Extensions. It is pre-configured to work with the server that it is bundled with. In order to use AMC2 with servers that use certificates other than the ones they were shipped with, the server certificates need to be added to the AMC2 truststore, and the AMC2 certificate needs to be added to the server truststores.

AMC2 permissions are assigned per Config. This enables IBM Tivoli Directory Integrator to enforce a separation of roles even when the same server is used for multiple purposes in the organization. For example, a server might be used to synchronize both user accounts and office supply information.
If you put all the AssemblyLines related to users in one Config and all the AssemblyLines related to office supplies in another, then operators can have permissions to one but not the other. There are three permission levels in AMC2:

Read - Read-only permission. The user cannot change anything or run anything. This level is useful for auditors and operators in training.

Execute - This level allows users to execute AssemblyLines and EventHandlers, and to view and delete the resulting logs. However, users with execute permission are not allowed to modify or delete any components or component properties. This permission level is for operators.
Admin - This level allows full control of IBM Tivoli Directory Integrator, similar to the control available through the Config Editor.

A sample user-to-Config mapping is shown in Figure 3-23.

Note: The Administration and Monitor Console (v.2) is included in IBM Tivoli Directory Integrator 6.0 and is fully supported, but it provides only a US English interface.

Figure 3-23 AMC2 user to Config mapping

See the IBM Tivoli Directory Integrator 6.0: Administrator Guide, SC32-1716, for details about AMC2 files, setup, and configuration.

3.8 Conclusion
In this chapter we introduced the architecture and components of Tivoli Directory Integrator, which can be used to integrate and reconcile data across multiple repositories on different platforms. Directory Integrator focuses on data rather than users, and it solves complex integration challenges by breaking them into separate, modular, and scalable pieces. IBM Tivoli Directory Integrator enables you to create a consistent infrastructure of enterprise identity data, while permitting local administrators to manage users on each platform and environment with their traditional tools.
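As an illustration of the per-Config permission model described in 3.7, the following sketch models users, Configs, and the Read/Execute/Admin levels. The data structures and function names are our own for illustration; they are not AMC2 code or its API.

```javascript
// Sketch only: a simplified model of AMC2's per-Config permission levels.
// The level names (read, execute, admin) come from the text; everything
// else here is an illustrative assumption, not part of the product.

const LEVELS = { read: 0, execute: 1, admin: 2 };

// Minimum level required for a few representative actions.
const REQUIRED = {
  viewLog: LEVELS.read,
  runAssemblyLine: LEVELS.execute,
  deleteLog: LEVELS.execute,
  modifyComponent: LEVELS.admin,
};

// Per-user, per-Config grants, in the spirit of the user-to-Config
// mapping of Figure 3-23 (hypothetical Config names).
const grants = {
  auditor:  { userSync: 'read', officeSupplies: 'read' },
  operator: { userSync: 'execute' },          // no grant on officeSupplies
  admin:    { userSync: 'admin', officeSupplies: 'admin' },
};

function isAllowed(user, config, action) {
  const level = (grants[user] || {})[config];
  if (level === undefined) return false;      // no grant on this Config at all
  return LEVELS[level] >= REQUIRED[action];
}
```

With this model, an operator can run AssemblyLines in the userSync Config but can neither modify its components nor even view logs of the officeSupplies Config, which matches the separation of roles described above.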
Part 2. Customer scenarios
In Part 2 we provide two solution-oriented scenarios with technical hands-on details.
Chapter 4. Penguin Financial Incorporated
This chapter examines the business requirements, functional requirements, solution design, and implementation phases for a typical directory synchronization scenario brought about by the merger of two financial institutions.

Note: All names and references for companies and other business institutions used in this chapter are fictional. Any match with a real company or institution is coincidental.
4.1 Business requirements
Monolithic Financial, a 108-year-old full-services financial institution located in Bangalore, India, has agreed to be acquired by Penguin Services, a 12-year-old Tulsa, Oklahoma based Internet financial services firm. The announcement was preceded by a multi-million dollar advertising campaign touting the new organization's name of Penguin Financial, with the motto “We can lend anything to anybody”. The industry analysts, while favorable toward the merger, questioned how long it would take for the two companies, with their vastly different backgrounds, infrastructures, and philosophies, to be merged together to provide a full suite of services to the general public.

To silence the critics, Danny Gooch, founder and CEO of the new company, publicly touted the deployment of a new full-services banking application that would be available for general use within 12 months. This new application would finalize the merger of the two organizations. At a press conference, Gooch was quoted as saying that the best and the brightest from both organizations had been brought together to successfully integrate the two companies. When asked by the press how anybody would be found within the new company, he stated that all individuals would be reachable at a single e-mail address. He also boasted that the new company would actually be able to reduce expenses by allowing end users to use a Web page to manage their own identity information. His final boast before entering his car was that they would even have synchronized passwords across the organization, which would further lower user calls to the help desk by the time the merger was complete.

The information technology synchronization team has been tasked with:
1. Developing a synchronized LDAP-based directory for use with the new application. The directory must reflect real-time changes from both organizations' existing infrastructures.
2.
Creating a single e-mail account for all employees.
3. Reducing the expected increase in help desk support costs by providing for users to update their own information via the Web.
4. Providing a corporate security policy that can be applied across the new company.

4.1.1 Current architecture
The current challenges of this business scenario are depicted in Figure 4-1 on page 93.
User account information is distributed across two different user repositories. Account information across these systems may not be consistent at all times. Users have to keep track of passwords across multiple systems. New users added to Active Directory have to be added manually into the Lotus Domino server.

Figure 4-1 Current banking scenario. Current situation: Company A (Penguin Financial) has acquired Company B (Monolithic Financial); Company A uses Lotus Domino and Company B uses Active Directory. Problems: issues in deploying a new banking application across different systems; Company A and Company B use different e-mail mechanisms; account information for the same user across different systems has to be updated manually; users have to keep track of passwords in different systems.

4.2 Functional requirements
We extract functional requirements by mapping business requirements to their underlying reasons. We then expand the reasons in increasing detail. Our functional requirements will tie these low-level reasons for a business
requirement to the IBM Tivoli Directory Integrator capability that will fulfill that business requirement. Let us examine each business requirement and search for its reasons and the resulting functional requirements.

Business requirement 1: Enable a synchronized LDAP directory for use with the new application.
After the acquisition there are two user repositories: Active Directory for users acquired from Monolithic Financial and Domino for Penguin Financial users. Development costs for a new full-services banking application are expected to be high, because access control in the new application would need to be coded for users based on the source user repository, with no cross-reference information between repositories. With an enterprise directory in place, users can modify their own account information, such as passwords, phone numbers, addresses, and so on. This enterprise directory will be accessed using a centralized Web portal with a consistent user interface, thereby providing a consistent and simple user experience irrespective of where the account is located. This leads to our first functional requirement, shown in Table 4-1.

Table 4-1 Functional requirement for an enterprise directory
Requirement A: All users are to be integrated into one common user repository, an LDAP-based enterprise directory.

The user account information has to be kept in synchronization across all attached systems that store any user-related information.

Table 4-2 Functional requirement for synchronization
Requirement B: User information must be kept in synchronization across all systems.

Business requirement 2: Provide a single e-mail account for all employees.
After the merger the users cannot be addressed by a single consistent e-mail address. Moreover, these disjoint e-mail accounts, split across different mail systems, do not convey the impression of a single large company.
All users in Active Directory need to retain their accounts and will also be given a new Lotus Domino account. The original users in Lotus Domino do not need an Active Directory account.
Table 4-3 Functional requirement for a unified mail system
Requirement C: All Active Directory users receive a new Lotus Domino server e-mail account.

Business requirement 3: Reduce the expected increase in help desk support costs by providing users with the ability to update user information via the Web.
The challenge in this situation is the new users from the acquired Monolithic Financial environment: based on functional requirement C they are receiving new accounts for the Lotus Domino mail system, and multiple logins inevitably lead to multiple calls for password resets, which are typically the largest percentage of help desk calls; thus help desk support costs will increase. Users are less likely to forget their passwords if they use the same synchronized password for all of their accounts. The new self-service portlet within the banking application can reduce the burden on system administrators by delegating the ability to request password resets to the end users. Regular password change and synchronization can also be achieved via the portlet; changes are intercepted by IBM Tivoli Directory Integrator, which synchronizes the password with both target systems, Microsoft Active Directory and Lotus Domino. This leads to the next functional requirement, shown in Table 4-4.

Table 4-4 Functional requirement for password synchronization
Requirement D: All users can change and synchronize their passwords via a centralized single self-service portlet.

Another expected side effect for Monolithic Financial users is that user productivity and satisfaction are lowered because they have to log in to the Domino mail system separately. Based on functional requirement D, users will only need one password for all involved systems.
We can even go one step further and allow users from Monolithic Financial to keep changing their password in the common and convenient way they are used to: the Windows Ctrl+Alt+Del mechanism. This leads to the next functional requirement, shown in Table 4-5 on page 96.
Table 4-5 Functional requirement for Windows password change
Requirement E: Monolithic Financial users can change and synchronize their password via the common Windows mechanism.

Business requirement 4: Provide a password-related corporate security policy that can be applied across the new company.
The existing Penguin Financial security policy will be expanded to all new systems, including the new applications, the enterprise directory systems, the password synchronization solution, the Windows password change mechanism, and so on. The password synchronization solution based on functional requirements D and E can satisfy all corporate security policy requirements, including the ones listed below, though special attention to the password-related parts of the existing security policy is required:
– A password policy defining password history, complexity, minimum and maximum password age, and minimum password length is enforced.
– Absolutely no passwords are to be stored and maintained outside of their native password stores at any time.
– Passwords are always encrypted when sent over the network and/or public key infrastructure technology is used, preferably both.

This leads to additional functional requirements, listed in Table 4-6.

Table 4-6 Functional requirements for corporate security policy
Requirement F: Password policy is enforced at all times.
Requirement G: Passwords are not stored and maintained outside of their native stores.
Requirement H: PKI and/or encryption technology is used for passwords sent over any network.

This concludes the functional requirement analysis and allows us to begin designing our technical solution.

4.3 Solution design
In this section we discuss how the solution design objectives can be realized using IBM Tivoli Directory Integrator. Our goal is to produce an implementation plan
containing a phased set of implementation steps where the end result satisfies all functional requirements, and therefore also satisfies the original business requirements.

While business and functional requirements are the main parts of the design objectives, we also have to consider other nonfunctional requirements and constraints. These may include objectives that are necessary to meet general business requirements, or practical constraints on constructing sub-systems. IBM Tivoli Directory Integrator implementations often involve nonfunctional requirements relating to:
High availability and failover
Maintainability and configuration management
Logging and auditing
Archiving and backup
Security
Monitoring

Because we focus on the architecture of directory synchronization with IBM Tivoli Directory Integrator software in this book, we do not look in detail at all of these nonfunctional requirements.

The steps involved in producing an implementation plan are:
1. Prioritize the requirements.
2. Map the requirements to IBM Tivoli Directory Integrator features.
3. Define the phases involved in using those features to satisfy the requirements.

Prioritizing the requirements is important because the priorities are one of the primary factors used to define the phases of the project. It is rare that a directory synchronization solution can be created as a single deliverable satisfying every requirement. It is far more likely that it will be delivered in phases, and the highest priority requirements should be addressed in the earliest phases.

Assigning priorities to the requirements is often difficult because they are all important. You can more easily compare the priorities of requirements by asking questions that gauge the positive and negative impacts of the requirements:
How much money can be saved when the requirement is met?
Are there penalties if the requirement is not met?
Is there a date by which the requirement must be met?
Are there other requirements with dependencies on this one?
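The gauging questions above can be turned into a rough scoring aid. The following sketch is our own illustration, not a method prescribed by the product or this book; the weights are arbitrary and would be tuned per project.

```javascript
// Illustrative only: rank requirements by answers to the impact questions.
// savings and penalty are relative units; dependents counts other
// requirements that depend on this one.

function priorityScore(req) {
  let score = 0;
  score += req.savings || 0;            // money saved when the requirement is met
  score += req.penalty || 0;            // penalty if it is not met
  if (req.hasDeadline) score += 2;      // a hard date raises urgency
  score += 3 * (req.dependents || 0);   // dependencies weigh heavily: they gate phases
  return score;
}

// Hypothetical answers for three of the requirements from this chapter.
const requirements = [
  { id: 'A', savings: 2, penalty: 1, hasDeadline: true, dependents: 2 },
  { id: 'D', savings: 3, penalty: 0, hasDeadline: false, dependents: 1 },
  { id: 'F', savings: 0, penalty: 3, hasDeadline: false, dependents: 0 },
];

// Highest score first: candidates for the earliest project phase.
const ranked = requirements
  .slice()
  .sort((a, b) => priorityScore(b) - priorityScore(a))
  .map(r => r.id);
```

A scheme like this does not replace judgment, but it makes the comparison between "all important" requirements explicit and repeatable.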
After mapping the requirements to IBM Tivoli Directory Integrator features, the requirement priorities and dependencies can be used to decide how to break up the project into phases.

Figure 4-2 on page 99 shows the big picture of the solution design. IBM Tivoli Directory Server is used as the enterprise directory. IBM Tivoli Directory Integrator takes care of user information provisioning and synchronization across the different data sources and targets, including password synchronization.
Figure 4-2 Solution design. Scenario description (Company A has acquired Company B):
1. Company A (Penguin Financial) uses Lotus Notes and Company B (Monolithic Financial) uses Active Directory.
2. Add Company B users to the corporate directory (ITDI).
3. Add Company B users to the Domino server (ITDI).
4. Add Company A users to the corporate directory (ITDI).
5. Allow for bi-directional password synchronization, via the ITDI Password Catcher, from Active Directory and from the portal application (which uses IBM Tivoli Directory Server) to Active Directory, Tivoli Directory Server, and Domino.
6. Allow for selected attribute synchronization from the portal application to Active Directory and Lotus Notes (users update selected attributes and passwords).

Topics covered: basic data synchronization (for example, home phone to and from the corporate LDAP to Notes/AD); Active Directory changes; Tivoli Directory Server changes; Domino changes; the Directory Server Connector (Tivoli Directory Server to and from AD); the Notes Connector; the ITIM agent connection (Tivoli Directory Server to Domino server); schema mapping; bi-directional password synchronization; debugging and troubleshooting; unique names; and the Connector modes used (Update, Lookup, Iterate).

Legend: ITIM - IBM Tivoli Identity Manager; ITDI - IBM Tivoli Directory Integrator; ITDS - IBM Tivoli Directory Server
Project phases
By analyzing the business requirements again, after the functional requirements have been extracted in 4.2, “Functional requirements” on page 93, it is evident that there are some dependencies between individual business requirements. Based on this, and on the complexity involved with the use of multiple data sources and the synchronization of user passwords across these systems, we have decided to implement the project in two phases:

Phase 1: User integration. In the first phase we integrate user account information, including user creation and modification.
Phase 2: Password synchronization. The goal of this phase is to implement password synchronization based on Penguin Financial requirements and policies.

4.3.1 Architectural decisions for phase 1
In this section we discuss the architectural decisions made for phase one. In our scenario we have three different data sources, as shown in Table 4-11 on page 108. There are multiple ways to establish connections to these data sources.

Change detection
For detection of changes in Active Directory we use the Active Directory change log Connector. For detection of changes in Tivoli Directory Server we use the IBM Tivoli Directory Server change log Connector. For detection of changes in the Domino server we use the Domino change detection Connector. The Domino change detection Connector must be deployed on a Windows system where a Lotus Notes client is installed. However, the Connector can connect to a Domino server on any platform.

User registration
User creation in the Domino server consists of two parts: creating a user account and registering this user with the Domino server. Creation of users in the Domino server can be done using the LDAP Connector in update mode, but this Connector is not able to register the users with the Domino server. User registration can be achieved in two ways: by using the Directory Integrator Domino Users Connector or by using the Identity Manager Agent Connector.
Using the Domino Users Connector requires Tivoli Directory Integrator to be installed on the same system where your Domino server is running. This involves working directly on systems that are already deployed in production. Many companies, including Penguin Financial, would prefer not to do this. So we will
use the Identity Manager Agent Connector in our solution to create and update users on our Domino server. Because of the above mentioned reasons, we will develop and deploy phase one of our solution on a separate Windows-based system.

4.3.2 Architectural decisions for phase 2
This section explains our architectural decisions made for phase two, based on the Penguin Financial requirements and Directory Integrator capabilities. After analyzing the Penguin Financial functional requirements from 4.2, “Functional requirements” on page 93, the following architectural topics related to Directory Integrator capabilities emerged for consideration:
Password policy
Password store
Loop conditions
Password security

Let us discuss these topics and their related architectural decisions.

Password policy
Functional requirement F is not a real issue. It is related to settings we have to apply outside of IBM Tivoli Directory Integrator, namely in Active Directory, to align with the existing password policy in the Penguin Financial environment. As we intend to implement the Password Synchronizer on Active Directory, we have to implement the password complexity part of our password policy anyway for the Password Synchronizer to work. We can take advantage of the minimum password age part of the password policy to solve loop condition issues, as described later in the discussion of the timestamp approach.

Password store
In “Password Stores” on page 177 we explain the difference between the LDAP and MQe password stores in more detail, but with regard to functional requirement G it is important to distinguish between the permanent store that LDAP uses for passwords and the message queue mechanism that MQe uses for temporary password storage. This leads to our first architectural decision, shown in Table 4-7 on page 102: MQe is used as the password store.
Table 4-7 Architectural decision for the password store
Decision: MQe is used as the password store mechanism. Description: MQe is defined as the password store for security reasons.

Note: From an architectural perspective it is important to keep in mind that the FIFO (first-in-first-out) rule applies to entries when using message queuing.

The remaining question to answer is: how many password stores are used? Based on functional requirements D and E, there are two sources of password change, and thus two candidate password stores. We can use a separate password store for each source, or only one password store for both password sources. Considering the possibility of inconsistent password changes if they are handled separately (for reasons such as time synchronization problems, separate AssemblyLines, difficult control and handling, and so on), it is best to use only one common password store and aggregate all password changes in one place. This leads to our second architectural decision, shown in Table 4-8: a common password store is used for reliability reasons.

Table 4-8 Architectural decision for the password store
Decision: One password store exists. Description: A common password store is used for reliability reasons.

Loop conditions
A reason for possible loop conditions in our password synchronization scenario is the combination of functional requirements D and E. In that case we have two password change sources and three possible targets, two of them being sources at the same time. For example, when a user changes a password in Active Directory, password synchronization is triggered and the password is updated in Domino and the enterprise directory. A change in the enterprise directory now triggers a new password synchronization process to update the password in Domino and Active Directory, the initial source, and the loop is closed.
Note: Active Directory is not a password synchronization target for original Penguin Financial users.
There are several approaches to solve this problem:

External password store - When using LDAP as an external password store, you can build solutions that are not just very scalable and replicable but also very flexible, because you can store additional information in the store to compare sources, targets, passwords, timestamps, and so on, and thus break the loop. Based on our password store discussion, this approach is not an option in the situation with Penguin Financial.

Flags - Flags are common in bidirectional password synchronization using MQ. The theory behind using flags is alternating behavior: in one direction an update is allowed and a flag is set to mark the change; the update in the opposite direction is then not allowed, but the flag is reset and the flow ends. The problem with flags can be consistency, because any repeated password change before a flag can be reset is skipped. For example, if a user changes the password twice in a row, the second password change might be skipped if the first one is still in progress and the flag has not been reset yet.

IBM Tivoli Directory Server internal mechanism - IBM Tivoli Directory Server, used in phase 1, checks internally whether a new value for an attribute differs from the old one. This feature can be used for password comparison. In theory, there will be at most one and a half loops before the flow stops. If the source of the password change is Active Directory, the first password change is propagated to Directory Server, then back to Active Directory, and once again to Directory Server. The problem here is similar to the flags if MQ is used: any new password change during the initial update process is ignored and the final state is inconsistent.

Timestamps - Timestamps are very useful for time comparison of events. If there is a policy in place such as a minimum password age, then based on the time difference between two password changes we can distinguish user-initiated from process-initiated password changes.
The minimum password age setting in Windows is defined in days, and the minimum setting is one day. Password processing in IBM Tivoli Directory Integrator takes only moments; thus any password change for the same user within less than the minimum password age can only be process-initiated. This leads to our third architectural decision, shown in Table 4-9 on page 104: a timestamp is defined and used for breaking loop conditions.
Table 4-9 Architectural decision for loop conditions
Decision: A timestamp is defined. Description: The timestamp is used to break loop conditions.

Password security
Functional requirement H has many side effects, because it influences a wide range of areas, from password handling to network architecture and server configuration. First, we have to check which components we need, what the security capabilities of these components are, and whether there are any special requirements for their usage. Second, we have to determine whether we can satisfy at least the minimum requirements. The password synchronization process encompasses two areas, the actual password store and the AssemblyLine that implements the data flow (more details are revealed later in 4.5.4, “Plan the data flows” on page 190):

1. Password store
The functionality of a password store is explained in more detail in “Password Stores” on page 177, but related to our architectural concerns it is important to emphasize that all communication needs to be encrypted.

2. AssemblyLine
The AssemblyLine picks up the password from the password store and sends it to a target for update. We have to investigate three targets for password updates:
– Active Directory: SSL is required to send an updated password to Active Directory. The configuration for our scenario is described in 4.5.6, “Instrument and test a solution” on page 200.
– IBM Tivoli Directory Server: See “Secure Sockets Layer Support” in Chapter 2 of the IBM Tivoli Directory Integrator 6.0: Users Guide, SC32-1718, for details on configuring IBM Tivoli Directory Integrator as an SSL client or server.
– Domino: A Domino HTTP password can be encrypted using Domino's encryption routines. The configuration for our scenario is described in 4.5.6, “Instrument and test a solution” on page 200.
After a short component analysis, our conclusion is that we can satisfy all minimum security policy requirements for all components used, with either SSL or encryption.
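Returning to the timestamp decision from the loop conditions discussion, the comparison logic can be sketched as follows. This is a simplified illustration, not the actual AssemblyLine script; the function and variable names are ours, and we assume the Windows minimum password age of one day described earlier.

```javascript
// Sketch only: decide whether a password-change event is user-initiated
// or an echo of our own synchronization, by comparing timestamps.

const MIN_PASSWORD_AGE_MS = 24 * 60 * 60 * 1000; // Windows minimum: one day

const lastUserChange = new Map(); // username -> time of last accepted change

// Returns true when the event should be propagated to the other
// directories, false when it is (almost certainly) a process-initiated
// echo looping back within the minimum password age window.
function isUserInitiated(username, eventTimeMs) {
  const previous = lastUserChange.get(username);
  if (previous !== undefined && eventTimeMs - previous < MIN_PASSWORD_AGE_MS) {
    return false; // too soon after the last accepted change: suppress it
  }
  lastUserChange.set(username, eventTimeMs);
  return true;
}
```

Because synchronization completes in moments while users are policy-bound to wait at least a day between changes, any second event for the same user inside that window can safely be treated as an echo and dropped, which breaks the loop.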
To make life easier, we can deploy IBM Tivoli Directory Integrator on our new enterprise directory server. In this particular configuration there is no need to configure an SSL communication link with the IBM Tivoli Directory Server running on the same server.

This leads to our final architectural decisions, shown in Table 4-10. IBM Tivoli Directory Integrator is located on our enterprise directory server in order to reduce the solution complexity, SSL is used for password updates to Active Directory, and password updates to Domino are encrypted.

Table 4-10 Architectural decisions for password security
Decision: IBM Tivoli Directory Integrator is located on the enterprise directory server. Description: Complexity is reduced, as there is no need for SSL encryption when communication is local.
Decision: SSL is used for Active Directory updates. Description: SSL is required by Active Directory.
Decision: Encryption is used for Domino updates. Description: Using Domino's encryption routines, there is no need for SSL to satisfy the minimum functional requirements.

The final password synchronization architecture at Penguin Financial, based on our business and functional requirements as well as our architectural decisions, is shown in Figure 4-3 on page 106.
Figure 4-3 Final password synchronization architecture. Password changes enter the password store from the Active Directory Password Synchronizer (Ctrl+Alt+Del password change) and from the enterprise directory Password Synchronizer behind the portal self-service application; IBM Tivoli Directory Integrator picks them up and performs an encrypted password update to the Domino Directory, an SSL password update to Active Directory, and a local password update to the IBM Tivoli Directory Server.

4.4 Phase 1: User integration
The goal of this phase is to create an enterprise directory and to keep the user account information in synchronization across the various data sources.
This phase contains the following sections:
- Detailed data identification
- Planning the data flows
- Instrumenting a solution

4.4.1 Detailed data identification

The authoritative source for user accounts at Monolithic Financial is the Microsoft Windows 2000 server with Active Directory. User accounts at Penguin Financial are located on a Lotus Domino server. IBM Tivoli Directory Server will be used to create a centralized enterprise directory and to allow user modification through a self-care portal application.

Table 4-11 on page 108 depicts all data sources involved in this phase. Each server in the table represents one physical system. The Lotus Domino server and Tivoli Directory Server may reside on any hardware/OS platform on which these products are supported; Active Directory has to run on a Microsoft Windows Server platform. In all three cases the user account information is the data we are interested in.

The access mechanism is the mechanism that Tivoli Directory Integrator uses to access the data in our data sources. There are different ways in which you can access data using Tivoli Directory Integrator. For example, to access data on a Domino server you can use the LDAP protocol (used by the LDAP Connector), HTTP and IIOP (used by the Notes Connector), or JNDI with DAML (used by the Identity Manager Notes Agent Connector).

In a Windows domain context, sAMAccountName is an attribute unique to each user; it is used to check the uniqueness of a user account. NotesFullName is an attribute unique to each user on a Domino server and is used to check the uniqueness of a user there. We will create a new unique attribute called uid, which will be used to maintain the uniqueness of all user accounts in Tivoli Directory Server. The uid is created whenever a new user is added to our enterprise directory from either Windows Active Directory or the Domino server. Further details on uid are discussed in the next section.
Today there are different system administrators responsible for each of these systems. Privileges for adding or updating user accounts are limited to these administrators. Additionally, individual users can update or modify their personal information. For our solution we create an additional user called IDI Admin. It is used by Tivoli Directory Integrator and has the required privileges for adding, deleting, and updating user accounts on all relevant data sources.

Note that the portal application is not shown in this table. The development and deployment of a portal application is out of the scope of this book. For the purpose of updating and viewing user information stored in the enterprise
directory (Tivoli Directory Server), any commonly available LDAP browser may be used.

Table 4-11 Data sources

Domino Server
  Description: Contains the user accounts of Penguin Financial
  System: pf-usmai01 domain
  Data: User account information
  Unique data: NotesFullName - this attribute is unique for all users in the domain
  Data storage: Domino Directory
  Access mechanisms: LDAP, Identity Manager Agent, Notes client

Windows 2000 Server with Active Directory
  Description: Contains the user accounts of Monolithic Financial
  System: mf-root1 domain
  Data: User account information
  Unique data: sAMAccountName - this attribute is unique for all users in the domain
  Data storage: Microsoft Active Directory
  Access mechanisms: LDAP

Tivoli Directory Server
  Description: Contains aggregated user information from Active Directory and the Domino server
  System: pf-used01
  Data: User account information
  Unique data: uid - this attribute is unique for each user and is created whenever a new user is added to Tivoli Directory Server
  Data storage: DB2
  Access mechanisms: LDAP
In Table 4-12 we list all the attributes that will be updated and synchronized in our solution. Note that this list is a small subset of the available attributes that can be mapped using Tivoli Directory Integrator.

Table 4-12 Attributes used in our solution (attribute names as used by the Active Directory Connector / Tivoli Directory Server Connector / Identity Manager Notes Agent Connector; "-" means the attribute is not used by that connector in our solution)

  Distinguished name: dn / $dn / $dn
  Common name: cn / cn / -
  First name: givenName / givenName / erNotesFirstName
  Surname or last name: sn / sn / erNotesLastName
  E-mail address: mail / mail / erNotesInternetAddress
  Title: title / title / erNotesTitle
  Phone number: telephoneNumber / telephoneNumber / erNotesPhoneNumber
  Street: streetAddress / street / erNotesStreet
  State: st / st / erNotesState
  Postal code: postalCode / postalCode / erNotesZip
  Object class: - / objectclass / -
  Unique Tivoli Directory Server attribute: - / uid (computed by Directory Integrator) / -
  Unique Domino Server attribute: - / pfNotesFullName / erNotesFullName (generated by Domino)
  Unique Active Directory attribute: sAMAccountName / pfsAMAccountName / -
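The mapping in Table 4-12 can be captured in a simple lookup structure. The following sketch is illustrative only; the attribute names follow the table, but the dictionary layout and helper function are our own invention, not part of Tivoli Directory Integrator:

```python
# Attribute-name map derived from Table 4-12 (subset).
# Key: logical attribute; values: name used by each connector
# (None means the attribute is not used by that connector).
ATTRIBUTE_MAP = {
    "first_name":  {"ad": "givenName",       "tds": "givenName",        "notes": "erNotesFirstName"},
    "last_name":   {"ad": "sn",              "tds": "sn",               "notes": "erNotesLastName"},
    "email":       {"ad": "mail",            "tds": "mail",             "notes": "erNotesInternetAddress"},
    "phone":       {"ad": "telephoneNumber", "tds": "telephoneNumber",  "notes": "erNotesPhoneNumber"},
    "street":      {"ad": "streetAddress",   "tds": "street",           "notes": "erNotesStreet"},
    "unique_ad":   {"ad": "sAMAccountName",  "tds": "pfsAMAccountName", "notes": None},
    "unique_dom":  {"ad": None,              "tds": "pfNotesFullName",  "notes": "erNotesFullName"},
}

def rename_for_connector(entry, connector):
    """Translate a logical entry into the attribute names one connector
    expects, dropping attributes that connector does not use."""
    out = {}
    for logical, value in entry.items():
        name = ATTRIBUTE_MAP[logical][connector]
        if name is not None:
            out[name] = value
    return out
```

Keeping such a map in one place makes it easy to see at a glance which connector carries which attribute, and which attributes are dropped for a given target.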
At this point you only need to know what attributes you want to synchronize. The attribute names used by the various connectors can be updated once you have the connectors up and running.

Table 4-13 lists attributes that are specific to the Identity Manager Notes Agent Connector. These attributes are used when registering a new user with the Domino server. Refer to the Identity Manager Notes Agent Connector documentation for more information about these and additional attributes.

Table 4-13 Attributes used by the Identity Manager Notes Agent Connector

  Domino domain name: erNotesMailDomain
  Domino server name: erNotesMailServer
  Domino server certifier ID (including path): erNotesAddCertPath
  Certifier password: erNotesPasswdAddCert
  Domino mail file system: erNotesMailSystem
  Mail template name (including path): erNotesMailTemplateName
  Mail file name (including path): erNotesMailFile
  Mail file owner access: erNotesMailFileOwnerAccess
  Name of ID file (including path): erNotesUserIDfileName
  Mail quota size: erNotesMailQuotaSize
  Initial password for the user: erPassword
  Notes short name: erNotesShortName

Now that we have identified the data attributes to be used, we look into what goes inside these attributes. As described in 2.2.4, “Initial data format” on page 21, an attribute value may be null, blank, out-of-range, or valid. It is necessary for us to define what actions need to be taken for each of these four cases. For example, the value for the telephoneNumber attribute in Tivoli Directory Server is optional and may be null. So if we are adding a user from Active Directory to Tivoli Directory Server, having a null or blank telephoneNumber does not cause any problem.
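To make the four value states concrete, here is a small illustrative check. The attribute sets and function names are assumptions made for this sketch; they are not part of the product:

```python
# Hypothetical pre-add validation: a missing or empty required attribute
# is logged and the add is skipped, while optional attributes (such as
# telephoneNumber) simply pass through even when null or blank.
REQUIRED = {"objectclass", "uid"}   # assumed required set for this sketch

def classify(value):
    """Return "null", "blank", or "valid" for a single attribute value."""
    if value is None:
        return "null"
    if isinstance(value, str) and value.strip() == "":
        return "blank"
    return "valid"

def check_entry(entry, log):
    """Return True if the add may proceed; log and skip otherwise."""
    for attr in REQUIRED:
        state = classify(entry.get(attr))
        if state != "valid":
            log.append("skipping add: required attribute %s is %s" % (attr, state))
            return False
    return True
```

Validating before the add, rather than letting the directory reject the entry, keeps the error messages under our control.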
But in the same scenario, if the value for the objectclass attribute is null, we might get an add error because of a schema violation exception from Tivoli Directory Server. In this particular example we can handle the situation by logging an error in Tivoli Directory Integrator and skipping the current add operation, instead of relying on Tivoli Directory Server to throw a simple error. This creates a more robust solution and makes it easier for the developer to debug the individual modules.

Some attributes, like objectclass or NotesFullName, are multi-valued. Care should be taken when using these attributes for, say, uniqueness, or when using them to establish a link criteria between different data sources.

4.4.2 Data flows

The next step in phase 1 is to plan the two data flow scenarios between the involved data sources. Figure 4-4 depicts the data flow between Microsoft Active Directory and Tivoli Directory Server: an initial load of users from Active Directory into Directory Server, ongoing synchronization of user attributes, and a link criteria matching the unique attribute sAMAccountName in Active Directory against pfsAMAccountName in Directory Server (where uid is the unique attribute).

Figure 4-4 Data flow between Active Directory and Directory Server

Figure 4-5 on page 112 depicts the data flow between the Domino server and Tivoli Directory Server.
Figure 4-5 shows the corresponding flow for Domino: an initial load of users in both directions between the Domino server and Tivoli Directory Server, ongoing synchronization of user attributes, and a link criteria matching the unique attribute NotesFullName in Domino against pfNotesFullName in Directory Server (where uid is the unique attribute).

Figure 4-5 Data flow between Domino Server and Directory Server

As a first step we need to identify the authoritative data source for the various data attributes listed in Table 4-12 on page 109.

Authoritative attributes

Microsoft Active Directory is the authoritative data source for the FirstName and LastName attributes of users created in Microsoft Active Directory. Lotus Domino Server is the authoritative data source for the FirstName, LastName, and e-mail address attributes of users created on the Lotus Domino server. Lotus Domino Server is also the authoritative data source for the e-mail address attribute of users migrated from Microsoft Active Directory. Tivoli Directory Server is the authoritative data source for Title, Phone number, Street, State, and Postal code.

Unique link criteria

Let us take a closer look at the link criteria between these data sources.

Between Microsoft Active Directory and Tivoli Directory Server

The bi-directional arrow with a cross pattern in Figure 4-4 on page 111 shows the unique link criteria between Microsoft Active Directory and Tivoli Directory Server. The attribute sAMAccountName, which is unique in Microsoft Active Directory, is used for establishing the link criteria. This attribute is mapped to a custom attribute called pfsAMAccountName created in the Tivoli Directory Server.
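Conceptually, such a link criteria acts as a guarded lookup: an incoming entry is matched against the target directory on the link attribute, and an add only proceeds when no match is found. A minimal sketch, where an in-memory list stands in for Tivoli Directory Server:

```python
# Sketch of link-criteria behavior during a load: an add succeeds only
# if no existing entry matches the link attribute (for example,
# pfsAMAccountName for Active Directory users).
def add_with_link_criteria(directory, link_attr, entry):
    """Return True and append the entry if no match exists; False otherwise."""
    value = entry[link_attr]
    for existing in directory:
        if existing.get(link_attr) == value:
            return False          # link criteria matched: add fails
    directory.append(entry)
    return True
```

The same helper, called with pfNotesFullName as the link attribute, models the Domino link criteria in the same way.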
Between Lotus Domino Server and Tivoli Directory Server

The bi-directional arrow with a cross pattern in Figure 4-5 on page 112 shows the unique link criteria between Lotus Domino Server and Tivoli Directory Server. The attribute FullName, which is unique to each user in Domino, is used for establishing the link criteria. This attribute is mapped to a custom attribute called pfNotesFullName created in the Tivoli Directory Server.

Note: The Notes field FullName in Domino is a multi-valued attribute, and we have to take this into consideration when using this attribute for mapping. The FullName field value changes if a user is renamed in Domino. If you want to account for this type of change, use the Universal ID (Notes field: UnID) of Domino documents instead. A UnID is associated with every object in Domino and does not change even if the object is modified, for example when a user is renamed. The UnID is also required if your application wants to keep track of document deletions.

Special conditions

The attribute uid is a unique attribute in the Tivoli Directory Server. This attribute is computed and created for each user on a successful user add operation to Tivoli Directory Server, either from Active Directory or from the Domino server. The value of uid created for users from the Domino server is prefixed with the letter A, and the value of uid created for users from Active Directory is prefixed with the letter B. There is no special meaning attached to these prefixes; this approach has been used to keep the implementation simple.
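The prefix convention can be sketched as follows. The function names and the in-memory entry layout are invented for this illustration; it simply shows how a uid might be built from a running counter, and how pfNotesFullName is seeded with the uid for Active Directory users, as the text describes:

```python
# Illustrative uid generation: "A" for users originating in Domino,
# "B" for users originating in Active Directory, followed by a counter.
PREFIX = {"domino": "A", "active_directory": "B"}

def next_uid(source, count):
    """Build the uid for a new entry; return it with the incremented counter."""
    uid = PREFIX[source] + str(count)
    return uid, count + 1

def seed_entry(source, count, attrs):
    """Create the Directory Server entry for a new user. For users coming
    from Active Directory, pfNotesFullName is initially populated with the
    uid, until Domino later generates the real FullName."""
    uid, count = next_uid(source, count)
    entry = dict(attrs, uid=uid)
    if source == "active_directory":
        entry["pfNotesFullName"] = uid
    return entry, count
```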
You can use any other means of generating a unique ID.

To establish the link between the uid and the Notes FullName for users added from Active Directory to Directory Server, we initially populate the pfNotesFullName attribute with the value of uid. When such a user, originally created in Active Directory, is added to the Domino server, the Domino server creates a Notes FullName, and this uid is added as another value to the multi-valued Notes field FullName in Domino. During the synchronization from the Domino server to Directory Server, this uid is replaced by the Notes FullName generated by Domino during the earlier add operation.

We now describe the various phases for the implementation of these data flows.
Initial data cleanup and load phase

Data cleanup and initial population of users are one-time operations, so these steps need to be executed in sequence and only once during the initial user data migration.

1. As a first step we have to ensure that the user account repositories on both the Microsoft Active Directory server and the Domino server contain the user accounts that we need to provision. For example, accounts in a disabled state in Active Directory are not loaded during this operation. Also ensure that the schema in Tivoli Directory Server has been updated as required. For example, we have to create the suffixes for the Penguin Financial domain and the custom attributes pfsAMAccountName and pfNotesFullName.

2. All the users in Microsoft Active Directory need to be added to Tivoli Directory Server. The right-pointing arrow in Figure 4-4 on page 111 shows this step. When a user is to be added, the link criteria checks for the existence of a pfsAMAccountName in Tivoli Directory Server with a value matching the sAMAccountName in Microsoft Active Directory. If a matching attribute is found, the add operation fails. If no matching attribute value is found, the add operation succeeds. Additionally, the value of uid is copied to the pfNotesFullName attribute.

3. Users who have been added to Directory Server from Active Directory need to be added to the Domino server. The left-pointing arrow in Figure 4-5 on page 112 shows this step. To ensure that we are only adding users coming from Microsoft Active Directory, we make sure that each of these users has the pfsAMAccountName attribute set. Before these users are successfully added to Domino, their pfNotesFullName attribute contains the value of the uid attribute. Because these users do not exist on the Domino server at this time, the Notes FullName field does not yet contain any value. We use this value for uniqueness while adding these users to the Domino server.

4.
Users from the Domino server need to be added to Tivoli Directory Server. The right-pointing arrow in Figure 4-5 on page 112 shows this step. These users already have a Notes FullName attribute associated with them. When a user is to be added, the link criteria checks for the existence of a pfNotesFullName in Tivoli Directory Server with a value matching the NotesFullName on the Domino server. If a matching attribute is found, the add operation fails. If no matching attribute value is found, the add operation succeeds. Additionally, the pfNotesFullName attributes of the original Active Directory users who were added to the Domino server in the previous step are updated.

Synchronization of data phase

1. The user account information in Active Directory needs to be kept in synchronization with Directory Server. This also includes the addition of users
to Active Directory in the future. The bi-directional arrow in Figure 4-4 on page 111 shows this step. If the synchronization operation is an update of a particular user, only the attributes for which Active Directory is the authoritative data source are updated.

2. The user account information on the Domino server needs to be kept in synchronization with Directory Server. This also includes the addition of users to the Domino server in the future. The bi-directional arrow in Figure 4-5 on page 112 shows this step. If the synchronization operation is an update of a particular user, only the attributes for which the Domino server is the authoritative data source are updated.

3. Any updates to user account information from Directory Server (through an external portal application or by any other means) need to be synchronized to both Active Directory and the Domino server. The plain bi-directional arrows in Figure 4-4 on page 111 and Figure 4-5 on page 112 show this step. Only those attributes for which Directory Server is the authoritative data source are updated.

Frequency

The initial data cleanup and load phase needs to be executed only once. Subsequent synchronization of data is performed by monitoring the data sources continuously for any changes.

4.4.3 Instrument and test a solution

Now that we have completed the detailed data identification and planned the data flows, we look into the deeper technical aspects of our solution implementation.

Required resources and setup

For the purposes of demonstrating this solution we use the following setup. Please refer to Table 4-11 on page 108 for the data sources involved.

- Windows 2000 Server with Active Directory: Microsoft Windows 2000 Server with Service Pack 4 and Active Directory installed and configured.
- IBM Tivoli Directory Server: SuSE Linux Enterprise Server 8 with IBM Tivoli Directory Server 5.2 installed and configured.
- IBM Lotus Domino Server: Windows 2000 Server with Service Pack 4 and Lotus Domino Server 6 installed and configured.

For development and deployment of the solution using Tivoli Directory Integrator we use a system with Windows 2000 Professional with Service Pack 4, Tivoli
Directory Integrator 6, Lotus Notes Client 6, and Tivoli Identity Manager Notes Agent 6 installed and configured.

The data sources may reside on any platform that the respective product supports. For example, the Domino server can reside on a Windows or UNIX platform, and Tivoli Directory Integrator supports various implementations of UNIX.

In our scenario we use the Domino change detection connector and the Identity Manager Notes Agent for updates to and from the Domino server. The Domino change detection connector requires the Lotus Notes Client to reside on the same system as Tivoli Directory Integrator. Refer to Appendix A, “Tricky connections” on page 415 for more information about the available options for connectivity to a Domino server. Please refer to the individual product documentation if you have questions about installing or configuring these products.

Note: The IDI Admin user (or any other user used by Tivoli Directory Integrator) must have the required privileges for adding or updating user accounts on the Domino server. This user needs to log on to the Lotus Notes Client using the ID file at least once after the system has been started (or restarted).

The default schema on Tivoli Directory Server has to be modified to add new suffixes and attributes:
- Add the suffix dc=penguin-fin,dc=com.
- Add the object class pfPerson, derived from inetOrgPerson.
- Add the attributes pfsAMAccountName and pfNotesFullName of string type, used by pfPerson.

Edit the configuration and external properties file

1. Start IBM Tivoli Directory Integrator by selecting it from the Start menu or by executing ibmditk.bat from the Tivoli Directory Integrator install directory.

2. To create a new configuration file, click File → New... as shown in Figure 4-6 on page 117. Optionally provide a password and click OK.

Note: Providing a password protects the configuration file and does not allow you to open the configuration file using other XML editors.
So it is a good idea to do this once the solution is ready for deployment.
Figure 4-6 Creating a configuration file

3. In the left pane of the layout window expand ExternalProperties and click Default.

4. In the right pane enter a valid name for an External Properties File as shown in Figure 4-7 on page 118.
Figure 4-7 External Properties File configuration

Optionally, you can encrypt the properties file by checking Encrypt External Properties and providing a cipher and password. Leaving the cipher empty encrypts the file using the default cipher. It is a good idea to encrypt the properties file before or immediately after deploying the solution.

5. Click the Editor tab in the right pane and enter the property variables as shown in Figure 4-8 on page 119.

Note: The actual values you use for the properties depend upon your environment, such as system names, user IDs/passwords, LDAP schema, and so on.
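As an illustration, such an external properties file might contain entries like the following. All values are invented for this sketch; substitute your own host names, bind DNs, and passwords:

```properties
# Illustrative external properties (placeholder values)
ADURL=ldap://mf-root1:389
ADLoginName=CN=IDI Admin,CN=Users,DC=monolithic-fin,DC=com
ADPassword=secret
ADSearchBase=DC=monolithic-fin,DC=com
ADSearchFilter=objectClass=user
LDAPUrl=ldap://pf-used01:389
LDAPLoginName=cn=root
LDAPPassword=secret
LDAPSearchBase=dc=penguin-fin,dc=com
LDAPStoreBase=dc=penguin-fin,dc=com
LDAPObjectClass=pfPerson
Count=100
```

Each property is described in Table 4-14.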
Figure 4-8 External Properties

A description of each of the values in the external properties file is provided in Table 4-14.

Table 4-14 External properties

ADLoginName: The login name Directory Integrator uses to bind to Active Directory. The ID must have sufficient permission to create user accounts in Active Directory.

ADPassword: The password for the Active Directory login name.

ADSearchBase: The subtree in Active Directory from which Directory Integrator is to propagate changes. Only changes to users in this subtree are propagated to Directory Server. Typically, this should be set to the top of the Active Directory tree, so that all users in Active Directory groups are found and copied to Directory Server.
ADSearchFilter: The LDAP search filter that is used to select Active Directory user objects for synchronization with Directory Server. Unless the Active Directory schema has been modified, this typically is objectClass=user.

ADURL: The LDAP URL and port for the Active Directory domain controller. The default non-SSL port number for LDAP directories is 389.

ITIMCertificate: The CA certificate file for access to Identity Manager.

ITIMPassword: The Identity Manager password.

ITIMUserName: The Identity Manager user ID.

LDAPLoginName: The login user ID that Directory Integrator uses to bind to Directory Server. This ID must have been given sufficient access permissions by the Directory Server administrator to create and modify user entries.

LDAPObjectClass: The structural LDAP object class used to create new user entries in Directory Server. This may be a custom object class that extends the default schema. It must be a structural object class, not an auxiliary or abstract class. This class must exist in the Directory Server schema or the AssemblyLines in this configuration will not be able to create user entries.

LDAPPassword: The password for the Directory Server login ID.

LDAPSearchBase: The subtree in Directory Server to search to check whether an Active Directory user has an existing entry in Directory Server.

LDAPStoreBase: The suffix under which users are added in Directory Server. Used for creating a unique ID when users are added to Directory Server.

LDAPUrl: The URL that Directory Integrator uses to connect to Directory Server.

Count: A number used for creating unique IDs when users are added to Directory Server.

Establish connectivity to data sources

Next we establish connectivity to the various data sources. We need multiple types of Connectors, using different Connector modes, for each data source.
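The Active Directory and Directory Server pairings below come from this section; we have listed them as a planning summary. The Lookup mode for the Directory Server lookup connection is our reading of the connection list, and the Domino connections are omitted here because their connector/mode pairings are covered later:

```python
# Planning summary (not product configuration): connector type and
# mode per data-source connection described in this section.
CONNECTIONS = {
    "Read Active Directory":    ("LDAP Connector", "Iterator"),
    "Update Active Directory":  ("LDAP Connector", "Update"),
    "Active Directory changes": ("Active Directory Changelog Connector", "Iterator"),
    "Read Directory Server":    ("LDAP Connector", "Iterator"),
    "Lookup Directory Server":  ("LDAP Connector", "Lookup"),   # assumed mode
    "Update Directory Server":  ("LDAP Connector", "Update"),
}

def modes_for(connector_type):
    """List the modes in which a given connector type is used."""
    return sorted(mode for ctype, mode in CONNECTIONS.values()
                  if ctype == connector_type)
```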
For example, reading entries from Active Directory requires an LDAP Connector in
Iterator mode, updating entries in Active Directory requires an LDAP Connector in Update mode, and synchronization between Active Directory and Directory Server requires an Active Directory ChangeLog Connector in Iterator mode. Here is an overview of the different data source connections:

- Read Active Directory
- Update Active Directory
- Active Directory changes
- Read Directory Server
- Lookup Directory Server
- Update Directory Server
- Directory Server changes
- Read Domino Server
- Update Domino Server
- Domino Server changes

Read Active Directory

This Connector is used for reading user entries from Microsoft Active Directory. It is an LDAP Connector running in Iterator mode.

1. In the left pane of the layout window right-click Connectors and select New Connector...

2. Select the type of Connector you are going to add. In the Select Connector window select the name ibmdi.LDAP. Enter ReadADCon in the name field and select Iterator mode as shown in Figure 4-9 on page 122. Click OK.
Figure 4-9 Select connector

3. A new Connector is added under Connectors in the left pane. The right pane displays the IBM Tivoli Directory Integrator LDAP Connector in the Connection subtab of the Config... tab, as shown in Figure 4-10 on page 123. In the Connector configuration in the right pane there is an Inherit from: button in the top right-hand corner. This button shows the Connector template used for creating this particular Connector.
Figure 4-10 New connector

4. Let us configure this Connector. In the right pane click the LDAP URL label on the left side of the first edit box; some of the labels, displayed in blue, act like hyperlinks and open another configuration pop-up. The Parameter Information window, shown in Figure 4-11 on page 124, is displayed. In the External Property drop-down list select ADURL and click OK. The previously defined value for the ADURL property in the external properties file, shown in Table 4-14 on page 119, is displayed in the edit box.
Figure 4-11 Connector parameter information

5. Repeat the above step for the Login username, Login password, Search Base, and Search Filter properties by selecting ADLoginName, ADPassword, ADSearchBase, and ADSearchFilter, respectively, in the External Property list box. Your Connector window will look similar to Figure 4-12 on page 125.
Figure 4-12 Connector details

6. Next we discover the available attributes in the Active Directory data source. Click the Input Map tab in the right pane. This tab contains a row of iconic buttons we use to test the connection to the data source and discover the Connector schema. If you hover over the buttons with the mouse pointer, a brief description of each button is displayed. Click the plug button to connect to the data source. A Connection established message should be displayed next to the row of buttons. If you are not able to connect, verify the Connector configuration information.

7. Once the connection is established, click the torch button to discover the schema of the data source. A list of available attributes in the data source is displayed. Scroll through the list to look at what attributes are available in the
schema. Click the right-pointing triangle button to read the next entry. The Connector reads the next entry from the data source, as shown in Figure 4-13.

Figure 4-13 Active Directory connector schema

8. At this point we have successfully established a connection to the data source. We have the option to map required attributes from the schema in the Connector itself or to map them from within the AssemblyLine later. A good idea is to map those attributes here which you think will be used by all the AssemblyLines that use this Connector. In our case this Connector will be used by only one AssemblyLine, so we go ahead and map the attributes.
9. Select the attributes that you want to map and drag them into the Work Attribute window pane. Your window now looks similar to Figure 4-14.

Figure 4-14 Connector schema attribute mapping

We have now completed the Connector configuration, connection to the data source, discovery of the schema, and mapping of attributes for this Connector. We need to repeat the above steps for the remaining Connectors.

Update Active Directory

This Connector is used to update user entries in Microsoft Active Directory. It is an LDAP Connector running in Update mode.
1. In the left pane of the layout window right-click Connectors and select New Connector...

2. Select the type of Connector you are going to add. In the Select Connector window select the name ibmdi.LDAP. Enter UpdateADCon in the name field, select Update mode, and click OK. A new Connector is added under Connectors in the left pane. The right pane displays the IBM Tivoli Directory Integrator LDAP Connector in the Connection subtab of the Config... tab.

3. Let us configure this Connector. In the right pane click the LDAP URL label on the left side of the first edit box. The Parameter Information window is displayed. In the External Property drop-down list select ADURL and click OK. The previously defined value for the ADURL property in the external properties file is displayed in the edit box.

4. Repeat the above step for the Login username, Login password, Search Base, and Search Filter properties by selecting ADLoginName, ADPassword, ADSearchBase, and ADSearchFilter, respectively, in the External Property list box. Your Connector window will look similar to Figure 4-15 on page 128.

Figure 4-15 Update Active Directory connector
5. Click the Input Map tab in the right pane. Connect to the data source, discover the schema, and read the next entry from the data source. Select the attributes that you want to map and drag them into the Work Attribute window pane. Your window now looks similar to Figure 4-16 on page 129.

Figure 4-16 Update Active Directory connector schema attribute mapping

6. Because this Connector operates in Update mode, the Link Criteria tab is enabled and needs to be defined. It specifies the condition under which updates to Active Directory are carried out. This tab has another row of iconic buttons. Click the link button with a white star to add a new link criteria. A Link Criteria window lets you specify your values. From the Connector Attribute drop-down list select sAMAccountName, select equals as the Operator value, and enter $pfsAMAccountName as the Value, as shown in Figure 4-17 on page 130. Click OK.
Figure 4-17 Link Criteria

7. You now have the link criteria defined as shown in Figure 4-18.

Figure 4-18 Link Criteria for the Update Active Directory connector
You have now completed the Connector configuration for the Update Active Directory Connector.

Active Directory changes

This Connector monitors Microsoft Active Directory for any changes. It is an Active Directory Changelog Connector that runs in Iterator mode.

1. In the left pane of the layout window right-click Connectors and select New Connector...

2. Select the type of Connector you are going to add. In the Select Connector window select the name ibmdi.ADChangeLogv2. Enter ADCLogCon in the name field, select Iterator mode (this is the only mode available for this Changelog Connector), and click OK. A new Connector is added under Connectors in the left pane. The right pane displays the Active Directory Changelog Connector v2 in the Connection subtab of the Config... tab.

3. Let us configure this Connector. In the right pane click the LDAP URL label on the left of the first edit box. The Parameter Information window is displayed. In the External Property drop-down list select ADURL and click OK. The previously defined value for the ADURL property in the external properties file, shown in Table 4-14 on page 119, is displayed in the edit box. Repeat the above step for the Login username, Login password, and LDAP Search Base properties by selecting ADLoginName, ADPassword, and ADSearchBase, respectively, in the External Property list box.

4. Enter ADChanges as the name for the Iterator State Store. This property stores the change number that keeps track of the starting point for the change detection Connector. Its value is persistent, so if the AssemblyLine is down for a period of time and then comes up again, changes from the last stored change number are read and processed. The delete button next to this field deletes the entry stored in the Iterator State Store. This property, along with the Start at property, gives you good control over the point from which changes in Active Directory are read.
Your Connector window looks similar to Figure 4-19 on page 132.

5. Select the Use Notifications check box if you want the connector to be notified as changes happen in the data source. If this check box is selected, the Connector is blocked until a new change has occurred.

   Note: You can achieve similar functionality by setting a Timeout value of 0 and specifying a Sleep Interval. This polls the data source at the periodic interval specified by the Sleep Interval value. Polling the data source periodically might not be acceptable in many environments, so utilizing the Use Notifications property should be your preferred method to begin with.

Chapter 4. Penguin Financial Incorporated 131
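The interplay between the Iterator State Store and the polling settings can be pictured as a loop that persists the last processed change number, so a restarted AssemblyLine resumes where the previous run stopped. A simplified Python sketch of that behavior, assuming a file-based state store and a hypothetical change feed keyed by an update sequence number (TDI actually keeps this state in its system store and queries the Active Directory changelog):

```python
import json
import os

STATE_FILE = "ADChanges.state"  # stands in for the Iterator State Store entry

def load_last_change():
    """Read the persisted change number; 0 means no state stored yet."""
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return json.load(f)["last_change"]
    return 0

def save_last_change(number):
    with open(STATE_FILE, "w") as f:
        json.dump({"last_change": number}, f)

def process_changes(fetch_changes_since):
    """Process every change newer than the persisted change number.
    fetch_changes_since is a stand-in for the changelog query."""
    last = load_last_change()
    for change in fetch_changes_since(last):
        # ... attribute mapping and target updates would happen here ...
        last = change["usn"]
        save_last_change(last)  # persist after each change is handled
    return last

# Hypothetical feed of two changes; a second run finds nothing new.
feed = [{"usn": 101, "dn": "cn=jdoe"}, {"usn": 102, "dn": "cn=asmith"}]
process_changes(lambda since: [c for c in feed if c["usn"] > since])
```

Persisting after each change is what makes an outage harmless: on restart the loop picks up at the stored number rather than re-reading the whole changelog.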
Figure 4-19 Active Directory Changelog Connector

6. Click the Input Map tab in the right pane. Connect to the data source, discover the schema, and read the next entry from the data source. Select the attributes you want to update in Active Directory and drag them into the Work Attribute window pane. Your window now looks similar to Figure 4-20 on page 133.
Note: Sometimes not all the attributes you want are listed in the schema, because not all entries have all attributes filled in. Click the right-pointed triangle button (Read the next entry) multiple times to read a few entries until the attributes you want are listed in the schema.

Figure 4-20 Active Directory changelog connector schema attribute mapping

We have now completed the Connector configuration for the Active Directory Changelog Connector.
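The note above works because the schema shown in the Input Map is discovered from the entries actually read, so sampling several entries unions their attribute names. A small Python illustration of that idea (the sample entries are hypothetical):

```python
def discovered_schema(entries_read):
    """Union of attribute names across the entries read so far: the
    more entries you read, the more complete the discovered schema."""
    schema = set()
    for entry in entries_read:
        schema.update(entry.keys())
    return schema

# The first entry lacks telephoneNumber; reading a second entry reveals it.
entries = [
    {"cn": "John Doe", "mail": "jdoe@example.com"},
    {"cn": "Ann Smith", "mail": "asmith@example.com", "telephoneNumber": "555-0100"},
]
print(sorted(discovered_schema(entries[:1])))  # after one read
print(sorted(discovered_schema(entries)))      # after two reads
```

This is why clicking Read the next entry a few times can surface attributes that the first entry happened to leave empty.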
Note: Sometimes you may get an error while trying to establish a connection to the data source when using the Changelog Connectors. The error may be similar to:

   IO Exception opening socket to server localhost on port 1527. The DB2 Server may be down.

This is likely due to an initialization problem with the Cloudscape database used by Tivoli Directory Integrator. One way to solve this problem is to run any AssemblyLine; it can be the same AssemblyLine that uses this connector or any other AssemblyLine. Once the AssemblyLine has started, go back and try to connect with the Changelog Connector again.

Read Directory Server

This Connector reads user information from Tivoli Directory Server. It is an LDAP Connector in Iterator mode. The configuration of this connector is similar to that of the Read Active Directory Connector, as both use the LDAP Connector in Iterator mode. The only differences are the values supplied in the Connector configuration window and the attributes to be mapped.

1. Add a new connector using the ibmdi.LDAP Connector template, and name this Connector ReadTDSCon. Select the mode Iterator.

2. In the Connector configuration window fill in the connection information: supply the values for the LDAP URL, Login username, Login password, and LDAP Search Base properties by selecting LDAPUrl, LDAPLoginName, LDAPPassword, and LDAPSearchBase in the External Property list box respectively. Figure 4-21 on page 135 shows the configuration for this Connector.
Figure 4-21 Connector configuration for reading entries from Directory Server

3. In the Input Map window, connect to the data source, discover the schema, and drag the required attributes into the Work Attribute window. Figure 4-22 on page 136 shows the Connector schema attribute mapping.
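Both LDAP connectors resolve their connection parameters from the external properties file rather than hard-coding them, so one Config can move between environments by swapping a single file. A rough Python sketch of that substitution; the property names (LDAPUrl, LDAPLoginName, and so on) follow the scenario, but the key=value file format and sample values are illustrative, not TDI's actual stored format:

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

def resolve_connector_config(param_to_property, props):
    """Replace each connector parameter with its external property value."""
    return {param: props[prop] for param, prop in param_to_property.items()}

sample = """
# external properties for the Read Directory Server connector
LDAPUrl=ldap://tds.example.com:389
LDAPLoginName=cn=root
LDAPPassword=secret
LDAPSearchBase=ou=people,o=penguin
"""

config = resolve_connector_config(
    {"LDAP URL": "LDAPUrl",
     "Login username": "LDAPLoginName",
     "Login password": "LDAPPassword",
     "LDAP Search Base": "LDAPSearchBase"},
    parse_properties(sample),
)
print(config["LDAP URL"])
```

Selecting a property name in the External Property list box in the GUI accomplishes the same binding: the connector parameter reads its value from the properties file at run time.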