Informatica Software Architecture Illustrated

The Informatica ETL product, known as Informatica PowerCenter, consists of three main components.

1. Informatica PowerCenter Client Tools:
These are the development tools installed on the developer's machine. They enable a developer to:
* Define the transformation process, known as a mapping (Designer)
* Define run-time properties for a mapping, known as sessions (Workflow Manager)
* Monitor the execution of sessions (Workflow Monitor)
* Manage the repository, useful for administrators (Repository Manager)
* Report metadata (Metadata Reporter)

2. Informatica PowerCenter Repository:
The repository is the heart of the Informatica tools. It is a kind of data inventory where all the data related to mappings, sources, targets, etc. is kept. This is where all the metadata for your application is stored. All the client tools and the Informatica Server fetch data from the repository. An Informatica client and server without a repository is like a PC without memory or a hard disk: it has the ability to process data but has no data to process. The repository can be treated as the back end of Informatica.

3. Informatica PowerCenter Server:
The server is where all execution takes place. It makes physical connections to sources and targets, fetches data, applies the transformations defined in the mapping, and loads the data into the target system.

This architecture is illustrated in the diagram below:
[Diagram: sources and targets supported by Informatica PowerCenter]

Sources:
* Standard: RDBMS, flat files, XML, ODBC
* Applications: SAP R/3, SAP BW, PeopleSoft, Siebel, JD Edwards, i2
* EAI: MQ Series, Tibco, JMS, Web Services
* Legacy: Mainframes (DB2, VSAM, IMS, IDMS, Adabas), AS400 (DB2, flat file)
* Remote sources

Targets:
* Standard: RDBMS, flat files, XML, ODBC
* Applications: SAP R/3, SAP BW, PeopleSoft, Siebel, JD Edwards, i2
* EAI: MQ Series, Tibco, JMS, Web Services
* Legacy: Mainframes (DB2), AS400 (DB2)
* Remote targets

This is sufficient knowledge to start with Informatica, so let's go straight to development in Informatica.

Informatica Product Line

Informatica is a powerful ETL tool from Informatica Corporation, a leading provider of enterprise data integration and ETL software.

The important products provided by Informatica Corporation are listed below:
* PowerCenter
* PowerMart
* PowerExchange
* PowerCenter Connect
* PowerChannel
* Metadata Exchange
* PowerAnalyzer
* SuperGlue

PowerCenter and PowerMart: PowerMart is a departmental version of Informatica for building, deploying, and managing data warehouses and data marts. PowerCenter is used for corporate enterprise data warehouses, and PowerMart is used for departmental data warehouses such as data marts. PowerCenter supports global repositories and networked repositories and can be connected to several sources. PowerMart supports a single repository and can be connected to fewer sources than PowerCenter. PowerMart can grow into an enterprise implementation, and its codeless environment aids developer productivity.

PowerExchange: Used as a stand-alone service or along with PowerCenter, Informatica PowerExchange helps organizations leverage data by avoiding manual coding of data extraction programs. PowerExchange supports batch, real-time, and changed data capture options for mainframe (DB2, VSAM, IMS, etc.), midrange (AS400 DB2, etc.), and relational databases (Oracle, SQL Server, DB2, etc.), and for flat files on Unix, Linux, and Windows systems.

PowerCenter Connect: This is an add-on to Informatica PowerCenter. It helps to extract data and metadata from ERP systems such as IBM's MQSeries, PeopleSoft, SAP, Siebel, etc. and from other third-party applications.

PowerChannel: This helps to transfer large amounts of encrypted and compressed data over LAN and WAN, through firewalls, to transfer files over FTP, etc.

Metadata Exchange: When used with PowerCenter, Metadata Exchange enables organizations to take advantage of the time and effort already invested in defining data structures within their IT environment. For example, an organization may be using data modeling tools such as Erwin, Embarcadero, Oracle Designer, Sybase PowerDesigner, etc. to develop data models. Functional and technical teams will have spent much time and effort creating the data structures of those data models (tables, columns, data types, procedures, functions, triggers, etc.).
By using Metadata Exchange, these data structures can be imported into PowerCenter to identify source and target mappings, which saves time and effort. There is no need for the Informatica developer to create these data structures again.

PowerAnalyzer: PowerAnalyzer provides organizations with reporting facilities. It makes accessing, analyzing, and sharing enterprise data simple and easily available to decision makers, enabling them to gain insight into business processes and develop business intelligence.

With PowerAnalyzer, an organization can extract, filter, format, and analyze corporate information from data stored in a data warehouse, data mart, operational data store, or other data storage models. PowerAnalyzer works best with a dimensional data warehouse in a relational database. It can also run reports on data in any table in a relational database that does not conform to the dimensional model.

SuperGlue: SuperGlue is used for loading metadata from several sources into a centralized place. Reports can be run against SuperGlue to analyze the metadata.

Note: This is not a complete tutorial on Informatica. To learn more about Informatica, visit its official website, www.informatica.com.

Informatica Transformations

A transformation is a repository object that generates, modifies, or passes data. The Designer provides a set of transformations that perform specific functions. For example, an Aggregator transformation performs calculations on groups of data.

Transformations can be of two types:

Active Transformation
An active transformation can change the number of rows that pass through the transformation, change the transaction boundary, or change the row type. For example, Filter, Transaction Control, and Update Strategy are active transformations.

The key point to note is that the Designer does not allow you to connect multiple active transformations, or an active and a passive transformation, to the same downstream transformation or transformation input group, because the Integration Service may not be able to concatenate the rows passed by active transformations. However, the Sequence Generator transformation (SGT) is an exception to this rule. An SGT does not receive data. It generates unique numeric values.
As a result, the Integration Service does not encounter problems concatenating rows passed by an SGT and an active transformation.

Passive Transformation
A passive transformation does not change the number of rows that pass through it, maintains the transaction boundary, and maintains the row type.

The key point to note is that the Designer allows you to connect multiple transformations to the same downstream transformation or transformation input group only if all transformations in the upstream branches are passive. The transformation that originates the branch can be active or passive.
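The row-count distinction between active and passive transformations can be illustrated with a minimal Python sketch. This is not Informatica code: the in-memory rows, the function names, and the 8% tax rate are all hypothetical, and the functions only mimic the row-count behaviour of an Expression (passive), a Filter (active), and an Aggregator (active) transformation.

```python
from collections import defaultdict

# Hypothetical in-memory rows standing in for a source; not an Informatica API.
rows = [
    {"state": "NY", "amount": 100.0},
    {"state": "NY", "amount": 50.0},
    {"state": "CA", "amount": 70.0},
]

def expression_passive(rows):
    """Passive: exactly one output row per input row (adds a derived column)."""
    # The 8% rate is an arbitrary illustrative value.
    return [{**r, "amount_with_tax": round(r["amount"] * 1.08, 2)} for r in rows]

def filter_active(rows, predicate):
    """Active: rows that fail the condition are dropped, so the count can change."""
    return [r for r in rows if predicate(r)]

def aggregator_active(rows, group_by, value):
    """Active: each group of input rows collapses into a single output row."""
    totals = defaultdict(float)
    for r in rows:
        totals[r[group_by]] += r[value]
    return [{group_by: key, "total": total} for key, total in totals.items()]

passive_out = expression_passive(rows)
filtered_out = filter_active(rows, lambda r: r["state"] == "NY")
aggregated_out = aggregator_active(rows, "state", "amount")

# Input count, then the output count of each transformation.
print(len(rows), len(passive_out), len(filtered_out), len(aggregated_out))
```

Only the passive function guarantees that the output has as many rows as the input. This is why rows coming from two active branches cannot in general be lined up side by side and concatenated.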
Transformations can be connected or unconnected to the data flow.

Connected Transformation
A connected transformation is connected to other transformations or directly to the target table in the mapping.

Unconnected Transformation
An unconnected transformation is not connected to other transformations in the mapping. It is called within another transformation and returns a value to that transformation.

The following transformations are available in Informatica:
* Aggregator Transformation
* Application Source Qualifier Transformation
* Custom Transformation
* Data Masking Transformation
* Expression Transformation
* External Procedure Transformation
* Filter Transformation
* HTTP Transformation
* Input Transformation
* Java Transformation
* Joiner Transformation
* Lookup Transformation
* Normalizer Transformation
* Output Transformation
* Rank Transformation
* Reusable Transformation
* Router Transformation
* Sequence Generator Transformation
* Sorter Transformation
* Source Qualifier Transformation
* SQL Transformation
* Stored Procedure Transformation
* Transaction Control Transformation
* Union Transformation
* Unstructured Data Transformation
* Update Strategy Transformation
* XML Generator Transformation
* XML Parser Transformation
* XML Source Qualifier Transformation
* Advanced External Procedure Transformation
* External Transformation

The sections below explain these Informatica transformations and their significance in the ETL process in detail.

Aggregator Transformation
The Aggregator transformation performs aggregate functions such as average, sum, and count on multiple rows or groups. The Integration Service performs these calculations as it reads, and stores group and row data in an aggregate cache. It is an Active & Connected transformation.

Difference between the Aggregator and Expression transformations: The Expression transformation permits you to perform calculations on a row-by-row basis only. In the Aggregator you can perform calculations on groups.

For example, an Aggregator transformation might have the ports State, State_Count, Previous_State and State_Counter.

Components: Aggregate Cache, Aggregate Expression, Group by port, Sorted input.

Aggregate Expressions: These are allowed only in Aggregator transformations. They can include conditional clauses and non-aggregate functions, and can also include one aggregate function nested within another aggregate function.

Aggregate Functions: AVG, COUNT, FIRST, LAST, MAX, MEDIAN, MIN, PERCENTILE, STDDEV, SUM, VARIANCE

Application Source Qualifier Transformation
It represents the rows that the Integration Service reads from an application, such as an ERP source, when it runs a session. It is an Active & Connected transformation.

Custom Transformation
It works with procedures you create outside the Designer interface to extend PowerCenter functionality. It calls a procedure from a shared library or DLL. It can be active or passive, and is connected. You can use the Custom transformation to create transformations that require multiple input groups and multiple output groups.
The Custom transformation allows you to develop the transformation logic in a procedure. Some of the PowerCenter transformations are built using the Custom transformation. Rules that apply to Custom transformations, such as blocking rules, also apply to transformations built using Custom transformations. PowerCenter provides two sets of functions, called generated functions and API functions. The Integration Service uses the generated functions to interface with the procedure. When you create a Custom transformation and generate the source code files, the Designer includes the generated functions in the files. Use the API functions in the procedure code to develop the transformation logic.

Difference between the Custom and External Procedure transformations: In the Custom transformation, the input and output functions occur separately. The Integration Service passes the input data to the procedure using an input function. The output function is a separate function that you must enter in the procedure code to pass output data to the Integration Service. In contrast, in the External Procedure transformation, a single external procedure function does both input and output, and its parameters consist of all the ports of the transformation.

Data Masking Transformation
Passive & Connected. It is used to change sensitive production data into realistic test data for non-production environments. It creates masked data for development, testing, training, and data mining. Data relationships and referential integrity are maintained in the masked data.

For example, it returns a masked value that has a realistic format for an SSN, credit card number, birth date, phone number, etc., but is not a valid value. Masking types: Key Masking, Random Masking, Expression Masking, Special Mask format. The default is no masking.

Expression Transformation
Passive & Connected. It is used to perform non-aggregate functions, i.e., to calculate values in a single row. Examples: calculating the discount for each product, concatenating first and last names, or converting a date to a string field.

You can create an Expression transformation in the Transformation Developer or the Mapping Designer. Components: Transformation, Ports, Properties, Metadata Extensions.

External Procedure Transformation
Passive & Connected or Unconnected. It works with procedures you create outside of the Designer interface to extend PowerCenter functionality. You can create complex functions within a DLL or in the COM layer of Windows and bind them to the External Procedure transformation. To get this kind of extensibility, use the Transformation Exchange (TX) dynamic invocation interface built into PowerCenter. You must be an experienced programmer to use TX and to use multi-threaded code in external procedures.

Filter Transformation
Active & Connected. It passes rows that meet the specified filter condition and removes the rows that do not meet the condition. For example, to find all the employees who are working in New York, or to find all the faculty members teaching Chemistry in a state. The input ports for the filter must come from a single transformation. You cannot concatenate ports from more than one transformation into the Filter transformation. Components: Transformation, Ports, Properties, Metadata Extensions.

HTTP Transformation
Passive & Connected. It allows you to connect to an HTTP server to use its services and applications. With an HTTP transformation, the Integration Service connects to the HTTP server and issues a request to retrieve data from or post data to the server, then passes the result to the target or a downstream transformation in the mapping.

Authentication types: Basic, Digest and NTLM. Examples of methods: GET, POST and SIMPLE POST.

Java Transformation
Active or Passive & Connected. It provides a simple native programming interface to define transformation functionality with the Java programming language. You can use the Java transformation to quickly define simple or moderately complex transformation functionality without advanced knowledge of the Java programming language or an external Java development environment.

Joiner Transformation
Active & Connected. It is used to join data from two related heterogeneous sources residing in different locations, or to join data from the same source. In order to join two sources, there must be at least one pair of matching columns between the sources, and you must specify one source as the master and the other as the detail. For example: to join a flat file and a relational source, to join two flat files, or to join a relational source and an XML source.

The Joiner transformation supports the following types of joins:
* Normal: A normal join discards all the rows of data from the master and detail sources that do not match, based on the condition.
* Master Outer: A master outer join discards all the unmatched rows from the master source and keeps all the rows from the detail source and the matching rows from the master source.
* Detail Outer: A detail outer join keeps all rows of data from the master source and the matching rows from the detail source. It discards the unmatched rows from the detail source.
* Full Outer: A full outer join keeps all rows of data from both the master and detail sources.

Limitations on the pipelines you connect to the Joiner transformation:
* You cannot use a Joiner transformation when either input pipeline contains an Update Strategy transformation.
* You cannot use a Joiner transformation if you connect a Sequence Generator transformation directly before the Joiner transformation.

Lookup Transformation
Passive & Connected or Unconnected. It is used to look up data in a flat file, relational table, view, or synonym. It compares the Lookup transformation ports (input ports) to the source column values based on the lookup condition. The returned values can then be passed to other transformations. You can create a lookup definition from a source qualifier, and you can also use multiple Lookup transformations in a mapping.

You can perform the following tasks with a Lookup transformation:
* Get a related value. Retrieve a value from the lookup table based on a value in the source. For example, the source has an employee ID. Retrieve the employee name from the lookup table.
* Perform a calculation. Retrieve a value from a lookup table and use it in a calculation. For example, retrieve a sales tax percentage, calculate a tax, and return the tax to a target.
* Update slowly changing dimension tables. Determine whether rows exist in a target.

Lookup Components: Lookup source, Ports, Properties, Condition.

Types of Lookup:
1) Relational or flat file lookup.
2) Pipeline lookup.
3) Cached or uncached lookup.
4) Connected or unconnected lookup.
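The "get a related value" task and the difference between cached and uncached lookups can be sketched in Python as follows. This is an illustration only, not Informatica code: the employee table, the function names, and the query counter are hypothetical, and representing an unmatched lookup as None is an assumption of this sketch.

```python
# Hypothetical lookup source standing in for a relational table or flat file.
EMPLOYEE_TABLE = [
    {"emp_id": 1, "name": "Asha"},
    {"emp_id": 2, "name": "Ben"},
]

query_count = 0  # counts simulated trips to the lookup source

def uncached_lookup(emp_id):
    """Uncached: go back to the lookup source for every input row."""
    global query_count
    query_count += 1
    for rec in EMPLOYEE_TABLE:
        if rec["emp_id"] == emp_id:  # the lookup condition
            return rec["name"]
    return None  # no matching row (an assumption of this sketch)

def build_cache():
    """Cached: read the lookup source once, then probe it in memory."""
    global query_count
    query_count += 1
    return {rec["emp_id"]: rec["name"] for rec in EMPLOYEE_TABLE}

# Source rows that carry only the employee ID.
source_rows = [{"emp_id": 2}, {"emp_id": 1}, {"emp_id": 3}]

uncached = [{**r, "name": uncached_lookup(r["emp_id"])} for r in source_rows]
uncached_queries = query_count

query_count = 0
cache = build_cache()
cached = [{**r, "name": cache.get(r["emp_id"])} for r in source_rows]
cached_queries = query_count

print(uncached == cached, uncached_queries, cached_queries)
```

Both variants return the same rows. The sketch only shows the trade-off: the uncached version touches the lookup source once per input row, while the cached version reads it once up front and pays for that with memory.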