Optimization
 

  • So, how do you know if DMExpress is the right technology for you? Well, you can start by using the TDWI Checklist report for accelerating data integration….
  • The result is that you can normally achieve much higher performance than the leading DI tools, even with no tuning. As an example, I'm showing two benchmarks we ran at a customer site, comparing DMExpress with Informatica at the top and with Ab Initio at the bottom.
  • So we talked about speed and efficiency. Now let's talk a bit more about ease of use. Most DI platforms talk about ease of use in terms of a nice GUI. However, Syncsort takes the concept one step further to attack one of the most complex and time-consuming tasks: fine-tuning. For that, let me tell you a little bit about our technology.
    Traditional data integration is manual and static. Moreover, it was not designed with efficiency in mind, which means resources are used suboptimally: the tools are very CPU and memory intensive, yet they still run I/O operations well below disk speed. Scaling therefore requires very expensive hardware and time-consuming tuning. Every time something changes, IT has to go back and re-tune the system.
    DMExpress takes a completely different approach: it is completely automatic and dynamic. Drawing on 40 years of performance expertise, the engine minimizes CPU and memory utilization while running I/O operations at or near disk speed. More importantly, it requires no tuning whatsoever: it automatically adapts to changes in real time, providing automatic parallelism and pipelining.
    This translates into:
    Higher performance out of the box.
    Much better ease of use, to the point where users can design high-performance ETL tasks and jobs with minimal training.
    Significant savings in IT staff hours and hardware.
  • A Task is a basic unit of work: sort, aggregate, join, etc.
    A DMExpress Job is a collection of Tasks.
    Each Task executes in a separate process.
    DMExpress automatically manages threads for each Task.
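The Task/Job relationship described in this note can be illustrated with a minimal Python sketch. This is not the DMExpress API: the names `Job`, `filter_task`, and `aggregate_task` are hypothetical, and the sketch chains tasks as in-process generators purely to show how records flow from one task to the next, whereas the note states that DMExpress runs each Task in its own process and manages threads automatically.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, object]
# Hypothetical task type: consumes a stream of records, yields a new stream.
Task = Callable[[Iterable[Record]], Iterator[Record]]


def filter_task(predicate: Callable[[Record], bool]) -> Task:
    """Build a task that keeps only the records matching `predicate`."""
    def run(records: Iterable[Record]) -> Iterator[Record]:
        return (r for r in records if predicate(r))
    return run


def aggregate_task(key: str, value: str) -> Task:
    """Build a task that sums `value` per distinct `key`."""
    def run(records: Iterable[Record]) -> Iterator[Record]:
        totals = defaultdict(int)
        for r in records:
            totals[r[key]] += r[value]
        for k in sorted(totals):
            yield {key: k, value: totals[k]}
    return run


@dataclass
class Job:
    """A job is an ordered collection of tasks."""
    tasks: List[Task] = field(default_factory=list)

    def run(self, records: Iterable[Record]) -> List[Record]:
        stream = iter(records)
        for task in self.tasks:
            stream = task(stream)  # output of one task feeds the next
        return list(stream)


job = Job([
    filter_task(lambda r: r["amount"] > 0),
    aggregate_task("customer", "amount"),
])
result = job.run([
    {"customer": "a", "amount": 5},
    {"customer": "b", "amount": -1},
    {"customer": "a", "amount": 7},
    {"customer": "b", "amount": 2},
])
```

Because each task only sees an input stream and produces an output stream, tasks can be recombined into different jobs without changing their code.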
  • Dun & Bradstreet
    Data sizes: 5 tables of ~1 TB each.
    Processing need: the bottleneck step in INFA was joining 5 tables and aggregating the output.
    Application: weekly reporting application on millions of DUNS numbers.
    Data warehouse: Oracle 10g.
    Original approach: ETL using INFA, which was not meeting SLAs. The SLA is to run this process within a week.
    Attempts to improve performance:
    Tuned the ETL environment to try to meet SLAs. No success.
    Converted the ETL mapping to ELT in INFA. No success: the process would abort with "ORA-01555: snapshot too old" because it ran in the database too long while tables were being updated during processing.
    Broke up the ELT process into 100,000-record batches to prevent the Oracle error. The process ran in 27 days (extrapolated)!
    DMExpress benchmark: DMExpress extracted five 1 TB tables in 6 hours and performed the joins and aggregation in 9 hours. The output file was then read by INFA and loaded into the target table. Total run time for this step in DMExpress was 15 hours.
    POC environment: 4-core Linux box.
    DMExpress is currently in production and used as a performance complement to INFA.
    Current production environment: 16-core Linux box.
    High-level flow in production: SOURCES -> Oracle -> DMX (extract 9 hours) -> Flat Files -> DMX INFA -> TARGET DATA MART
    When they used DMX:
    Aggregation was not fast: it required a presort, and there was not enough memory to aggregate without DMX; the alternative was push-down. However, pushing down to a database table just to order by is not an option.
    DMX extracted the data in 6 hours, filtering on the fly and landing it to disk (2 to 3 TB), which offloaded that work from the database.
    Detail Trade data mart: transactional and very busy, so the offload really benefited the customer.
    A lot of Cognizant staff and a lot of time were spent over many months.
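The remark that aggregation needs a presort can be sketched as follows. Once the records are sorted by the grouping key, an aggregator only has to hold one group at a time instead of a hash table of every key, which is why memory stops being the limit. This is a generic illustration and not DMExpress or Informatica code: the field names `duns` and `amount` are made up, and the in-memory `sorted` call stands in for whatever external sort a real tool would use on terabyte-sized inputs.

```python
from itertools import groupby
from operator import itemgetter


def aggregate_sorted(rows, key, value):
    """Sum `value` per `key`, assuming `rows` is already sorted by `key`.

    Only the current group is held in memory, so memory use does not grow
    with the number of distinct keys.
    """
    for k, group in groupby(rows, key=itemgetter(key)):
        yield k, sum(r[value] for r in group)


rows = [
    {"duns": "002", "amount": 10},
    {"duns": "001", "amount": 4},
    {"duns": "002", "amount": 5},
    {"duns": "001", "amount": 1},
]

# Step 1: presort by the grouping key (stand-in for an external sort).
presorted = sorted(rows, key=itemgetter("duns"))

# Step 2: stream through the sorted rows, emitting one total per key.
totals = list(aggregate_sorted(presorted, "duns", "amount"))
```

Note that `groupby` only merges adjacent rows with equal keys, so skipping the presort would silently produce several partial totals for the same key.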
  • Application is used for: campaign management, portfolio management, product analysis, marketing analytics, customer analytics.
    SLA: start Friday at 6 pm; final load is on Monday at 6 pm.
    Data flow: flat-file sources trickling in from 10 source systems, 200 flat files, 500 GB in total (3 customer systems, quote systems), weekly on Friday night -> Standardization process (INFA + DMX: aggregation, preparing data for Trillium; Friday 6 pm to Saturday 3 pm) -> Trillium + DMX plug-in (customer householding and address standardization; 12 hours, ends 3 am Sunday) -> DI (INFA and DMX: building customer hierarchies, i.e. aggregating customers to households, a bunch of roll-ups; 18 hours, ends Sunday 9 pm) -> Dimensional model builds and loads (sorting, joining, CDC, joining keys back to the fact) -> Dimensional data mart (Teradata load time is a good portion of the 18 hours).
    Some anecdotal info from Jeff (Baax):
    1. Push-down is not practical: going from flat file to database and back to flat file to do the work in Trillium incurs network costs and database load/unload costs. Loading 40 GB just to sort is not an option!
    2. It took the engineers only 2 weeks by themselves and enabled a 6-month deployment (1/6 of that time was DMX).
    3. One of the larger tables (150 million): the original approach was truncate and load (12 to 16 hours). They changed the approach to do a CDC in DMX and apply just the inserts and updates using Teradata MultiLoad. Now it takes hours to do the DMX CDC and ½ hour to load the results!
    4. Machine downtime and maintenance add to the complexity.
    5. Database IDs get locked on Monday at 8 am (the real SLA is 8 am; an exception is needed to extend to the hard SLA of 6 pm, which causes a lot of aggravation!).
    6. Due to data volume growth, the customer is looking to optimize all the time. DMX provides a very easy, scalable way to deal with this need and implement the jobs.
    7. DMX/INFA hand-off: today it's a file hand-off; they are exploring pipes instead of files. Maestro calls a separate DMX job. Workflow/Session: a command task invokes DMX (landing a file).
    When and where are 2 tools necessary?
    a.) A huge join: started by building a 50 GB join (30 hours). The inner-join output file gets read into INFA, which applies some business logic.
    b.) A huge aggregation: instead of an INFA in-memory aggregation, do the sort in DMX and the complex aggregation in INFA.
    8. The ratio of the number of INFA to DMX jobs is 70/30.
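Point 3 above replaces truncate-and-load with change data capture (CDC): compare the new extract with the previous one and load only the rows that are new or changed. A minimal sketch of that comparison is shown below. It is an illustration under assumptions, not the DMExpress CDC implementation: the function name `detect_changes` and the sample keys and rows are hypothetical, and both snapshots are held in dictionaries, whereas a tool working on a 150-million-row table would more likely compare sorted files.

```python
def detect_changes(previous, current):
    """Compare two snapshots keyed by primary key.

    Returns (inserts, updates): rows that are new in `current`, and rows
    whose key exists in both snapshots but whose content differs.
    Unchanged rows are skipped, which is what keeps the load small.
    """
    inserts = {k: row for k, row in current.items() if k not in previous}
    updates = {
        k: row
        for k, row in current.items()
        if k in previous and previous[k] != row
    }
    return inserts, updates


previous = {1: ("Ann", "NY"), 2: ("Bob", "LA")}
current = {1: ("Ann", "NY"), 2: ("Bob", "SF"), 3: ("Cy", "TX")}

inserts, updates = detect_changes(previous, current)
```

Only `inserts` and `updates` would then be handed to the database loader, rather than the full table.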
  • Global Payments, a leading electronic transaction processing organization serving millions of customers, uses DMExpress as their ETL standard.
    BUSINESS CHALLENGES
    Global Payments came to us as they were planning to consolidate all of their global operations into their US data center. In doing so they faced several challenges:
    + First, they wanted to reduce costs; that was one of the key drivers behind the initiative.
    + Reduce operational risk and improve customer service, providing a more consistent level of service across the world. (They had to manually script many of their transformations in PL/SQL, pushing transformations into a strained Oracle database, which sometimes resulted in errors that could jeopardize daily operations. In addition, under the existing architecture they had to lock Oracle tables for hours, which had a huge impact on all database users.)
    + GPN wanted to open a new revenue source by offering a new service with more granular reporting to their customers. For the reasons above, transformations for the new service had to happen outside the Oracle database.
    + Cut processing times to allow for future growth. They were experiencing around 50% YoY data growth.
    + Global operations also meant shorter batch windows with 24x7 operations.
    + Consolidation of operations meant that the staff of 5 FTEs previously managing US and NA operations would now have to manage all the international operations.
    + They wanted to go into production in less than 60 days, while minimizing any impact on existing operations.
    BEFORE / PAIN POINTS
    They had an architecture with iWay Data Migrator doing some of the work, but since this tool couldn't cope with the performance and scalability requirements, they had to hand-code a lot of their transformations in PL/SQL. This resulted in several pain points, including:
    Very complex architecture due to the use of both PL/SQL and Data Migrator.
    Constant tuning required with little or no reusability, resulting in very long development cycles and time-to-value.
    Their architect said there was real pain around the limitations of error logging with Data Migrator. Having a tool like DMExpress helped significantly in this area.
    Higher costs: both the hardware required by their ETL tool and the database capacity to execute PL/SQL scripts.
    One of their processes had to dedupe and summarize several tables, some of them exceeding 13M rows in length. Processing was taking more than 2 hours to complete.
    BENEFITS
    We went on site and conducted a POC and a business value analysis (BVA). The results showed:
    Processing times improved by almost 9x (from 141 min to 3 min for key processing tasks).
    Significant savings when compared to other options (including Informatica, their existing architecture, and DataStage. In fact, they had prior experience working with DataStage, so they were looking heavily at DS). During the BVA we did a thorough analysis of their DI strategy and TCO, evaluating operational as well as capital costs in 3 key categories: hardware costs, database/staging costs, and IT staff productivity. Please note that ETL software license costs were not included in the analysis. However, our pricing was still very competitive and at the lower end of the competition.
    The results of the analysis show savings of nearly US $3M over 3 years (more details about the analysis can be found on the third slide).
    Global Payments was able to deploy to production in approximately 4 weeks.
    The new architecture is helping GPN achieve their growth and profitability goals with a technology that can scale cost-effectively to support growing data volumes.
    DISCOVERY QUESTIONS THAT HELPED QUALIFY THIS OPPORTUNITY
    How critical is your need to reduce processing time (improve performance)?
    What is your time frame for getting the problem solved?
    What solutions have you considered?
    How many people do you have developing/maintaining PL/SQL?
    What is the size (type/# cores) of your DB server(s)?
    Would you find it advantageous to reduce your DB cost?
    Do you know what the DB server(s) are costing you?
    What would the impact be if you could move the DI work off the DB server(s)?
    Other discovery questions:
    Transformations taking place (sort, merge, join, look-ups)
    Data sizes
    Current performance (processing) times
    DI/DW/BI environment

Optimization Presentation Transcript

  • Optimization with DMExpress
    Steven Haddad – Senior Software Architect
    shaddad@syncsort.com
  • Introducing DMExpress™ - Fast. Efficient. Simple. Cost Effective.
    A Family of High-Performance, Purpose-Built Data Integration Tools
    Integrate / Optimize / Migrate:
    → High-Performance ETL: for core ETL processing & database transformation offload (Oracle PL/SQL, Teradata, and others)
    → ETL Optimization: for Informatica, DataStage, and others
    → Hadoop Optimization: for Apache, HortonWorks, Cloudera, and others
    → Rehosting Optimization: for Clerity, MicroFocus, Oracle, and others
    → High-Performance Sort: for z/OS, z/VSE, and Windows/UNIX/Linux
    → Sort Optimization: for SAS, DFSORT, Trillium, and others
    Syncsort Confidential and Proprietary - do not copy or distribute
  • Do You Need Data Integration Optimization/Acceleration?
    • ETL is taking longer and longer
    • Large budgets to purchase additional hardware and database
    • A shift in data integration processing to database or hand-coded solutions
    • Data integration environment can't easily be governed, maintained or expanded
    • Inability to launch or staff initiatives due to lack of resources
    • Long time-to-value
    • Users may lose confidence in data
  • What is Optimization with DMExpress™?
    • Better Performance – No Tuning
    • Lower Costs for: Hardware, Licenses, IT Staff
    • Improves your capability to deliver
    • Reduces usage of resources
    • More work in less time
    • Secures the investment you have already made
  • Examples for Optimization with DMExpress™
    IBM DataStage, Major Logistic Company:
    → 10x faster than DataStage Parallel
    → 26x faster than DataStage Server
    Informatica, Information Service Provider:
    → 27 days down to 15 hours
    → 6 weeks to production
    Informatica, Major Insurance Provider:
    → 1/20 of disk space
    → Significantly less memory
    ComScore:
    → Costs/TB down from 1538 US$ to 46 US$
    PL/SQL, Global Payments:
    → Reduce costs by 2.9 Mio $
    → 2.35 h down to 3 min
    Ab Initio, Financial Service Provider:
    → 4:42 h down to 1:12 h
    → 360 GB down to 4 GB workspace
  • DMExpress Delivers Significantly Faster Performance Even Without Any Tuning
    [Chart 1: elapsed time in minutes for 1. Copy, 2. Sort, 3. Aggregate. DMExpress (no tuning) vs. Informatica (tuned): DMExpress up to 5x faster]
    [Chart 2: elapsed time in minutes for 1. Copy / Filter, 2. Sort, 3. Aggregate / Rollup. DMExpress (no tuning) vs. Ab Initio (tuned): DMExpress up to 4x faster]
  • DMExpress Seamlessly Scales to Support Growing Requirements
    [Chart: volume & complexity of business requirements over time, conventional ETL vs. DMExpress, with the point of problem awareness marked]
    DMExpress: seamlessly scale
    • No tuning
    • No ELT
    • Defer hardware purchases
    Conventional ETL: continuously implement performance stop-gap measures
    • Manual tuning
    • Add/upgrade hardware
    • Push-down (ELT)
  • Fast: Intelligent Sort Algorithms
    Sort impacts every aspect of ETL (high frequency and impact):
    • Source Extract, Compress & FTP: compression ratio increases 6X
    • Partition Data: up to 40% faster
    • Joining Records: up to 60% faster
    • Merge & Transformation: up to 50% faster
    • Aggregation: up to 70% faster
    • Database Load & Index: up to 40% faster
    Syncsort has been the market-leading sort technology since 1968
  • Maximizing Performance with Optimum Resource Utilization
    The Performance Triangle: CPU, Memory, Disk & I/O
    [Diagram: ETL Process Optimizer surrounded by Partition & Buffer Management, Pipeline Parallelism, Instruction Cache Optimization, Memory Cache Optimization, I/O Optimization, Algorithm Selection]
    DMExpress Is Different
    • Patented Algorithms: dynamically respond to CPU, memory & disk availability
    • Direct I/O: bypasses the file system buffer cache, accessing data directly at block level for higher performance
    • Compression: used for read/write & crucially active workspace (minimizes disk I/O touches & transfer volume)
  • DMExpress Dynamically Maximizes Throughput at Run Time
    Conventional Data Integration: Manual and Static Algorithms
    ■ Scaling requires expensive hardware
    ■ I/O operations well below disk speed
    ■ Requires exhaustive tuning
    ■ Sub-optimal consumption of resources
    ■ Uses all memory, overflows to disk
    Data Integration with DMExpress: Automatic and Dynamic Algorithms
    ■ Extremely efficient on commodity hardware
    ■ I/O operations at near disk speed
    ■ Automatic parallelism and pipelining
    ■ Automatic, efficient caching and hashing
    ■ Minimizes disk caching
  • Efficient: Dynamic ETL Optimizer
    Resource Analysis: Memory, CPU, I/O, File System
    Data Analysis: Data Type, Record Format, # Records / Columns
    [Diagram: ETL Process Optimizer surrounded by Partition & Buffer Management, Pipeline Parallelism, Instruction Cache Optimization, Memory Cache Optimization, I/O Optimization, Algorithm Selection]
    Fully automatic, continuously self-tuning optimizer maximizes throughput and resource efficiencies
    – Evaluates hardware, software, and data environment
    – Determines optimal algorithmic flow at start-up
    – Begins execution with auto-generated optimizer plan
    – Continuously adjusts algorithms, memory use, parallelism based on application and run time environment
  • Design Once Inherit Performance
    [Diagram: ETL Job made of Tasks: Sources -> Read -> Join -> Aggregate -> Write -> Targets (EDW, DM), with Thread Management and Dynamic Optimizations]
    • Each ETL task runs on a separate process
    • Automatic, dynamic thread management for each task
    • Automatic parallelism and pipelining
    • Automatic, dynamic algorithm selection
  • Architecture
    DMExpress – White Boarding the Data Acceleration Sales
  • DMExpress Architecture Delivers Maximum Performance and Data Scalability with Automatic Dynamic Optimizations
    Integration / Customization (SDK, Open APIs)
    Graphical Development Environment
    DMExpress Engine:
    • High Performance Transformations: Sort, Merge, Aggregate, Join / Lookup, Copy, Load Presort, Filter, Reformat, Partition
    • User Defined Functions
    • Built-in Functions: Numeric, Text, Date and Time, Logical, Advanced Text Processing
    • Automatic Continuous Optimization
    • Data Partitioning
    Metadata
    Deployment
    Source/Target Connectivity
  • Five Simple Steps to Deploy. Tuning Is NOT One of Them.
    1. Install DMExpress
    • Single install
    • Takes less than 5 minutes
    2. Choose "Task" Template
    • Primary Tasks: Sort, Merge, Aggregate, Join / Lookup, Copy
    • Secondary Tasks: Filter, Reformat, Partition
    3. Fill in the blanks
    • Connectivity
    • Standard Functions: Numeric, Text, Date/Time, Logical
    • User-defined Functions
    4. Integrate
    • Create complete ETL "Jobs" by combining multiple "Tasks"
    • Define Flows – from files to direct flows
    5. Deploy
    • Schedule
    • Parameterize
    • Monitor
  • Syncsort DMExpress Is Simple but Powerful
    Intuitive Graphical Interface enables Development and Maintenance
    • Graphical Development Environment
    • Expression Builder
    • Job/Task Diff
    → No coding required
    → No tuning required
    → Easily build/edit jobs and tasks
    → Detect differences between development, test, and production environments
    → Users are fully functional within a few days
  • DMExpress Architecture
    [Diagram: DMExpress clients (Job Editor, Task Editor, command line); flat-file-based metadata repository with check-in / check-out to a 3rd-party version control tool; design-time services; DMExpress Engine on a local or remote Windows / Unix / Linux server; data sources / targets]
  • Use Cases
  • Acceleration POC – Scenario A
    Processing time in minutes of 'High Load Jobs':
    DataStage Parallel: 32 min on 4/6 cores (physical/virtual), Linux
    DMExpress: 19 min on 1 core (virtual), Linux
    1/2 the time, 1/6 the hardware
  • Acceleration POC – Scenario B
    Processing time in minutes of 'Scenario B':
    DataStage Server: 40.00 min on 14 cores (physical), HP-UX
    DMExpress: 21.30 min on 1 core (virtual), Linux
    1/2 the time, 1/14 the hardware
  • Use Case 1: Global Information Service Provider
    Business Challenge
    • Severe competitive pressure from Google Finance, Yahoo! Finance, Morningstar, and others forced development of strategic new offerings
    Environment
    • Informatica 8.11 SP3, Oracle 10.2 RAC 6 nodes, DMExpress 5.2.15
    • 16-core Linux machine
    Technical Challenge
    • Weekly reporting application on 8 million DUNS numbers
    • Data sizes: 5 tables of ~1 TB each
    • Bottleneck step was to join 5 tables and aggregate the output
    Prior Attempts to Increase Performance
    • Manual tuning of ETL routines: lots of consultants spent many months and dollars
    • Converted the ETL mapping to ELT. No success: the process would abort with "ORA-01555: snapshot too old"
    • Broke up the ELT process into 100,000-record batches to prevent the Oracle error. The process ran in 27 days (extrapolated)
    • Problem existed since February 2009, with many attempts and touch points; production in October
    Solution
    • DMExpress extracted five 1 TB tables in 6 hours and performed the joins and aggregation in 9 hours. Total run time for this step in DMExpress was 15 hours vs. 27 days
    • DMExpress invoked at the command line prior to Informatica
    Benefits
    • New offering launched on time
    • Able to meet SLAs
    • 2 weeks to finish POC
    • In production in 6 weeks
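The "15 hours vs. 27 days" comparison on this slide can be turned into a speedup factor with a few lines of arithmetic. The snippet below only restates the slide's own numbers (27 days extrapolated for the batched ELT run, 6 hours of extract plus 9 hours of join and aggregation for DMExpress); the variable names are illustrative.

```python
HOURS_PER_DAY = 24

# Batched ELT run, extrapolated on the slide to 27 days.
baseline_hours = 27 * HOURS_PER_DAY

# DMExpress run: 6 hours of extract plus 9 hours of join and aggregation.
dmx_hours = 6 + 9

# How many times faster the DMExpress step is than the extrapolated baseline.
speedup = baseline_hours / dmx_hours

print(f"{baseline_hours} h vs. {dmx_hours} h: about {speedup:.0f}x faster")
```

Keep in mind that the 27-day figure is itself an extrapolation, so the resulting factor is an estimate rather than a measured result.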
  • Use Case 2: Major Insurance Provider
    Business Challenge
    • Unable to complete processing over the weekend window to deliver new, highly personalized offers and pricing to their agents via their agent marketing portal, which impacts conversion rates for promotions to policyholders
    • Need to start the processing on Friday night at 6 pm, with the data load done only by Wednesday 6 pm
    Environment
    • Informatica version 7.x, 8.6.1, Trillium, Teradata, reporting (MicroStrategy, Hyperion/Brio), DMExpress 6.9, Maestro, Sun Solaris
    Technical Challenge
    • 500 GB of data, including joins and aggregations, needs to be processed during the weekend window
    • Certain jobs would not even run and had to be aborted (30+ hour runs). No alternative: no tuning worked
    • Very slow I/O when joins spill to disk. All of the memory on the system is grabbed! Virtual memory errors
    • No capacity in Teradata to push down transformations
    Prior Attempts to Increase Performance
    • Tuning did not solve the problem
    • Dynamically adjusting cache did not solve the bottleneck
    Solution
    • Output from Trillium is sent to DMExpress and Informatica to integrate and aggregate the data (joins and aggregations)
    • Started out with 10 critical DMExpress jobs and has now expanded to 700+ DMExpress tasks, 200 DMExpress jobs
    • Orchestrated within PowerCenter Workflow Manager (command task) and also called separately from Maestro
    Benefits
    • DMExpress completes within the weekend batch window
    • Extremely simple and scalable approach, very short learning curve: 1 month to deploy DMExpress
    • Significantly less memory used by DMX, allowing more parallel jobs due to efficiency. DMExpress takes 1/20th the disk space
  • Case Study: Enabling Up to $3M in Data Integration Cost Savings
    [Diagram labels: ETLTL, Vertica, Oracle, DMExpress, Data Migrator, Analytics]
    Before: PL/SQL Scripts (ELT)
    Avg. 13.5M rows per file/table
    Flow: read files -> load into staging area, dedupe, and summarize using PL/SQL scripts and iWay Data Migrator -> load into the Oracle production data warehouse for analysis & reporting
    • Est. TCO over 3 years: $4.4M
    • Total processing time: 2.35 hrs
    • Complex architecture with PL/SQL, iWay Data Migrator and lots of Oracle staging
    • Manual coding. Manual tuning. No reusability
    • No scalability to support business goals
    After: DMExpress (ETL)
    Avg. 13.5M rows per file/table
    Flow: read files -> dedupe, summarize and load into Oracle data warehouse -> analysis & reporting
    • Est. TCO over 3 years: $1.5M
    • Total processing time: 3 min
    • One tool. One ETL engine. No staging
    • No coding. No tuning. Reusable objects
    • Scalable architecture supports business growth and profitability objectives
  • POC Results – Informatica
                        Elapsed time  Memory peak (MB)  Approx. CPU time  Max read (MB/s)  Avg read (MB/s)  Max write (MB/s)  Avg write (MB/s)
    PowerCenter         0:28:10       11,875            1:06:29.2         53               12               82                39
    DMExpress           0:13:26       9,438             0:16:53.9         154              33               101               66
    DMExpress (Linux)   0:05:43       9,957             0:16:21           N/A              83               N/A               142
    [Charts: Elapsed Time, Memory (GB), and CPU Time for PC, DMX, and DMX (Linux)]
  • Benchmark Details: DMExpress vs. Informatica
    5 GB file, 45 M records
    Task              Current      DMX          Saving
    Copy              4 min 09 s   0 min 50 s   80%
    Sort              7 min 26 s   1 min 19 s   82%
    Aggregate         9 min 37 s   1 min 09 s   88%
    Sort & Aggregate  3 min 43 s   1 min 37 s   57%
    25 GB file, 225 M records
    Task              Current      DMX          Saving
    Copy              20 min 53 s  4 min 12 s   80%
    Sort              31 min 48 s  6 min 17 s   80%
    Aggregate         20 min 45 s  4 min 30 s   78%
    Sort & Aggregate  14 min 53 s  6 min 38 s   55%
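The "Saving" column in this benchmark can be reproduced from the two time columns: convert each time to seconds and take the relative reduction. The short check below uses the figures from the slide; the helper names `to_seconds` and `saving_percent` are illustrative and not part of any tool.

```python
def to_seconds(minutes, seconds):
    """Convert a 'N mins M seconds' timing to total seconds."""
    return minutes * 60 + seconds


def saving_percent(current_s, dmx_s):
    """Relative time saved by DMX versus the current tool, rounded to a whole percent."""
    return round(100 * (1 - dmx_s / current_s))


# (task, current time, DMX time) as (minutes, seconds) pairs from the slide.
benchmarks = {
    "5 GB": [
        ("Copy", (4, 9), (0, 50)),
        ("Sort", (7, 26), (1, 19)),
        ("Aggregate", (9, 37), (1, 9)),
        ("Sort & Aggregate", (3, 43), (1, 37)),
    ],
    "25 GB": [
        ("Copy", (20, 53), (4, 12)),
        ("Sort", (31, 48), (6, 17)),
        ("Aggregate", (20, 45), (4, 30)),
        ("Sort & Aggregate", (14, 53), (6, 38)),
    ],
}

savings = {
    size: [
        saving_percent(to_seconds(*current), to_seconds(*dmx))
        for _, current, dmx in rows
    ]
    for size, rows in benchmarks.items()
}
```

A saving of 80% corresponds to a run that is 5 times faster, which matches the "up to 5x faster" claim made for the Informatica comparison earlier in the deck.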
  • Ab Initio Benchmark
    Scenario 1 (Copy / Filter)
                Elapsed time  CPU time  Temp workspace  Records read   Records written  Data read (bytes)  Data written (bytes)
    DMExpress   47 min        3 h 44 m  0 GB            2,926,155,265  452,375,411      383,326,339,715    59,261,178,841
    Ab Initio   66 min        4 h 38 m  0 GB            2,926,155,265  452,375,411      383,326,339,715    59,261,178,841
    Scenario 2 (Sort)
                Elapsed time  CPU time  Temp workspace  Records read   Records written  Data read (bytes)  Data written (bytes)
    DMExpress   1 h 12 m      7 h 26 m  60 GB           2,926,155,265  2,926,155,265    383,326,339,715    383,326,339,715
    Ab Initio   4 h 42 m      9 h 48 m  360 GB          2,926,155,265  2,926,155,265    383,326,339,715    383,326,339,715
    Scenario 3 (Aggregation / Rollup)
                Elapsed time  CPU time   Temp workspace  Records read   Records written  Data read (bytes)  Data written (bytes)
    DMExpress   1 h 21 m      7 h 10 m   4 GB            2,926,155,265  27,179,924       383,326,339,715    4,022,628,752
    Ab Initio   2 h           10 h 14 m  360 GB          2,926,155,265  27,179,924       383,326,339,715    4,022,628,752
    Ab Initio tuned 8 ways; DMExpress with no tuning
  • Metadata with Miti
  • ETL to DMExpress acceleration / conversion
    Automatic Conversion Utility
    Parsing
    • Informatica
    • IBM DataStage
    • PL/SQL
    • Etc…
    Processing
    • Flow analysis
    • Expression & type analysis
    • Optimization
    Output Generation
    • DMExpress
    • Documentation
    Cognizant Migration / Optimization COE
    • UNIX shell scripts
    • Informatica workflows
    • Informatica mappings
    • Spreadsheets identifying the production workflows and mappings
    • Timing information of the job executions over a two-month period
    • Resource data points for the workflows
  • DMX Live Demo