This document discusses managing data across its lifecycle, from requirement to retirement. It outlines the challenges of data growth, inefficient testing practices, and poorly planned enterprise changes. It promotes an information governance approach and the use of IBM InfoSphere solutions to optimize the data lifecycle through four phases: discover and define, develop and test, optimize and archive, and consolidate and retire. Specific solutions discussed include test data management, database workload capture and replay, and archiving to reduce storage costs and the risk of non-compliance. The overall message is that holistic data lifecycle management can improve application quality, accelerate delivery, reduce costs, and better manage risk.
This presentation provides a technical overview of IBM Optim and its benefits.
Three areas of focus:
Mitigate Risk: Much of the "data-related" risk that an organization carries relates to keeping sensitive data private, preventing data breaches, and safely storing and retiring data that is no longer required on online systems. Companies must comply with regulations and policies, and a lack of proper data protection can lead to penalties as well as damage to a company's reputation.
Deal with Data Growth: Another challenge is dealing with explosive data growth across many applications. Without properly managing data volume, companies will see system performance degrade over time. This is a particular problem when service level agreements (SLAs) are in place that mandate set response times.
Control Costs: The costs of managing data span from the initial design of the data structure through all lifecycle phases, until the data is ultimately retired. IT staff are under constant pressure to deliver more for less. Major costs of managing data include storage hardware, storage management (archiving, storing, retrieving, etc.), and protecting the data per compliance regulations.
Databricks + Snowflake: Catalyzing Data and AI Initiatives (Databricks)
"Combining Databricks, the unified analytics platform with Snowflake, the data warehouse built for the cloud is a powerful combo.
Databricks offers the ability to process large amounts of data reliably, including developing scalable AI projects. Snowflake offers the elasticity of a cloud-based data warehouse that centralizes access to data. Databricks brings to the table a mature distributed big data processing and AI-enabled engine, capable of integrating with nearly every technology, from message queues (e.g., Kafka) to databases (e.g., Snowflake) to object stores (e.g., S3) and AI tools (e.g., TensorFlow).
Key Takeaways:
How Databricks & Snowflake work;
Why they're so powerful;
How Databricks + Snowflake symbiotically catalyze analytics and AI initiatives"
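To make the pairing concrete, here is a minimal, hedged sketch of pulling a Snowflake table into a Spark DataFrame on Databricks via the Spark-Snowflake connector. The account URL, credentials, warehouse, and table are placeholders, and the connector option names (sfURL, sfUser, and so on) should be checked against the connector version you actually run; older versions require the fully qualified format name net.snowflake.spark.snowflake instead of the short name snowflake.

```python
# Sketch: reading a Snowflake table into a Spark DataFrame through the
# Spark-Snowflake connector, then continuing with ordinary Spark work.
# All connection values are placeholders; option names may vary by version.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("snowflake-to-spark").getOrCreate()

sf_options = {
    "sfURL": "myaccount.snowflakecomputing.com",   # placeholder account URL
    "sfUser": "ANALYTICS_SVC",
    "sfPassword": "********",                      # use a secret scope in practice
    "sfDatabase": "SALES",
    "sfSchema": "PUBLIC",
    "sfWarehouse": "ANALYTICS_WH",
}

orders = (spark.read
          .format("snowflake")          # short name registered by the connector
          .options(**sf_options)
          .option("dbtable", "ORDERS")
          .load())

# From here the data is a plain Spark DataFrame, ready for feature
# engineering or model training on the Databricks side.
orders.groupBy("STATUS").count().show()
```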
Big Data Architectural Patterns and Best Practices on AWS (Amazon Web Services)
In this session, we simplify big data processing as a data bus comprising various stages: ingest, store, process, and visualize. John Pignata, AWS Startup Solutions Architect, will discuss how to choose the right technology in each stage based on criteria such as data structure, query latency, cost, request rate, item size, data volume, durability, and so on. He will provide reference architecture, design patterns, and best practices for assembling these technologies to solve your big data problems at the right cost.
In this webinar you'll learn how to quickly and easily improve your business using Snowflake and Matillion ETL for Snowflake. Webinar presented by Solution Architects Craig Collier (Snowflake) and Kalyan Arangam (Matillion).
In this webinar:
- Learn to optimize Snowflake and leverage Matillion ETL for Snowflake
- Discover tips and tricks to improve performance
- Get invaluable insights from data warehousing pros
IBM Cloud Object Storage: How it works and typical use cases (Tony Pearson)
This session covers the general concepts of object storage and in particular the IBM Cloud Object Storage offerings. Presented at IBM TechU in Johannesburg, South Africa September 2019
High-speed Database Throughput Using Apache Arrow Flight SQL (ScyllaDB)
Flight SQL is a revolutionary new open database protocol designed for modern architectures. Key features in Flight SQL include a columnar-oriented design and native support for parallel processing of data partitions. This talk will go over how these new features can push SQL query throughput beyond existing standards such as ODBC.
Making Apache Spark Better with Delta Lake (Databricks)
Delta Lake is an open-source storage layer that brings reliability to data lakes. Delta Lake offers ACID transactions and scalable metadata handling, and unifies streaming and batch data processing. It runs on top of your existing data lake and is fully compatible with Apache Spark APIs.
In this talk, we will cover:
* What data quality problems Delta helps address
* How to convert your existing application to Delta Lake
* How the Delta Lake transaction protocol works internally
* The Delta Lake roadmap for the next few releases
* How to get involved!
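As a minimal sketch of the first two talk points, the snippet below writes a small Delta table and shows the in-place Parquet conversion path. It assumes Spark is launched with the delta-spark package and its SQL extension configured; paths and data are invented for illustration.

```python
# Sketch: writing a Delta table and (optionally) converting an existing
# Parquet directory in place. Requires the delta-spark package on the classpath.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("delta-demo")
         .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
         .config("spark.sql.catalog.spark_catalog",
                 "org.apache.spark.sql.delta.catalog.DeltaCatalog")
         .getOrCreate())

events = spark.range(0, 1_000).withColumnRenamed("id", "event_id")

# ACID write: readers never see a partially written table.
events.write.format("delta").mode("overwrite").save("/tmp/events_delta")

# The same table serves batch queries and streaming reads.
print(spark.read.format("delta").load("/tmp/events_delta").count())

# Converting an existing Parquet directory in place (illustrative path):
# spark.sql("CONVERT TO DELTA parquet.`/tmp/existing_parquet_dir`")
```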
AWS Storage and Database Architecture Best Practices (DAT203) | AWS re:Invent... (Amazon Web Services)
Learn about architecture best practices for combining AWS storage and database technologies. We outline AWS storage options (Amazon EBS, Amazon EC2 Instance Storage, Amazon S3 and Amazon Glacier) along with AWS database options including Amazon ElastiCache (in-memory data store), Amazon RDS (SQL database), Amazon DynamoDB (NoSQL database), Amazon CloudSearch (search), Amazon EMR (Hadoop) and Amazon Redshift (data warehouse). Then we discuss how to architect your database tier by using the right database and storage technologies to achieve the required functionality, performance, availability, and durability, at the right cost.
Quick iteration and reusability of metric calculations for powerful data exploration.
At Looker, we want to make it easier for data analysts to service the needs of the data-hungry users in their organizations. We believe too much of their time is spent responding to ad hoc data requests and not enough time is spent building, experimenting, and embellishing a robust model of the business. Worse yet, business users are starving for data, but are forced to make important decisions without access to data that could guide them in the right direction. Looker addresses both of these problems with a YAML-based modeling language called LookML.
This paper walks through a number of data modeling examples, demonstrating how to use LookML to generate, alter, and update reports—without the need to rewrite any SQL. With LookML, you build your business logic, defining your important metrics once and then reusing them throughout a model—allowing quick, rapid iteration of data exploration, while also ensuring the accuracy of the SQL that’s generated. Small updates are quick and can be made immediately available to business users to manipulate, iterate, and transform in any way they see fit.
Serverless Analytics with Amazon Redshift Spectrum, AWS Glue, and Amazon Quic... (Amazon Web Services)
Learning Objectives:
- Understand how to build a serverless big data solution quickly and easily
- Learn how to discover and prepare all your data for analytics
- Learn how to query and visualize analytics on all your data to create actionable insights
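A hedged boto3 sketch of those objectives: catalog raw S3 data with an existing Glue crawler, then query it in place. Athena is used here only as a convenient stand-in for the serverless query layer (the webinar itself pairs Glue with Redshift Spectrum and QuickSight); the crawler, database, table, and bucket names are all placeholders.

```python
# Sketch: cataloging raw S3 data with a Glue crawler and querying it serverlessly.
import time
import boto3

glue = boto3.client("glue", region_name="us-east-1")
athena = boto3.client("athena", region_name="us-east-1")

# 1. Discover and prepare: run an existing crawler that catalogs s3://my-raw-data/
glue.start_crawler(Name="raw_clickstream_crawler")

# 2. Query in place once the table is cataloged.
run = athena.start_query_execution(
    QueryString="SELECT status, COUNT(*) AS hits FROM clickstream GROUP BY status",
    QueryExecutionContext={"Database": "analytics_db"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)
query_id = run["QueryExecutionId"]

while True:
    state = athena.get_query_execution(
        QueryExecutionId=query_id)["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(2)

if state == "SUCCEEDED":
    rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
    for row in rows:
        print([col.get("VarCharValue") for col in row["Data"]])
```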
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ... (Databricks)
The Data Lake paradigm is often considered the scalable successor of the more curated Data Warehouse approach when it comes to democratization of data. However, many who went out to build a centralized Data Lake came out with a data swamp of unclear responsibilities, a lack of data ownership, and sub-par data availability.
In this session we take an in-depth look into the Apache Atlas open metadata and governance function.
Open metadata and governance is a moon-shot type of project to create a set of open APIs, types, and interchange protocols to allow all metadata repositories to share and exchange metadata. From this common base, it adds governance, discovery, and access frameworks to automate the collection, management, and use of metadata across an enterprise. The result is an enterprise catalog of data resources that are transparently assessed, governed, and used in order to deliver maximum value to the enterprise.
Apache Atlas is the reference implementation of the Open Metadata and Governance standards and framework (https://cwiki.apache.org/confluence/display/ATLAS/Open+Metadata+and+Governance). This function will enable an Apache Atlas server to synchronize and query metadata from any open metadata-compliant metadata repository.
In this session we will cover how Open Metadata and Governance works. This includes: (1) the key components in Atlas, (2) the different integration patterns and APIs that vendors can use to integrate their technology into the open metadata ecosystem, and (3) how common metadata use cases such as searching for data sets, managing security (through Atlas/Ranger integration), and automated metadata discovery work in the active ecosystem.
Speaker
Mandy Chessell, Distinguished Engineer, IBM
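As one hedged illustration of the "searching for data sets" use case, the snippet below calls the Atlas v2 basic-search REST endpoint. Host, credentials, and the hive_table type are placeholders, and the endpoint shape should be verified against the Atlas version you run.

```python
# Sketch: searching the Apache Atlas metadata catalog for data sets
# through the v2 basic-search REST API. All values are placeholders.
import requests

ATLAS = "http://atlas-host:21000"          # default Atlas port
AUTH = ("admin", "admin")                  # placeholder credentials

resp = requests.get(
    f"{ATLAS}/api/atlas/v2/search/basic",
    params={"typeName": "hive_table", "query": "customer"},
    auth=AUTH,
    timeout=30,
)
resp.raise_for_status()

# Each result is an entity header: type, display name, and GUID.
for entity in resp.json().get("entities", []):
    print(entity.get("typeName"), entity.get("displayText"), entity.get("guid"))
```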
Manufacturers have an abundance of data, whether from connected sensors, plant systems, manufacturing systems, claims systems, or external industry and government sources. They face mounting challenges, from continually improving product quality and reducing warranty and recall costs to using their supply chain efficiently. For example, giving the manufacturer a complete view of product and customer information (integrating manufacturing and plant-floor data with as-built product configurations and sensor data from customer use) so it can analyze warranty claims efficiently, reduce detection-to-correction time, detect fraud, and even get ahead of issues requires a capable enterprise data hub that integrates large volumes of both structured and unstructured information. Learn how an enterprise data hub built on Hadoop provides the tools to support analysis at every level of the manufacturing organization.
Watch full webinar here: https://bit.ly/2N1Ndz9
How is a logical data fabric different from a physical data fabric? What are the advantages of one type of fabric over the other? Attend this session to firm up your understanding of a logical data fabric.
A Deep Dive into Stateful Stream Processing in Structured Streaming with Tath... (Databricks)
Stateful processing is one of the most challenging aspects of distributed, fault-tolerant stream processing. The DataFrame APIs in Structured Streaming make it very easy for the developer to express their stateful logic, either implicitly (streaming aggregations) or explicitly (mapGroupsWithState). However, there are a number of moving parts under the hood that make all the magic possible. In this talk, I am going to dive deeper into how stateful processing works in Structured Streaming.
In particular, I’m going to discuss the following.
• Different stateful operations in Structured Streaming
• How state data is stored in a distributed, fault-tolerant manner using State Stores
• How you can write custom State Stores for saving state to external storage systems.
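A minimal sketch of the first bullet, an implicitly stateful operation: a windowed streaming count whose intermediate results are kept in the State Store between micro-batches. The rate source and checkpoint path are stand-ins for a real event stream.

```python
# Sketch: an implicitly stateful streaming aggregation in Structured Streaming.
# Windowed counts live in the State Store between micro-batches; the watermark
# tells Spark when old state can be dropped.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, window

spark = SparkSession.builder.appName("stateful-streaming").getOrCreate()

events = (spark.readStream
          .format("rate")
          .option("rowsPerSecond", 10)
          .load()
          .withColumnRenamed("timestamp", "event_time"))

counts = (events
          .withWatermark("event_time", "10 minutes")        # bounds state size
          .groupBy(window(col("event_time"), "1 minute"))
          .count())

query = (counts.writeStream
         .outputMode("update")
         .format("console")
         .option("checkpointLocation", "/tmp/checkpoints/stateful-demo")
         .start())

query.awaitTermination(60)   # run briefly for the sketch
```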
Data Quality Patterns in the Cloud with Azure Data Factory (Mark Kromer)
This is my slide presentation from Pragmatic Works' Azure Data Week 2019: Data Quality Patterns in the Cloud with Azure Data Factory using Mapping Data Flows
The Heart of the Data Mesh Beats in Real-Time with Apache Kafka (Kai Wähner)
If there were a buzzword of the hour, it would certainly be "data mesh"! This new architectural paradigm unlocks analytic data at scale and enables rapid access to an ever-growing number of distributed domain datasets for various usage scenarios.
As such, the data mesh addresses the most common weaknesses of the traditional centralized data lake or data platform architecture. And the heart of a data mesh infrastructure must be real-time, decoupled, reliable, and scalable.
This presentation explores how Apache Kafka, as an open and scalable decentralized real-time platform, can be the basis of a data mesh infrastructure and - complemented by many other data platforms like a data warehouse, data lake, and lakehouse - solve real business problems.
There is no silver bullet or single technology/product/cloud service for implementing a data mesh. The key outcome of a data mesh architecture is the ability to build data products, with the right tool for the job.
A good data mesh combines data streaming technology like Apache Kafka or Confluent Cloud with cloud-native data warehouse and data lake architectures from Snowflake, Databricks, Google BigQuery, et al.
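A small, hedged sketch of what a domain team's contribution to such a mesh can look like on the streaming side: publishing keyed order events to a Kafka topic that other domains consume as a data product. Broker address, topic name, and event schema are invented; kafka-python is used only because it keeps the example short.

```python
# Sketch: a domain team publishing order events as a self-serve data product.
# A serious deployment would add a schema registry and access controls.
import json
from kafka import KafkaProducer   # pip install kafka-python

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    key_serializer=lambda k: k.encode("utf-8"),
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

event = {"order_id": "o-1001", "customer_id": "c-42", "amount": 99.50, "status": "CREATED"}

# Keyed by customer so all events for one customer stay ordered within a partition.
producer.send("orders.order-events.v1", key=event["customer_id"], value=event)
producer.flush()
```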
In this session we will explore the world’s first cloud-scale file system and its targeted use cases. Session attendees will learn about EFS’s benefits, how to identify applications that are appropriate for use with EFS, and details about its performance and security models. The target audience is file system administrators, application developers, and application owners that operate or build file-based applications.
Amazon FreeRTOS: IoT Operating System for Microcontrollers (IOT208-R1) - AWS ... (Amazon Web Services)
In this presentation, we take a deeper look at Amazon FreeRTOS. As OEMs work to squeeze more functionality onto cheaper and smaller IoT devices, they face a series of challenges in development and operations that result in security vulnerabilities, inefficient code, compatibility issues, and unclear licensing. With Amazon FreeRTOS, it is now easier to build, deploy, and update connected microcontroller-based devices quickly and economically, while retaining confidence that the devices are secure. Also, learn how Pentair, a leading water treatment company, is developing an IoT solution with the help of Amazon FreeRTOS and Espressif Systems, a hardware partner.
How companies are managing growth, gaining insights and cutting costs in the era of big data (Virginia Fernandez)
6 reasons to upgrade your database:
Reason 1: Lower total cost of ownership
Reason 2: A platform for rapid reporting and analytics
Reason 3: Increased scalability and availability
Reason 4: Support for new and emerging applications
Reason 5: Flexibility for hybrid environments
Reason 6: Greater simplicity
Maximizing Oil and Gas (Data) Asset Utilization with a Logical Data Fabric (A... (Denodo)
Watch full webinar here: https://bit.ly/3g9PlQP
It is no news that Oil and Gas companies face immense pressure to stay competitive, especially in the current climate, while striving to become data-driven so they can scale and gain greater operational efficiency across the organization.
Hence the need for a logical data layer that helps Oil and Gas businesses move toward a unified, secure, and governed environment, optimize the potential of data assets across the enterprise, and deliver real-time insights.
Tune in to this on-demand webinar where you will:
- Discover the role of data fabrics and Industry 4.0 in enabling smart fields
- Understand how to connect data assets and the associated value chain to high impact domain areas
- See examples of organizations accelerating time-to-value and reducing NPT
- Learn best practices for handling real-time/streaming/IoT data for analytical and operational use cases
Application Consolidation and Retirement (IBM Analytics)
Originally Published: Feb 04, 2015
Multiple, disconnected systems or an outdated application infrastructure can negatively impact your business and increase your costs. Consolidating applications, retiring outdated databases and modernizing systems can streamline your infrastructure and free resources to focus on important new projects.
ATAGTR2017 Performance Testing and Non-Functional Testing Strategy for Big Da... (Agile Testing Alliance)
The presentation on Performance Testing and Non-Functional Testing Strategy for Big Data Applications was delivered at #ATAGTR2017, one of the largest global testing conferences. All copyright belongs to the author.
Author and presenter : Abhinav Gupta
KASHTECH AND DENODO: ROI and Economic Value of Data Virtualization (Denodo)
Watch full webinar here: https://bit.ly/3sumuL5
Join KashTech and Denodo to discover how Data Virtualization can help accelerate your time-to-value from data while reducing the costs at the same time.
Gartner has predicted that organizations using Data Virtualization will spend 40% less on data integration than those using traditional technologies. Denodo customers have experienced time-to-deliver improvements of up to 90% within their data provisioning processes and cost savings of 50% or more. As Rod Tidwell (Cuba Gooding Jr.) said in the movie 'Jerry Maguire', "Show me the money!"
Register to attend and learn how Data Virtualization can:
- Accelerate the delivery of data to users
- Drive digital transformation initiatives
- Reduce project costs and timelines
- Quickly deliver value to your organization
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris... (DATAVERSITY)
Thirty years is a long time for a technology foundation to be as active as relational databases. Are their replacements here?
In this webinar, we look at this foundational technology for modern Data Management and show how it evolved to meet the workloads of today, as well as when other platforms make sense for enterprise data.
Analytics in a day is designed to simplify and accelerate your journey towards using a modern data warehouse to power your business. Attend this one-day, hands-on workshop to get started with your very own cloud analytics today.
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat... (Data Con LA)
Curtis ODell, Global Director Data Integrity at Tricentis
Join me to learn about a new end-to-end data testing approach designed for modern data pipelines that fills dangerous gaps left by traditional data management tools—one designed to handle structured and unstructured data from any source. You'll hear how you can use unique automation technology to reach up to 90 percent test coverage rates and deliver trustworthy analytical and operational data at scale. Several real world use cases from major banks/finance, insurance, health analytics, and Snowflake examples will be presented.
Key Learning Objective
1. Data journeys are complex, and you have to ensure the integrity of the data end to end across this journey, from source to final reporting, for compliance.
2. Data management tools do not test data; at best they profile and monitor, leaving serious gaps in your data testing coverage.
3. Automation, integrated with DevOps and DataOps CI/CD processes, is key to solving this (see the sketch after this list).
4. How this approach has an impact in your vertical.
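A minimal sketch of objective 3: data checks expressed as code so they can run automatically in a CI/CD pipeline after every load. This is not the Tricentis tooling the talk describes; the claims table and rules are invented, and SQLite stands in for the target warehouse.

```python
# Sketch: automated end-to-end data checks run after a pipeline load.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE claims (claim_id TEXT, claim_date TEXT, amount REAL);
    INSERT INTO claims VALUES ('c1', '2024-01-10', 120.0),
                              ('c2', '2024-02-03', -5.0),   -- should fail a rule
                              (NULL, '2024-02-04', 40.0);   -- should fail a rule
""")

RULES = {
    "claim_id is never null": "SELECT COUNT(*) FROM claims WHERE claim_id IS NULL",
    "claim_date not in the future": "SELECT COUNT(*) FROM claims WHERE claim_date > DATE('now')",
    "amount is positive": "SELECT COUNT(*) FROM claims WHERE amount <= 0",
}

failures = {name: conn.execute(sql).fetchone()[0] for name, sql in RULES.items()}
failures = {name: bad for name, bad in failures.items() if bad}

if failures:
    # In CI/CD this would fail the build or block promotion of the data set.
    raise AssertionError(f"Data quality checks failed: {failures}")
```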
Data Ninja Webinar Series: Realizing the Promise of Data Lakes (Denodo)
Watch the full webinar: Data Ninja Webinar Series by Denodo: https://goo.gl/QDVCjV
The expanding volume and variety of data originating from sources that are both internal and external to the enterprise are challenging businesses in harnessing their big data for actionable insights. In their attempts to overcome big data challenges, organizations are exploring data lakes as consolidated repositories of massive volumes of raw, detailed data of various types and formats. But creating a physical data lake presents its own hurdles.
Attend this session to learn how to effectively manage data lakes for improved agility in data access and enhanced governance.
This is session 5 of the Data Ninja Webinar Series organized by Denodo. If you want to learn more about some of the solutions enabled by data virtualization, click here to watch the entire series: https://goo.gl/8XFd1O
Systems Management 2.0: How to Gain Control of Unruly & Distributed Networks (Kaseya)
You’d think that your networks are impossible to manage... But, we’ve seen worse.
We live in a new and ever-changing world of IT. This year has brought many advances in technology; however, these new benefits have also created a plethora of challenges for you as the IT professional:
- All of your organization's devices are no longer safely under the same network, making it difficult to manage devices inside and outside the firewall
- You still rely on siloed solutions, which hamper your efforts to collaborate and treat all devices equally
- Your users have drastically evolved, increasing the need for 100% uptime and secure access regardless of their location or device
Join Jim Frey, Vice President of Research Network Management for analyst firm Enterprise Management Associates (EMA), on September 12th at 11am PDT and discover:
- The diversity and complexity in IT: the big picture
- Cross-team collaboration and the drive to service orientation in IT operations
- Integration and convergence across management tools, technologies, and practices
- Unifying infrastructure management: objectives and requirements for success
This is the sales enablement presentation for Information Lifecycle Management Solutions, covering InfoSphere Optim.
- Challenges managing the lifecycle of application data
- What's at stake
- Leveraging an information governance approach
- Optimizing the data lifecycle: Discover & Define, Develop & Test, Optimize & Archive, Consolidate & Retire
- IBM InfoSphere solutions for data lifecycle management
This is the information supply chain slide, which many of you may be familiar with from the prior sessions. InfoSphere Optim is represented under "Lifecycle" in the red box.
Additional notes explaining the slide: There are typically hundreds or even thousands of different systems throughout an organization. Information can come in from many places (transaction systems, operational systems, document repositories, external information sources) and in many formats (data, content, streaming). Wherever it comes from, there are often meaningful relationships between the various sources of data. We manage all this information in our systems, integrate it to build warehouses, master the data to get single views, and analyze it to make business decisions. This is a supply chain of information flowing throughout the organization. Unlike a traditional supply chain, an information supply chain is a many-to-many relationship: the same data about a person can come from many places (he may be a customer, an employee, and a partner), and that information can end up in many reports and applications. Different systems may also define the information differently. This makes integrating information, ensuring its quality, and interpreting it correctly crucial to using it to make better decisions. Information must be turned into a trusted asset and governed to maintain its quality over its lifecycle. The underlying systems must be cost-effective, easy to maintain, and able to perform well for the workloads they handle, even as information continues to grow at astronomical rates. These are the needs that have driven IBM's strategy and investments in this area.
InfoSphere is your platform for managing trusted information that is comprehensive, integrated, and intelligent. InfoSphere Optim is part of the InfoSphere platform, under Information Lifecycle Management.
To stay competitive, organizations look to improve business processes and better manage their information. But there are roadblocks along the way. So, what are some of the challenges when managing this lifecycle?
New application functionality to meet business needs is not deployed on schedule. Organizations are often challenged by application releases; I like to say that software development and the airline industry have a lot in common: they are always on time, until they are late. One reason for these delays is the creation and management of test and development environments. Often this is done by simply cloning production data to create test data, but depending upon the size of production, this method can impede progress. How long does it take to create those environments? Refresh them? Are developers waiting around? Additional challenges include:
- No understanding of the relationships between data objects, which delays projects
- Greater data volumes that take longer to clone, test, validate, and deploy, which equates to longer test cycles
- Inability to replicate production conditions in test
- Disclosure of confidential data kept in test and development environments
And if you're using production data to create the test and development environments, how are you keeping track of sensitive data? Are you in compliance with industry regulations? Does the developer upgrading the HR/payroll system *really* need to see everyone's salary information? Or can that person use realistic, but fictional, data to complete the work?
Application defects or database errors are discovered after deployment. Are you able to easily validate test data to ensure application errors are caught in the test/dev/QA process? Or is much of the test and development time spent sifting through and fixing the data instead of the application? Costs to resolve defects in production can be 10 to 100 times greater than those caught in the development environment.
Inability to meet SLAs for responsiveness and availability, plus increased operational and infrastructure costs, impact the IT budget. As data volumes increase year after year, compounded by the cloning needed for test and development, do you have enough disk storage, or enough database licenses, to create the needed non-production environments? How does this impact your IT staff resources? How does this impact system uptime for your business users? Cloning databases requires more storage hardware; larger databases impact staff productivity and lead to additional license costs; load simulators and complex database test scripts require highly skilled staff.
As this quote from Forrester indicates, many organizations today are keeping a lot of infrequently used data in production databases, and this can have a negative impact on both performance and IT costs, as we'll see when we discuss the challenges of managing growing amounts of data.
Most large enterprises have petabytes of data stored in various data repositories across the organization, and this is likely to grow exponentially in the coming years. As a result, enterprises store more data every year in production systems. With these increasing data volumes come increasing costs, as well as increasing challenges in securing and managing online data to deliver high performance and availability. Forrester's November 2010 Global Database Management Systems Online Survey evidenced this trend: respondents indicated that the top five challenges they face are delivering improved performance, delivering higher availability, dealing with high data volume growth, increasing data management costs, and database upgrades.
Some organizations take a "reactive" approach to managing the lifecycle of their data, taking action only after the "headaches" begin to impact application efficiency. From a recent IBM survey, here are some key areas that our clients identified as catalysts for their search for a better data lifecycle management approach:
- High capital expenditures: In a "reactive mode," the solution was often to add more high-performance hardware to ensure there was enough storage and to improve the declining performance of the existing infrastructure.
- Decreased productivity: Poor application performance impacted the business users' ability to perform daily tasks, batch processes were creeping into working hours, and IT staff was frantically trying to tune databases or add more storage in response. And this led to...
- Missed service level agreements: which can impact revenue and customer satisfaction if databases and applications are not responding as they should.
- Ad hoc performance management: Without a clear strategy in place to manage and optimize application performance, inefficient ad hoc fixes were leveraged, draining IT resources.
As organizations strive to stay ahead of the competition, a more "proactive" approach to data lifecycle management is needed to ensure the data is accessible and trusted.
Let's take a closer look at the requirements for managing the data lifecycle across the information supply chain. There are four main areas organizations should focus on for streamlining management of the data lifecycle and making applications more efficient:
- Discover & Define: Understanding where data resides, what domains of information exist, and how data is related across the enterprise, and defining the policies and standards for managing it.
- Develop & Test: Creating the database structures and reusable database code to enhance productivity and team collaboration, efficiently creating the test and development environments (and protecting sensitive data within them), and leveraging actual production workloads through database workload capture and replay.
- Optimize & Archive: Ensuring optimal application performance, archiving historical data to manage data growth, and ensuring business users have effective access to the data, both production and archived.
- Consolidate & Retire: Rationalizing the application portfolio, and consolidating and decommissioning applications that are redundant or no longer align with current IT technology, while maintaining access to the data per data retention rules long after the application has been retired.
DETAIL: The core products for each stack are as follows:
- Discover & Define: InfoSphere Discovery, InfoSphere Business Glossary, InfoSphere Data Architect
- Develop & Test: InfoSphere Optim Development Studio, InfoSphere Optim pureQuery, InfoSphere Optim Test Data Management Solutions, InfoSphere Optim Privacy (Masking) Solutions
- Optimize, Archive & Access: InfoSphere Optim Performance Manager, InfoSphere Optim pureQuery, InfoSphere Optim Data Growth Solution
- Consolidate & Retire: InfoSphere Optim Application Consolidation (including ERP systems)
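To make the Optimize & Archive phase concrete, here is a minimal sketch of the core move, archive then trim, expressed in plain SQL against SQLite rather than with InfoSphere Optim. The orders table, the CLOSED status filter, and the seven-year retention threshold are illustrative assumptions.

```python
# Sketch: move closed transactions older than a retention threshold into an
# archive table, then remove them from the production table, in one transaction.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders         (order_id TEXT, status TEXT, closed_date TEXT, amount REAL);
    CREATE TABLE orders_archive (order_id TEXT, status TEXT, closed_date TEXT, amount REAL);
    INSERT INTO orders VALUES ('o1', 'CLOSED', '2012-06-01', 10.0),
                              ('o2', 'CLOSED', '2024-06-01', 20.0),
                              ('o3', 'OPEN',   '2012-06-01', 30.0);
""")

cutoff = "DATE('now', '-7 years')"

with conn:  # single transaction: archive and delete succeed or fail together
    conn.execute(f"""
        INSERT INTO orders_archive
        SELECT * FROM orders WHERE status = 'CLOSED' AND closed_date < {cutoff}
    """)
    conn.execute(f"DELETE FROM orders WHERE status = 'CLOSED' AND closed_date < {cutoff}")

print(conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0],          # 2 remain
      conn.execute("SELECT COUNT(*) FROM orders_archive").fetchone()[0])  # 1 archived
```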
In order to define a governance strategy and a process to achieve your organization's goal, you first have to understand what you have. Without this, you cannot create an effective plan that will support your organization. This process begins with understanding the web of information represented in your enterprise applications and databases. You must understand:
- where the data exists and what data elements there are
- what complex relationships exist within and across sources
- where the historical and reference data is, for archiving
- what test data is needed to satisfy the test cases
- where sensitive data is located
Many organizations rely on documentation (which is often out of date) or on system and application experts for this information. Sometimes this information is built into application logic, and the hidden relationships enforced behind the scenes are not apparent to anyone. It's all about time, cost, and risk: trying to understand this information manually (or using the "spot check" approach) can lead you down the wrong path, resulting in many lost hours in the future and potentially delaying project deployment.
More information for the speaker. The solutions support the process by which we locate and understand the data relationships:
- Locate and inventory the databases across the enterprise. Again, you can't govern data if you don't know where it resides, so ensure your solution can help you discover and document the data entities and the databases that reside in the enterprise.
- Define business objects* across heterogeneous databases and applications. Understand how data is related across the enterprise to better deploy new functionality and ensure that the complete business object is captured when archiving data.
- Define enterprise-standard data models. For example, set up your data model to estimate database growth capacity and determine when to archive historical data.
- Understand transformation rules to discover data relationships. For example, if you were ever to retire an application, you need to understand the underlying business logic to ensure you capture the related data so that your archived files make sense. (See the example of this in slide 15.)
- Understand the relationships required for identifying sensitive data, whether simple, embedded, or compound. How is sensitive data related to other areas across the enterprise? Ensure it's protected everywhere, consistently.
- Define and document the privacy and masking rules and propagate them to ensure sensitive data will be protected. How is that data going to be used? Who should have access to it and why? And as you mask sensitive data in one table, how do you ensure all related data elements are masked with the same information, keeping the referential integrity of the test data?
- Leverage the unified schema builder to create prototypes before deployment. When you think about managing data across its lifecycle, at some point you may need to retire applications and consolidate the data. By pre-testing the data that needs to be consolidated, you can ensure developers can update and/or deploy applications or new functionality with confidence.
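As a hedged illustration of the "where is sensitive data located" step, the sketch below does a naive pattern scan over sampled column values. Real discovery tooling also uses metadata, classifiers, and relationship analysis; the patterns and sample table here are invented.

```python
# Sketch: scan sampled rows of a table for values that look sensitive.
import re
import sqlite3

PATTERNS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def scan_table(conn, table, sample_rows=100):
    """Return {column: {pattern names seen}} for a sampled slice of the table."""
    cur = conn.execute(f"SELECT * FROM {table} LIMIT {sample_rows}")
    columns = [d[0] for d in cur.description]
    hits = {}
    for row in cur:
        for col_name, value in zip(columns, row):
            if not isinstance(value, str):
                continue
            for label, pattern in PATTERNS.items():
                if pattern.search(value):
                    hits.setdefault(col_name, set()).add(label)
    return hits

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, contact TEXT, notes TEXT)")
conn.execute("INSERT INTO customers VALUES ('Ann', 'ann@example.com', 'SSN 123-45-6789 on file')")
print(scan_table(conn, "customers"))   # {'contact': {'email'}, 'notes': {'us_ssn'}}
```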
What is a "complete business object," and why is it important to capture one? Business objects represent the fundamental building blocks of your application data records. From a business perspective, a business object could be a payment, an invoice, a paycheck, or a customer record. From a database perspective, a business object represents a group of related rows from related tables across one or more applications, together with its related "metadata" (information about the structure of the database and about the data itself). Data lifecycle management solutions that capture and process the complete business record thus create a valid snapshot of your business activity at the time the transaction took place, an "historical reference snapshot." For example, when you archive a complete business object, you create a standalone repository of transaction history. If you are ever asked to provide proof of your business activity (for example, if you receive an audit or e-discovery request), your archive represents a "single version of the truth." You can simply query the archive to locate information or generate reports. Another example in which a complete business object is important is test data creation: identifying complete business objects lets organizations create right-sized, referentially intact test environments. Federated object support means the ability to capture a complete business object from multiple related applications, databases, and platforms. For example, Optim can extract a "customer" record from Siebel, together with related detail on purchased items from a legacy DB2 inventory management system. Federated data capture ensures that your data management operations accurately reflect a complete, end-to-end business process. Only Optim provides federated object support.
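A minimal sketch of capturing a complete business object with referential integrity intact, a customer plus its dependent orders and payments. A real tool walks relationships it has discovered across applications; here the schema and relationships are hard-coded for illustration.

```python
# Sketch: extract one customer and every related row as a single snapshot.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE orders    (order_id TEXT PRIMARY KEY, customer_id TEXT, total REAL);
    CREATE TABLE payments  (payment_id TEXT PRIMARY KEY, order_id TEXT, amount REAL);
    INSERT INTO customers VALUES ('c-42', 'Acme Ltd');
    INSERT INTO orders    VALUES ('o-1', 'c-42', 100.0), ('o-2', 'c-42', 50.0);
    INSERT INTO payments  VALUES ('p-1', 'o-1', 100.0), ('p-2', 'o-2', 50.0);
""")

def extract_business_object(conn, customer_id):
    """Capture the customer row together with all dependent rows."""
    customers = conn.execute(
        "SELECT * FROM customers WHERE customer_id = ?", (customer_id,)).fetchall()
    orders = conn.execute(
        "SELECT * FROM orders WHERE customer_id = ?", (customer_id,)).fetchall()
    payments = conn.execute("""
        SELECT p.* FROM payments p
        JOIN orders o ON o.order_id = p.order_id
        WHERE o.customer_id = ?""", (customer_id,)).fetchall()
    return {"customers": customers, "orders": orders, "payments": payments}

snapshot = extract_business_object(conn, "c-42")
print({table: len(rows) for table, rows in snapshot.items()})
# {'customers': 1, 'orders': 2, 'payments': 2}
```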
Let's take a closer look at the requirements needed to manage the data lifecycle across the information supply chain. There are four main areas organizations should focus on for streamlining management of the data lifecycle:
Discover & Define: Understanding where data resides, what domains of information exist and how it's related across the enterprise, and defining the policies and standards for managing it.
Develop & Test: Creating the database structures and reusable database code to enhance productivity and team collaboration, efficiently creating the test and development environments (and protecting sensitive data within them), and leveraging actual production workloads through database workload capture and replay.
Optimize & Archive: Ensuring optimal application performance, archiving historical data to manage data growth, and ensuring business users have effective access to the data - both production and archived.
Consolidate & Retire: Rationalizing the application portfolio, consolidating and decommissioning applications that are redundant or no longer align with current IT technology - while maintaining access to the data per data retention rules, long after the application has been retired.
DETAIL
The core products for each stack are as follows:
Discover & Define: InfoSphere Discovery, InfoSphere Business Glossary, InfoSphere Data Architect
Develop & Test: InfoSphere Optim Development Studio, InfoSphere Optim pureQuery, InfoSphere Optim Test Data Management Solutions, InfoSphere Optim Privacy (Masking) Solutions
Optimize, Archive & Access: InfoSphere Optim Performance Manager, InfoSphere Optim pureQuery, InfoSphere Optim Data Growth Solution
Consolidate & Retire: InfoSphere Optim Application Consolidation (including ERP systems)
Organizations continue to be challenged with building and delivering quality applications. Costs spiral upward as defects are caught late in the cycle, where they are expensive to correct. Organizations also face increasing risks associated with protecting data and complying with regulations. Time to market for these applications is critical to business success, yet long testing cycles and constrained resources often delay on-time delivery of the software; inadequate test environments and a lack of realistic test data are contributing reasons for this challenge.
The challenges are real: $300 billion is the annual cost of software-related downtime. An FAA server used for application development and testing was breached, exposing the personally identifiable information of more than 45,000 employees. 62% of companies use actual customer data to test applications, exposing sensitive information to testers and developers. Testing teams spend 30-50% of their time setting up test environments instead of testing.
Let's look at some specific challenges related to the impact of inefficient test practices and what our customers are saying:
- Creating realistic test data for their testing efforts
- Lack of insight into the data environment, so developers and testers don't understand how to work with the data
- SLAs missed due to lack of communication between development and the DBAs
- Simply cloning entire production databases creates duplicate copies of large test databases
- Data masking requirements are not addressed
Quote references:
1st quote: http://www-01.ibm.com/software/success/cssdb.nsf/CS/JHAL-7ZLTW7?OpenDocument&Site=dmmain&cty=en_us
2nd quote: http://www-01.ibm.com/software/success/cssdb.nsf/CS/LWIS-7E2S6V?OpenDocument&Site=dmmain&cty=en_us
3rd quote: http://www-01.ibm.com/software/success/cssdb.nsf/CS/LWIS-7F66X2?OpenDocument&Site=default&cty=en_us
For generating test data, it's critical to productivity to create "right-sized" subsets for all your testing needs, allowing testers and developers to easily extract, refresh and create properly sized data sets. After running tests, relationally compare result sets from the new data set and from the actual production data to see the exact differences - and only the differences. This can help resolve application defects faster. Part of effective test data management is the ability to protect the sensitive data within these non-production environments. Ensure sensitive test data is masked while maintaining the referential integrity of the data, and ensure the data transformation is appropriate to the context of the application. That is, the results of the data transformation have to make sense to the person reviewing the test results. For example, if an address is needed, you would want to use a street address that actually exists rather than something meaningless like XXXXXX as a street name.
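The "see only the differences" idea can be pictured with a small sketch: given a before and an after result set keyed by a primary key, report rows present on only one side and, for shared rows, only the columns whose values changed. This is an illustrative stand-in, not the Optim compare facility; compare_result_sets and the sample rows are made up for the example.

```python
def compare_result_sets(baseline_rows, new_rows, key="id"):
    """Compare two result sets (lists of dicts keyed by a primary key) and
    return only the differences: rows missing from either side and, for rows
    present in both, just the columns whose values changed."""
    baseline = {row[key]: row for row in baseline_rows}
    new = {row[key]: row for row in new_rows}

    diff = {
        "only_in_baseline": sorted(baseline.keys() - new.keys()),
        "only_in_new": sorted(new.keys() - baseline.keys()),
        "changed": {},
    }
    for k in baseline.keys() & new.keys():
        changed_cols = {
            col: (baseline[k][col], new[k].get(col))
            for col in baseline[k]
            if baseline[k].get(col) != new[k].get(col)
        }
        if changed_cols:
            diff["changed"][k] = changed_cols
    return diff

# Example: only the changed 'status' column for order 7 is reported.
before = [{"id": 7, "status": "OPEN", "total": 120.0}]
after = [{"id": 7, "status": "CLOSED", "total": 120.0}]
print(compare_result_sets(before, after))
```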
With InfoSphere Optim Test Data Management Solution, organizations can make test data management a best practice, helping them reduce costs, lower risk and expedite delivery in three key ways:
- Automate the creation of realistic, "right-sized" test data to reduce the size of test environments
- Mask sensitive information for compliance and protection
- Refresh test data, thereby speeding up testing and application delivery
Key differentiators include:
- Understand what test data is needed for test cases
- Create "right-sized" test data by subsetting
- Ensure masked data is contextually appropriate to the data it replaced, so as not to impede testing
- Enable developers and testers to easily refresh and maintain test environments
- Automate test result comparisons to identify hidden errors
- Support for custom and packaged ERP applications in heterogeneous environments
How will a database configuration change affect the enterprise? What about an application change? With today's complex enterprises, it's not easy to anticipate whether a change will disrupt business operations, and organizations often resort to finger pointing. Change windows are very tight, and most shops squeeze in as many changes as they can during these short blocks of time. This makes it difficult to isolate a particular change as the source of a problem.
Business pains:
- Inability to deliver required functionality to customers
- Missed service level agreements
- Loss of customer satisfaction
- Inability to process transactions, resulting in lost revenue
- Cost of adding additional hardware or software to relieve immediate pain
- Cost of issue remediation
IT pains:
- Time-consuming, labor-intensive process of rolling back changes
- Inability to identify and correct the source of the problem
- Tedious manual process of modifying test scripts
Disruptions:
- Business applications become unavailable
- Business application response time degrades
Costs:
- Business lost - unable to process business transactions
- Opportunity lost - unable to deliver competitive functionality to market
- IT budget lost - extra costs needed to roll back changes and start over
- Revenue drain - as employees wait for the system
- Hardware costs - new hardware needed to solve capacity issues
With more realistic testing scenarios, organizations can:
- Manage lifecycle events such as changes in hardware, workloads, databases or applications efficiently, without production impact
- Develop accurate, streamlined tests to speed product and service delivery
- Skip laborious test script creation and load emulators
- Identify and quickly correct potential problems arising from enterprise changes
- Help ensure optimal SQL performance even as enterprise changes are deployed
- Complement existing regression, functional and performance tests with deeper analysis of the data layer
- Meet service-level agreements (SLAs) for application and database responsiveness and availability
Reduce the cost of lifecycle changes (upgrades, migrations, consolidations, retirements, new application deployment or growth):
- Limit laborious database test script creation by leveraging actual production workloads for testing
- Limit the need for load simulators with the capability to speed up and slow down the replay (see the sketch after these notes)
- Deploy a single, repeatable process for database testing transparently across heterogeneous systems with minimal performance overhead
Lower the risk of lifecycle changes (upgrades, migrations, consolidations, retirements, new application deployment or growth):
- Develop more streamlined, accurate tests by leveraging actual production workloads for testing
- Identify and correct potential problems sooner with validation reports
- Accelerate project delivery with deep diagnostics of potential database problems
- Meet SLAs for availability, reliability and performance
- Ensure a well-tuned, high-performing workload before production deployment with performance tuning
- Extend quality testing efforts to include tailored database testing
- Integrate with existing database tools to get a complete view of production workloads
Using actual production workloads for testing gives you insight into the best way to tune the database, which leads to better SQL performance. Realistic testing gives you real results, which is better than guessing, estimates or rules of thumb. When you know how actual production SQL is going to behave, you can tune the database better and find and resolve problems sooner. At the end of the day, better testing means a better user experience, improved employee productivity, the ability to grow the company with existing IT resources, and a lower total cost of ownership.
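For a feel of what replaying a captured workload at an adjustable speed might look like, here is a minimal sketch. It assumes the capture is just a list of (offset-in-seconds, SQL text) pairs and that the caller supplies an execute function for the test system; replay_workload and that capture format are illustrative assumptions, not the InfoSphere Optim Query Capture and Replay mechanism or file format.

```python
import time

def replay_workload(captured, execute, speed=1.0):
    """Replay a captured workload: 'captured' is a list of
    (offset_seconds, sql_text) pairs recorded against production, 'execute'
    runs one statement against the test system and returns its rows, and
    'speed' scales the original timing (2.0 replays twice as fast, 0.5 at
    half speed)."""
    start = time.monotonic()
    results = []
    for offset, sql in sorted(captured):
        target = offset / speed
        delay = target - (time.monotonic() - start)
        if delay > 0:
            time.sleep(delay)          # keep the original pacing, scaled
        t0 = time.monotonic()
        try:
            rows = execute(sql)
            results.append((sql, time.monotonic() - t0, len(rows), None))
        except Exception as exc:       # record exceptions instead of aborting
            results.append((sql, time.monotonic() - t0, 0, repr(exc)))
    return results
```

Each tuple in the result (statement, elapsed time, rows returned, error) is the kind of per-statement record a before-and-after comparison report would aggregate.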
We return to the four phases of managing the data lifecycle - Discover & Define, Develop & Test, Optimize & Archive, Consolidate & Retire - and their core products, described earlier. We now turn to the Optimize & Archive phase: managing data growth and archiving historical data.
Organizations today are increasingly challenged with successfully managing data growth. They have large volumes of data stored in various data repositories across the organization - and this is likely to grow exponentially in the coming years. As a result of this rampant growth, these companies store more and more data every year in production systems. If your client has enterprise applications such as HR, Finance or Customer Support systems - and they likely have many - then they're dealing with the effects of increasing data volumes. Today's organizations are met with three primary challenges:
Increasing costs: As the volumes of data increase, the "natural" response is to simply buy more storage for the enterprise application. After all, "storage is cheap," right? However, while the acquisition of storage hardware is cheap, the operational costs associated with it are often underestimated. Said differently, for every $1 spent on storage, organizations spend $4 on the operational elements of managing that stored data.
Poor application performance: Over time, as data volumes increase, application performance becomes an issue. What often happens is that clients "self-diagnose" this performance issue as simply a hardware problem ("let's buy faster systems"), or they task their DBAs to tune and re-tune the database. However, without properly managing the data volumes, companies will see the impact on the performance of their systems over time. This is particularly a problem when application performance affects employee productivity.
Risks associated with data retention and compliance: The "keep everything" approach in production systems also creates risk around accessing this data over the long term. How can companies safely store data that is no longer needed in production systems but must be retained per data retention policies and for compliance purposes? How can organizations store this data so that it's audit-ready, easily accessible, and available for any e-discovery requests?
And the challenges are real:
- The cost of *managing* storage can be 3-10 times the cost of procuring it, and IDC estimated that last year organizations spent about $1.1 billion on storage costs.
- Ask the DBA how much time they're spending each week on hardware capacity-related performance issues - it can be up to 80% of their time.
- One InfoSphere Optim client had about 19,000 batch processes that took 250 hours to run - that's more than 10 days. Archiving helped them reduce that time by over 75%.
- Talk to your clients about how long they're keeping data in their production systems. How long are they *supposed* to keep it at all? At least 50% of companies are keeping data for 7 or more years, and 57% of companies are leveraging their backup copies for "data retention." That's a lot of data to sift through if you have an e-discovery request.
Let's first define what we mean by archiving. It should be an intelligent process for MOVING inactive or infrequently accessed data that still has value (e.g. for data retention needs). [Recall Forrester's statistic of 85% inactive data!] In addition, the archive process should provide the ability to search and retrieve the data in the way functional users need to consume it. This is a typical example of a production environment prior to archiving: initially, both active and inactive data is stored in the production environment, taking up most of the space on the production server. Safely move the inactive or historical data to an archive, capturing the complete business object for application-independent access. This data can then be stored in a variety of environments and easily retrieved to an application environment when additional business processing is required. You then have universal access to this data through multiple methods, including report writers such as Cognos and Crystal Reports, XML, ODBC/JDBC, and application-based access (Oracle, Siebel, etc.).
Returning once more to the four lifecycle phases introduced earlier, we now move to the final one: Consolidate & Retire - rationalizing the application portfolio while maintaining access to retained data long after the applications themselves are gone.
We've come full circle now, talking about the requirements for your data and applications, how to deploy and optimize those applications, and how to access the data effectively. Now we come to the retirement phase - or to put it another way, application rationalization. Looking around the data center, there are likely one or two systems - or more - that are merely "kept alive" because there might be important data stored within. But how will you access it? Have you really looked at what's in that data? How old is the application? Is the system still supported? Does anyone know what it does? Is it redundant to more current systems? How much is this older application costing - licenses, power consumption, extended support agreements, data center footprint, etc.? If the solution is to consolidate or retire the application, the next big concern is what to do with the data. We don't want to move it all into the consolidated application: that would cause performance issues, the data may not all be appropriate in the new context, and it could grow our data volumes too large. Yet for business, governance and regulatory reasons, we must keep it. So the best scenario is to provide access to the data without incurring the cost of database software or servers and without relying on the application to access it. All of that said, in most cases we still need to get to the data. Analyst quote: "Organizations facing application retirement projects should look…to provide a way to get data that must be retained into a format that can be accessed independently of the retired application." Source: Carolyn Dicenzo, Gartner, "Database-Archiving Products Are Gaining Market Traction," October 2008.
So the best scenario is to provide access to the data without incurring the cost of database software or servers and without relying on the application to access it - that is, application-independent access to this data. Examples of application retirement benefits: once data from similar business applications is consolidated and redundant applications are retired, a skilled DBA can redirect productive time toward implementing an ERP package rather than maintaining a patchwork of databases that support outdated legacy applications. Another benefit: when you rationalize your infrastructure, you also reduce its complexity and therefore reduce business risk. For example, by consolidating a dozen homegrown general ledger applications into a packaged ERP solution, you can provide business-critical support and reduce the risk of missing key processing deadlines, such as a month-end close.
So, how can IBM help clients with these challenges? Through effective data growth management. InfoSphere Optim Data Growth Solution can help clients REDUCE COSTS, IMPROVE APPLICATION PERFORMANCE AND MINIMIZE THE RISKS associated with managing application data over its lifetime.
Reduce costs: By archiving infrequently used data from production environments, that data can be stored on less expensive, tier 2 storage, and can be compressed to save even more storage space.
Improve performance: With less data, application performance improves, searches and batch processes run faster, and backup processes run more efficiently. If your client is considering an application upgrade, archiving can streamline the associated data conversion process - the less data to convert to the upgraded version, the less time the application is offline.
Minimize risks: Intelligently archiving data out of production systems allows for data retention compliance and also supports a better long-term solution for storing the application data, providing application-independent access to that data.
InfoSphere Optim Data Growth Solution is the proven, market-leading solution that:
- With InfoSphere Discovery, discovers and identifies data record types to archive across heterogeneous environments
- Intelligently archives data, capturing and storing historical data in its original business context
- Defines and maintains data retention policies consistently across the enterprise
- Ensures long-term, application-independent access to archived data via multiple access methods, including third-party reporting tools and Data Find, a web-based search engine
- Supports custom and packaged ERP applications in heterogeneous environments
These are some recent quotes from industry analysts on InfoSphere Optim. For example: "IBM's Optim product line led the database archiving and ILM segment in 2010 with a 52.3% share, and showed nearly 18% growth in 2010." Forrester said: "Today, IBM continues to lead the industry with the most comprehensive data archiving solution and the largest installed base … IBM's customers spoke highly of the Optim solution's reliability and strong performance. Organizations can realize benefits in the form of improved operational and capital cost savings, improved IT and end user efficiency, as well as higher levels of data protection and application performance [with InfoSphere Optim]."
"IBM enjoys the largest market share of all vendors profiled in this research…" "[InfoSphere] Optim supports both mainframe and open-system environments, and many of IBM's customers are using it in heterogeneous environments." "Customers cite a 'small company feel' when asked to describe their interaction with [InfoSphere] Optim sales and support."
Governing the data lifecycle with IBM® InfoSphere software improves application efficiency by better managing data growth, managing test data and enabling efficient application upgrades, consolidation and retirement:
- Reduce the cost of data storage, software and hardware
- Improve application efficiency and performance
- Reduce risk and support compliance with retention requirements
- Speed time to market and improve quality
Why IBM? IBM InfoSphere Optim provides proven test data management, data masking, and database archiving capabilities that enable organizations to improve application reliability, mitigate risk and control costs. About Allianz The Allianz Group provides its more than 60 million clients worldwide with a comprehensive range of services in the areas of property and casualty insurance, life and health insurance, asset management and banking. Its subsidiary, Allianz Seguros, ranks second in the market, with over 2,400 million Euros in premiums. The company offers a comprehensive range of insurance products and services, including individual and group life, health, home, casualty, auto, boating and more to meet the needs of nearly three million clients. Allianz Seguros has contributed to the “3 + One” business model, a successful strategic program initiated by the Allianz Group to achieve sustainable and profitable growth. Primary objectives of the “3 + One” program are to fortify the capital base, improve operations, reduce complexity and increase sustainable competitiveness and value. An increase in the number of insurance premiums, a reduction in the claims ratio and continuous management improvement has contributed to continued business growth. Allianz Seguros relies on several mission-critical mainframe insurance applications, developed in-house, to manage operations in all areas of its business activities. Insurance agents, claims representatives and accounting staff in all branch offices rely on these applications to manage information for policy management, claims processing and premium billing, among other activities. Delivering application enhancements presents challenges Allianz Seguros has its own internal application development and quality assurance teams that develop and enhance application functionality. The ability to deliver new insurance products and services is important to promote continued business growth, which presented several challenges. “Our primary challenge was to improve the efficiency of the development and quality assurance processes by reducing the size of the development and testing environments,” said Ramon Lasurt, Director of Development at Allianz Seguros. “Next, we wanted a way to ensure accuracy by preserving the integrity of the test data, and finally, we needed to mask client information in the development and testing environments to protect privacy.” There were often at least three development and testing mainframe environments in use at the same time. The quality of these environments degraded quickly because they were used for multiple tests. Every few months, the development team would use its in-house “subsetting” program to copy test data from its large application production environment, comprising about 700 GB with tables that contained over 200 million rows. “We would refresh these development and testing environments as needed,” said Xavier Mascaró, Senior DBA at Allianz Seguros. “However, the refresh process, using data cloned from large production databases, was very complex and time consuming, and the results often affected the integrity of the data. We estimated that reducing the testing environments to only 10 percent of the production environment would provide significant time and cost savings.” Because Allianz needed to protect confidential client information to comply with the Spanish Law of Protection of Personal Data (LOPD), privacy protection remained a high priority. 
In fact, recent revisions to the LOPD held individual staff members responsible for protecting client records. To address this need, the DBA team had to write special programs to mask client names, tax ID numbers (Número de Identificatión Fiscal or NIF) and national identifiers (Documento Nacional de Identidad or DNI) and then move data into the application development and testing environments. Preserving the integrity of the test data presented another challenge because the data structures that supported the insurance applications included dozens of complex relationships. Although the in-house subsetting program offered some of the needed functionality, to ensure valid test results, the development team needed a test data management solution that would accurately preserve the referential integrity of the data for even the most complex data relationships. InfoSphere Optim improves test data management After researching to find a solution, senior management at Allianz Seguros decided to evaluate IBM InfoSphere Optim. Members of the evaluation team included the Senior DBA, as well as the Chief of Technology and Production, the Director of Development and the Director of Systems, who both report to the Director General. “After attending a demonstration, the members of our evaluation team agreed that InfoSphere Optim provided the capabilities we needed to improve application development and testing processes and protect privacy,” said Xavier Mascaró. Immediately after purchasing InfoSphere Optim, the development team focused on defining the relationships and criteria for subsetting, based on their complex relational database environments. In addition to using relationships defined to the database, InfoSphere Optim offered flexibility for defining and managing the complex relationships defined within the application logic. Next, the DBA team implemented InfoSphere Optim in its integration testing environment. In-house archiving presents challenges A few years after the positive experience of implementing InfoSphere Optim for test data management, Allianz Seguros turned its attention to InfoSphere Optim’s database archiving capabilities. Applications, such as Vida (life insurance) and Contabilidad de Agentes (agent accounts), continued to collect historical records, and this information was never deleted. In Spain, there is no law that states that insurance records must be retained for a specific number of years. However, in the insurance business, a policy can be in effect for a lifetime. It was important to retain access to these historical records. Although the IT department had been using an in-house archiving program for years to manage data growth, their methods for accessing and retrieving archived information were time consuming. For example, if an agent requested specific archived insurance claims records, it was necessary to recover one or several files from a backup tape and send a printed copy to the requestor. This process could take between one and four days, which had a negative impact on policy management, claims processing and other insurance service activities. InfoSphere Optim ensures access to historical records InfoSphere Optim’s proven database archiving capabilities offered Allianz Seguros several advantages over its in-house archiving program. First, InfoSphere Optim archives application transaction records in complete business context, in effect creating historical reference snapshots of the business. 
Archives can be saved to a variety of storage media for easy retrieval. In addition, using InfoSphere Optim delivers more capabilities and eliminates the need to maintain the in-house archiving program.
Challenges Improve development and testing strategies to deploy a new Pension Earnings and Accrual System within 30 months. Protect confidential employee salary and pension information in non-production (development, testing and training) environments to satisfy data privacy and TyEL compliance requirements. Why IBM? IBM InfoSphere Optim provides proven test data management and data privacy capabilities that support the Pension Earnings and Accrual System architecture and satisfy automated testing requirements. Solution IBM InfoSphere Optim Test Data Management Solution IBM InfoSphere Optim Data Masking Solution Benefits Improved development and testing efficiencies, enabling Arek Oy to promote faster deployment of new pension application functionality and enhancements. Protecting confidential data to strengthen public confidence and support TyEL compliance requirements. Headquartered in Finland, Arek Oy, Ltd. was established and is owned by the Finnish Centre for Pensions (ETK) and the country’s authorized pension insurance providers. Arek Oy manages the development of information systems and provides system services to the pension insurance community. Arek provides services to ETK and other pension providers, including Etera, Pension-Fennia, Ilmarinen, the Social Insurance Institution, the Central Church Fund, the Local Government Pensions Institution, the Seamen’s Pension Fund, Pensions-Alandia, Silta, Tapiola, the State Treasury, Varma and Veritas. All employment in Finland is covered by a statutory and compulsory earnings-related pension scheme that is funded by employer and employee contributions. Supporting a large-scale information management project TyEL dictates that it is the employer’s responsibility to arrange for pension insurance and to provide the insurance company with relevant information about employees, including identification, personal information, employment history and salary data. The pension insurance company, in turn, registers the data for employees and self-employed individuals, administers the funds and investments, awards and pays the pensions. All data must be handled in the strictest confidence. Among its various information technology and application development projects, Arek Oy maintains the Pension Earnings and Accrual System that manages all the data that supports earnings-related pensions. This new system is considered Finland’s single largest information management project. Services connected to the pension application are available directly from ETK. All employment records are also available electronically both via pension provider Internet pages and via the pension portal, Tyoelake.fi, which is maintained by ETK. Business challenges and demanding deadlines Arek Oy’s primary business challenge was to manage one of the largest Java 2 Platform, Enterprise Edition (J2EE) custom development efforts in Finland. Specifically, Arek Oy had to develop and deliver a thoroughly tested and reliable Pension Earnings and Accrual System, within 30 months. Earnings related pensions are crucial for each citizen’s well being and financial security. For an implementation project a failure to meet deadlines would result in an implementation sanction and, at a minimum, countless customer complaints in many cases. For Arek Oy the impact would be reclamations and serious loss of both customer good will and future business opportunities. Losses could range in the millions of euros. 
Tasked with the deployment of the new enhancements and functionality for the Pension Earnings and Accrual System within release deadlines, Arek Oy had to improve the efficiency of the application development and testing processes and procedures to support that business initiative. Managing test data poses technical challenges The primary technical challenge was ensuring the applicability of chosen technology for the selected solution architecture and processing needs. Arek Oy had to complete development efforts within a relatively short timeframe and had to ensure the expected application quality. This meant investing in solutions and methodologies that would enable the controlled delivery of 60,000 man-days within the given timeframe. Test data management and protecting privacy The Arek Oy development team needed a test data management solution with capabilities for creating realistic test data to satisfy specific application testing criteria. Processes for creating these testing environments had to be flexible and repeatable to ensure consistency and accuracy for system development projects. In addition, because of the nature and sensitivity of the personal pension information, Arek Oy needed a solution that would allow for de-identifying the personal data, such as names, addresses, national identifiers, salary and pension amounts, used in the development and testing environments. Next, the selected test data management solution had to support the proposed pension application architecture and satisfy automated testing requirements, as well as provide capabilities to enable developing the Service Oriented Architecture (SOA) and business applications concurrently. Because of the tight development deadlines, capabilities for coordinating the delivery of several concurrent projects were critical. That is, in addition to supporting concurrent development and testing processes, the selected solution had to ensure repeatability and transferability of tests and test data. Adapting a successful solution Since it was founded in 2004, Arek Oy had neither a previous solution nor the IT infrastructure to support the planned development activities. However, ETK had previously built an industry-wide Distributed Test Data Management solution for its own test data requirements. The existing ETK solution, called “Testimaha” was based on the IBM InfoSphere Optim Test Data Management Solution for the open systems environment. To speed the deployment of an effective solution, Arek Oy enlisted the expertise of Mainsoft Corporation. As an advanced IBM business partner, Mainsoft is a leading provider of cross-platform services and support. “The further utilization of the ETK concept was a natural solution for Arek Oy because we had to synchronize our test data with the data in the ETK data storage to produce a correct starting point for testing,” said Katri Savolainen, Project Manager at Arek Oy. “Since ETK was utilizing the mainframe and workstation versions of Optim, we knew that many of the test planners and testers would be familiar with Optim’s capabilities. Therefore, we decided to build our own test data management concept based on the one ETK was utilizing.” Arek Oy decided to implement a version of the ETK system using InfoSphere Optim and worked with professionals from Mainsoft to complete the implementation and ensure success. InfoSphere Optim integrated with the existing ETK solution, adapted easily to user-defined working principles, and was easy for the DBA to manage and support. 
Using InfoSphere Optim’s subsetting capabilities, rather than cloning large production databases, made it possible to create robust, realistic test databases that supported faster iterative testing cycles. In addition, InfoSphere Optim offered proven capabilities for performing complex data masking routines, while preserving the integrity of the pension data for development and testing purposes. Meeting these requirements would ensure accuracy and build confidence in the Pension Earnings and Accrual System, while protecting privacy in the development and testing environments. “We are currently in our second iteration of implementing Optim and we are very pleased with the quality of service and support provided by Mainsoft. At first, we completed and rolled out a mainframe version, in which we used a Java component to call Optim (mainframe). This was in production use within five months,” said Katri Savolainen, at Arek Oy. “In December 2006, we switched our production database server to AIX. On this occasion, we re-factored the previous solution into a UNIX script, which utilizes Optim.” Technical and business results Arek Oy has a Distributed Data Management Solution for Test Data, called TAHS, which is used throughout the application development cycle. TAHS supports all phases of application testing, including integration testing, system testing, acceptance testing and customer testing. TAHS is also used in conjunction with automated regression testing tools, which enable developers to prepare automated test scripts. “In June of 2007, we did a maintenance release, and using Optim, we were able to secure the availability of test data for development, even when the production database was undergoing a lengthy conversion,” said Katri Savolainen at Arek Oy. “Optim is now fully implemented and is used extensively by the in-house development team that built the integration to the ETK system and our DBA. In addition, there are a number of developers, test planners and testers who are using Optim through the TAHS application.” “We were able to give our development projects realistic, but masked test data, which satisfied requirements for providing appropriate test data and protecting privacy. We were also able to empower all system test and acceptance test planners to create and maintain their own test data,” said Katri Savolainen, at Arek Oy. “The test planners were able to concentrate on searching for the correct set of data and the best way of utilizing it instead of worrying about the technical transition of the data from our production environment to one of the several test environments.”
Overview: Toshiba TEC Europe is a leading subsidiary of the Toshiba Group, a world technology leader that manufactures a wide range of electronic and high-technology products for personal and institutional use.
Business need:
- Proactively manage application data growth to support business expansion and deployment of Oracle® E-Business Suite across business units
- Increase application availability by reducing the time to complete 19,000 daily batch processing jobs exceeding 250 hours
- Integrate and consolidate data and processes with the other Toshiba European entities to improve service levels and operational efficiencies
Solution: IBM InfoSphere Optim Data Growth Solution for Oracle® E-Business Suite
Benefits:
- Managed continued data growth and deployed Oracle E-Business Suite across business units by archiving to reduce database size by 30 percent
- Increased application availability by archiving historical transactions to shorten the time to complete 19,000 daily batch processes by 75 percent
- Improved service levels and operations by implementing InfoSphere Optim to provide access to current and historical transactions
The need: The Virginia Community College System wanted an out-of-the-box archiving solution for PeopleSoft Enterprise Campus Solutions that would help manage data growth without expensive server upgrades, support compliance requirements, and reduce the time spent on performance tuning and related issues.
The solution: Using IBM® InfoSphere™ Optim™ software, VCCS can archive complete historical student records in batches for students who have graduated or been inactive for at least 10 years; access archived data for reporting and analysis; process requests for transcripts against archived student data without having to restore the data to the production environment; and selectively restore a complete record for a single student on demand.
The benefits:
- Effectively manages data growth to improve service levels
- Offers flexibility to archive 10 or more years of inactive student data
- Enables staff to selectively restore student records as needed
- Lowered infrastructure costs by eliminating frequent, expensive server upgrades
You will want to point customers to the InfoSphere Optim ibm.com page and solution sheet, but let your clients get their own statistics. There is a self-service business value assessment that a client - alone or with your guidance - can use to do a quick assessment of how InfoSphere Optim can help their business. We encourage you to leverage this link as you start to talk to your clients about business benefits.
IBM InfoSphere Optim solutions allow you to manage data through its lifecycle in heterogeneous environments. You may have a lot of data scattered around the organization - how do you find it? How do you know how it relates to other enterprise data? IBM InfoSphere Optim provides a solution to discover the data and its relationships as information comes into the enterprise. You need to develop applications and functionality that can best maintain your data - and you need to effectively test those applications. We provide a solution for DBAs, testers and developers to effectively create and manage right-sized test data while protecting sensitive test data in development and test environments. The day-to-day challenges of managing the lifecycle of your data are intensified by the growth of data volumes. IBM InfoSphere Optim provides intelligent archiving techniques so that infrequently accessed data does not impede application performance, while still providing access to that data. IBM InfoSphere Optim's Data Growth solution helps reduce hardware, storage and maintenance costs. Over time, the applications managing your data will need to be upgraded, consolidated and eventually retired - but not the data. Many organizations today are overburdened with redundant or legacy applications - for example, as organizations are merged or acquired, so are their IT systems. By leveraging InfoSphere Optim's Application Retirement solution and archiving best practices, you can ensure access to business-critical data for long-term data retention, long after an application's life expectancy.
Enterprises are generally made up of many technologies, and IBM InfoSphere Optim solutions span support for these technologies as well. We start with storage platforms, including online, nearline and offline. Depending on where the data is in its lifecycle, it may be stored on these different platforms and be related across them. As we move up the chart, most organizations have systems that span multiple operating systems - whether z/Series, i/Series, Linux, UNIX or Windows - and InfoSphere Optim solutions span support across the different operating systems to manage data across them. Few organizations have just one database management system, either. They will generally have data that runs on the same database platform on multiple operating systems, but also on different DBMSs, and it is critical that a data lifecycle solution supports heterogeneous environments. In a similar light, most organizations have adopted a combination of ERP and CRM packaged systems like SAP, PeopleSoft, Siebel and others, along with custom applications. These systems do not stand alone either: they are integrated, share information, and need to be managed in a consistent manner. Lastly, you can see the different capabilities that we see as critical to managing data across the enterprise and therefore the solutions available today from InfoSphere Optim: Discovery, Test Data Management, Data Masking, Data Growth Management, and Application Retirement. With this, you can see and understand how IBM InfoSphere Optim is a single, scalable, heterogeneous information lifecycle management solution.
Workgroup Edition: targets the mid to low end of the market; less functionality; restricted to less than 6 terabytes of data and a single server; trade-up to Enterprise Edition available.
Enterprise Edition: targets the mid to high end of the market; unrestricted use.
In order to govern information effectively, you have to understand where that information exists and how it's related to the organization. Data discovery is the process of analyzing data values and data patterns to identify the relationships that link disparate data elements into logical units of information, or "business objects" (such as customer, patient or invoice). A business object represents a group of related attributes (columns and tables) of data from one or more applications, databases or data sources. Discovery is also used to identify the transformation rules that have been applied to a source system to populate a target such as a data warehouse or operational data store. Once accurately defined, these business objects and transformation rules provide essential input into information-centric projects like data integration, MDM and archiving. IBM InfoSphere Discovery™ provides market-leading capabilities to automate the identification and definition of data relationships across the complex, heterogeneous environments prevalent in IT today. Covering every kind of data relationship, from simple to complex, InfoSphere Discovery provides a 360° view of data assets. InfoSphere Discovery analyzes the data values and patterns from one or more sources to capture these hidden correlations and bring them clearly into view. It applies heuristics and sophisticated algorithms to perform a full range of data analysis techniques: single-source and cross-source data overlap and relationship analysis, advanced matching key discovery, transformation logic discovery, and more. It accommodates the widest range of enterprise data sources: relational databases, hierarchical databases, and any structured data source represented in text file format. InfoSphere Discovery's automated capabilities accurately identify relationships and define business objects, speeding deployment of information-centric projects by as much as ten times.
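One of the simplest forms of cross-source relationship analysis is value-overlap profiling: if nearly every distinct value in one column also appears in a column of another table, the pair is a candidate relationship (for example, an undeclared foreign key). The sketch below shows the idea on in-memory data; propose_relationships and the toy tables are hypothetical, and InfoSphere Discovery's actual algorithms are far more sophisticated (matching keys, transformation logic, and so on).

```python
def propose_relationships(source_a, source_b, min_overlap=0.9):
    """Propose candidate relationships between two tables (given as
    column-name -> list-of-values dicts) by comparing value overlap: if most
    distinct values of a column in table B also appear in a column of table
    A, B's column may reference A's column."""
    candidates = []
    for col_a, values_a in source_a.items():
        set_a = set(values_a)
        for col_b, values_b in source_b.items():
            set_b = set(values_b)
            if not set_b:
                continue
            overlap = len(set_b & set_a) / len(set_b)
            if overlap >= min_overlap:
                candidates.append((col_a, col_b, round(overlap, 2)))
    return candidates

# Example: orders.cust_ref values are fully contained in customers.customer_id,
# so the pair is proposed as a likely (hidden) relationship.
customers = {"customer_id": [1, 2, 3, 4], "region": ["N", "S", "E", "W"]}
orders = {"order_id": [10, 11, 12], "cust_ref": [1, 2, 2]}
print(propose_relationships(customers, orders))
```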
Creating realistic application development and testing environments is critical to delivering the right solutions for the business. However, cloning large production databases for development and testing purposes extends cycle times, increases the amount of data propagated across the organization, and significantly raises costs and governance control issues. The Optim Test Data Management Solution offers proven technology to optimize and automate processes that create and manage data in non-production (testing, development and training) environments. Development and testing teams can create realistic, “right-sized” test databases, made up of one or more business objects, for targeted test scenarios. The Optim Test Data Management Solution also allows teams to easily compare the data from “before” and “after” testing with speed and accuracy. Optim’s capabilities for creating and managing test data enable organizations to save valuable processing time, ensure consistency and reduce costs throughout the application lifecycle.
InfoSphere Optim Data Masking Solution protects an organization's data in non-production environments by de-identifying (or masking) sensitive or personally identifiable data. The InfoSphere Optim solution doesn't keep the data from being stolen; rather, it renders the data unusable and of no value if stolen. This protects the business both financially and from loss of information, and provides IT with a simple-to-use solution that supports a common way of protecting data used in non-production (test, development) environments or by third-party contractors. The InfoSphere Optim Data Masking Solution comes with a multitude of built-in masking functions, as well as the ability to define your own transformations. There is no longer a reason to needlessly expose your sensitive data in your test environments.
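To illustrate how masked data can stay both realistic and referentially consistent, here is a small sketch of deterministic lookup-based masking: the same original value always hashes to the same replacement, so a customer masked in one table lines up with the same masked customer in every related table. The lookup lists, the mask_value function and the salt are illustrative assumptions; they are not the product's built-in masking functions.

```python
import hashlib

# Hypothetical lookup lists of realistic replacement values; a real masking
# tool ships much larger lookup tables (names, addresses, national IDs, ...).
FIRST_NAMES = ["Alice", "Bruno", "Carmen", "Deniz", "Elena", "Farid"]
STREETS = ["12 Main Street", "8 Station Road", "301 Harbor Avenue"]

def mask_value(value, lookup, salt="demo-salt"):
    """Deterministically replace a sensitive value with a realistic one from
    a lookup list. The same input always yields the same output, so related
    tables stay consistent and referential integrity of the test data is
    preserved."""
    digest = hashlib.sha256((salt + str(value)).encode()).hexdigest()
    return lookup[int(digest, 16) % len(lookup)]

# The same original name masks to the same replacement everywhere it appears,
# and an address column gets a plausible street rather than XXXXXX.
print(mask_value("Margaret Jones", FIRST_NAMES))
print(mask_value("Margaret Jones", FIRST_NAMES))   # identical result
print(mask_value("Calle Mayor 5", STREETS))        # contextually valid address
```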
The InfoSphere Optim Data Growth Solution solves the data growth problem at the source - by managing your enterprise application data. IBM Optim enables you to archive historical transaction records, controlling data growth and improving application performance. Historical data is archived securely and cost-effectively, and can be easily accessed for analysis or audit/e-discovery requests. And with less data to sift through, you speed reporting and complete mission-critical business processes on time, every time. Having a defined policy for managing the retention requirements for historical data is a requirement for enterprise governance frameworks to ensure compliance with regulatory mandates. As a recognized best practice, archiving segregates inactive application data from current activity and safely moves it to a secure archive. Streamlined databases reclaim capacity and help improve application performance and availability. With InfoSphere Optim, you can establish distinct service levels for each class of application data - for example, current data, reporting data and historical data - and consistently achieve performance targets. Policy-driven archive processes allow you to specify the business rules for archiving. For example, you may choose to archive all closed orders that are two years old or more. InfoSphere Optim identifies all transactions that meet these criteria and moves them into an accessible archive. InfoSphere Optim manages application data at the business object level. Business objects comprise a group of related columns and tables from one or more application databases, along with their associated metadata. By managing data at the business object level, InfoSphere Optim preserves both the relational integrity of the data and its original business context. Each archived record represents a historical reference snapshot of business activity, regardless of its originating application.
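As a concrete picture of a policy-driven archive process, the sketch below applies the "closed orders two years old or more" example: it copies qualifying orders and their order lines from a production database to an archive database, then removes them from production. The orders/order_lines schema, the column names and the archive_closed_orders function are hypothetical, and a real archive run would also capture metadata and the full business object rather than two hard-coded tables.

```python
import sqlite3
from datetime import date, timedelta

def archive_closed_orders(prod, archive, years=2):
    """Apply a simple archive policy: move orders that are CLOSED and older
    than the cutoff, together with their order lines, from the production
    database to the archive database. Assumes hypothetical tables
    orders(order_id, status, order_date, ...) and
    order_lines(line_id, order_id, ...) exist in both databases."""
    cutoff = (date.today() - timedelta(days=365 * years)).isoformat()
    cur = prod.cursor()
    orders = cur.execute(
        "SELECT * FROM orders WHERE status = 'CLOSED' AND order_date < ?",
        (cutoff,)).fetchall()
    if not orders:
        return 0
    order_ids = [o[0] for o in orders]           # first column is order_id
    ph = ",".join("?" * len(order_ids))
    lines = cur.execute(
        f"SELECT * FROM order_lines WHERE order_id IN ({ph})",
        order_ids).fetchall()

    # Copy the complete set of related rows into the archive first...
    acur = archive.cursor()
    acur.executemany(
        f"INSERT INTO orders VALUES ({','.join('?' * len(orders[0]))})", orders)
    if lines:
        acur.executemany(
            f"INSERT INTO order_lines VALUES ({','.join('?' * len(lines[0]))})",
            lines)
    archive.commit()

    # ...then remove them from production to reclaim capacity.
    cur.execute(f"DELETE FROM order_lines WHERE order_id IN ({ph})", order_ids)
    cur.execute(f"DELETE FROM orders WHERE order_id IN ({ph})", order_ids)
    prod.commit()
    return len(order_ids)
```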
The InfoSphere Optim solution for Application Retirement enables you to archive historical data securely and cost-effectively, and in a way that the data can be easily accessed for analysis or audit/e-discovery requests, long after the original application has been retired. InfoSphere Optim manages application data at the business object level. Business objects are comprised of a group of related columns and tables from one or more application databases, along with their associated metadata. By managing data at the business object level, Optim preserves both the relational integrity of the data and its original business context. Each archived record represents a historical reference snapshot of business activity, regardless of its originating application.
SUMMARY: InfoSphere Optim System Analyzer for SAP Applications is a powerful web-based tool that automatically identifies SAP system changes for key application lifecycle events and shows how those systems are impacted. It:
- Provides critical impact analysis information before changes are applied to production
- Provides the ability to compare metadata between multiple systems, applications and the data dictionary
- Presents results automatically in generated reports
- Provides further drill-down
- Recommends testing executables and identifies testing gaps
End result: reduced time, cost, complexity and risk for SAP application and system changes.
And while System Analyzer looks at the impact of change from the data structure and customer code level, InfoSphere Optim Business Process Analyzer for SAP applications looks at these changes at the SAP business process level. InfoSphere Optim Business Process Analyzer is a component of InfoSphere Optim System Analyzer that automatically captures business process from your SAP data structures and helps SAP business analysts visualize how changes will impact the business process. The picture depicted here is a great example – InfoSphere Optim Business Process Analyzer is analyzing the impact of change across this SAP module, leveraging the business process – or “flow chart”. The processes impacted are in red, but if you look at the inset, you can see that the proposed changes only impact a small portion of the module. Now testers know where to focus. And this business process view of change impact provides greater collaboration between the business analyst and technical manager on an SAP project team.
SUMMARY: InfoSphere Optim Test Data Management Solution for SAP Applications offers proven technology to optimize and automate the process to create and manage data in non-production (testing, development and training) environments, with no performance impact to production systems. Development and testing teams can create realistic, “right-sized” test environments, made up of one or more business objects, for targeted test scenarios. This SAP-Certified solution is invoked within SAP, providing the user with a familiar interface and easy point-and-click environment. This solution includes pre-built business objects for key SAP modules, that can be modified with user-defined criteria to extract the needed data. These extracts can then be saved as Variants (ABAP program routines) to be leveraged as a repeatable process, speeding the testing process. Optim’s capabilities for creating and managing test data enable organizations to save valuable processing time, ensure consistency and reduce costs throughout the application lifecycle.
SUMMARY: InfoSphere Optim Application Repository Analyzer explores and analyzes your application repository information to identify the complex customizations of the data model within your Oracle® applications, including Oracle E-Business Suite, PeopleSoft Enterprise and Siebel CRM. This helps you reduce time and improve the quality of lifecycle events, such as archiving, subsetting and masking projects. InfoSphere Optim can understand your custom-implemented modules to capture and identify the parent-child table relationships, minimizing the manual effort that would otherwise be needed and improving accuracy. InfoSphere Optim can also compare data models across application versions and releases to assess and anticipate the impact to customizations in upgrade and enhancement projects.
IBM® InfoSphere® Optim™ Query Capture and Replay enables organizations to fully assess the impact of life-cycle changes in the testing environment before production deployment. InfoSphere Optim Query Capture and Replay complements existing functional, regression and performance tests by giving IT teams deeper insight into the database layer. The result is a significantly streamlined and more realistic testing process. Compared to legacy testing techniques, enterprise changes can now be tested more rigorously, and replays can be tailored to meet a variety of test objectives, such as capacity planning, performance and function testing, and more. This approach ultimately helps shorten testing cycles and save precious resources. In addition to capturing and replaying the production workload, InfoSphere Optim Query Capture and Replay provides reports that allow IT teams to accurately analyze the impact of changes made. Comprehensive before-and-after reporting includes both high-level summaries and detailed drill-downs. As a result, IT teams gain deep insight into potential problem areas, enabling them to resolve issues before production deployment. Available reports include:
• Summary Comparison - provides a high-level look into the differences between the capture and replay, comparing average execution time, SQL exceptions, rows retrieved and SQL failures (a minimal sketch follows these notes)
• Workload Aggregate Match - aggregates statistics to allow quick comparisons of selected workloads
• Workload Exceptions - shows which SQL generated exceptions during replay
• Workload Match - provides a side-by-side comparison of each SQL statement and a statistical comparison between the workloads
Unique product differentiators:
- Robust capture of full production workloads, including DBMS details - not just SQL
- Minimal production impact across heterogeneous environments, large and small
- Transparent deployment
- Adjustable replays for testing, capacity planning and problem diagnosis
- Validation reports
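A Summary Comparison style report can be pictured with a short sketch that aggregates a captured run and a replayed run into the same metrics (statement count, average execution time, rows retrieved, SQL exceptions) and reports the deltas. It assumes the per-statement tuple format used in the earlier replay sketch; summarize and summary_comparison are illustrative names, not the product's report API.

```python
def summarize(results):
    """Aggregate one workload run: 'results' is a list of
    (sql, elapsed_seconds, rows_returned, error_or_None) tuples, such as the
    output of the replay sketch shown earlier."""
    total = len(results) or 1
    return {
        "statements": len(results),
        "avg_elapsed": sum(r[1] for r in results) / total,
        "rows_retrieved": sum(r[2] for r in results),
        "sql_exceptions": sum(1 for r in results if r[3] is not None),
    }

def summary_comparison(capture_results, replay_results):
    """Produce a high-level capture-vs-replay comparison: the same metrics
    side by side plus the delta, so regressions stand out at a glance."""
    cap, rep = summarize(capture_results), summarize(replay_results)
    return {metric: {"capture": cap[metric], "replay": rep[metric],
                     "delta": rep[metric] - cap[metric]}
            for metric in cap}
```

Fed with the before (production capture) and after (replay against the changed system) runs, the delta column is what tells the team whether the proposed change degraded execution time or introduced new SQL exceptions.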
Optim market share: This is IDC's recent analysis of the Database Archiving and Information Lifecycle Management (ILM) segment. IBM's Optim product line led the database archiving and ILM segment in 2010 with a 52.3% share and showed nearly 18% growth in 2010. IDC puts the segment's growth from 2009 to 2010 at almost 26%, the highest of all categories in the database development and management market. Note: IDC includes subsetting, masking and test data generation in the database archiving and ILM category.
What is a CVE?
Programmatic approach: CVE provides clients a defined, structured and easy-to-follow process designed to help all key stakeholders evaluate IBM Information Management solutions.
Economic benefits: CVE determines the benefits in financial terms decision makers can understand, such as a 5-year summary, ROI, TCO and many other metrics for evaluating the proposed solution (a simplified calculation of this kind is sketched after this overview).
Technical solution blueprint: CVE provides a tailored architectural blueprint comparing your current-state process and architecture against the recommended future state.
CVE final results: CVE seeks to uncover all of the cost savings that can be realized as a result of an investment in the IBM-recommended software, together with a solution blueprint.
Why do a CVE analysis?
Lightweight process: CVE is a flexible process that is not time- or resource-intensive.
Clear objectives: CVE is run programmatically, defining clear objectives, priorities and timelines.
Speedy time-to-completion: CVE elapsed time is a few short weeks (introduction to final presentation).
Professional deliverables: CVE final results are built by trained and qualified experts.
Personalized process: CVE is planned around giving each client an individual experience.
A CVE is a value-added IBM Information Management program that provides clients the expertise, structure and programmatic process to help all key stakeholders understand and evaluate IBM Information Management solutions. Within the CVE we provide the business and technical experts and the business case specialist who facilitate the program. The CVE process uncovers the business and technical challenges, quantifies the economic benefits and constructs a detailed technical solution blueprint that compares the current state against the recommended IBM solution. The result of a CVE is a detailed business case that summarizes both technical and economic justifications, solution differentiators and architecture.
3 steps for a successful Information Lifecycle Management CVE:
Step 1: We work with your key management staff to understand the technical and business challenges in managing the lifecycle of your database data.
Step 2: We conduct in-depth interviews and data collection to determine the cost of operations for the in-scope systems and applications selected by you.
Step 3: We develop a technical solution blueprint that describes how the IBM solution fits in your environment.
Analysis validation: Our CVE lead validates all findings to ensure their accuracy and completeness before finalizing.
Final presentation and results: In financial and technical terms, we present and deliver to your organization a final CVE business case that provides the cost savings, strategy and solution blueprint.
Sample ILM CVE offerings:
Data growth: Calculate the value and benefits of managing excessive data growth.
Test data management: Calculate the value and benefits of more effective management of your test data in relation to your development projects and processes.
Application retirement: Calculate the value and benefits of removing applications from your portfolio.
Data privacy: Risk reduction and TCO of privatizing nonproduction data in your test data processes.
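As a loose illustration of the financial framing a CVE business case produces, the sketch below computes a simple 5-year summary, TCO and ROI from assumed figures. Both the numbers and the simplified formulas are invented for illustration; an actual CVE analysis is built by IBM specialists from client-specific data.

# Illustrative only: a minimal 5-year summary / ROI calculation of the kind
# a CVE business case reports. All figures are made-up assumptions.
YEARS = 5

investment = 400_000          # one-time software and implementation cost
annual_savings = 250_000      # e.g. reduced storage, faster test cycles
annual_run_cost = 50_000      # ongoing support and maintenance

total_savings = annual_savings * YEARS
tco = investment + annual_run_cost * YEARS
net_benefit = total_savings - tco
roi_pct = 100 * net_benefit / tco

print(f"5-year savings : {total_savings:>10,}")
print(f"5-year TCO     : {tco:>10,}")
print(f"Net benefit    : {net_benefit:>10,}")
print(f"ROI            : {roi_pct:>9.1f}%")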
Information Management Software Services brings the following to each and every engagement:
Deep product and industry expertise - the “heart surgeons” of a project
Certified professionals
Enablers, driving clients to be self-sufficient
Worldwide track record of project success
Access to the Information Management Software Development Labs
Skills, experience and standard practices; critical for early adopters of our technology
Strategic partners