Top 5 Tips to Cut the Effort of your Oracle EBS R12 Project by a Third – Original Software
In this webinar presentation, Jonathan Pearson, Solution Consultant, explored the impact an Oracle EBS R12 project will have on your business, the top 5 tips to help you with the project, and how technology can help cut the project effort by a third.
You will also learn:
- What is involved with the R12 upgrade process
- Who is involved with the upgrade and when
- How companies underestimate the time impact
- What the biggest challenges are
- A best practice process for managing the upgrade project
- See more at: http://www.origsoft.com/webinars/oracle_software_testing/on-demand/
Modern Data Warehousing with the Microsoft Analytics Platform System – James Serra
The Microsoft Analytics Platform System (APS) is a turnkey appliance that provides a modern data warehouse with the ability to handle both relational and non-relational data. It uses a massively parallel processing (MPP) architecture with multiple CPUs running queries in parallel. The APS includes an integrated Hadoop distribution called HDInsight that allows users to query Hadoop data using T-SQL with PolyBase. This provides a single query interface and allows users to leverage existing SQL skills. The APS appliance is pre-configured with software and hardware optimized to deliver high performance at scale for data warehousing workloads.
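As a rough illustration of the query pattern described above (the server, credentials, and table names are hypothetical, not taken from the slides), the same T-SQL that reads relational tables can join against a PolyBase external table defined over Hadoop data, here submitted from Python via pyodbc:

```python
# Hypothetical illustration of querying a PolyBase external table from Python.
# Server name, credentials, and table names are placeholders, not from the slides.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=aps-appliance.example.com;DATABASE=EDW;UID=analyst;PWD=secret"
)

# Join a relational dimension table with an external (Hadoop-backed) table
# using plain T-SQL; PolyBase handles the Hadoop side behind the scenes.
sql = """
SELECT TOP 10 d.Region, COUNT(*) AS Clicks
FROM dbo.DimCustomer AS d
JOIN ext.WebClickstream AS c   -- external table defined over HDFS files
  ON c.CustomerKey = d.CustomerKey
GROUP BY d.Region
ORDER BY Clicks DESC;
"""

for region, clicks in conn.cursor().execute(sql):
    print(region, clicks)
```

The point of the single query interface is that analysts keep writing ordinary SQL; whether a table lives in the relational store or in HDFS becomes a definition-time detail.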
This document provides an overview of a SQL-on-Hadoop tutorial. It introduces the presenters and discusses why SQL is important for Hadoop, as MapReduce is not optimal for all use cases. It also notes that while the database community knows how to efficiently process data, SQL-on-Hadoop systems face challenges due to the limitations of running on top of HDFS and Hadoop ecosystems. The tutorial outline covers SQL-on-Hadoop technologies like storage formats, runtime engines, and query optimization.
This document provides an overview of Hadoop and its ecosystem. It discusses the evolution of Hadoop from version 1 which focused on batch processing using MapReduce, to version 2 which introduced YARN for distributed resource management and supported additional data processing engines beyond MapReduce. It also describes key Hadoop services like HDFS for distributed storage and the benefits of a Hadoop data platform for unlocking the value of large datasets.
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar) – Hortonworks
This document discusses using Hadoop and the Hortonworks Data Platform (HDP) for big data applications. It outlines how HDP can help organizations optimize their existing data warehouse, lower storage costs, unlock new applications from new data sources, and achieve an enterprise data lake architecture. The document also discusses how Talend's data integration platform can be used with HDP to easily develop batch, real-time, and interactive data integration jobs on Hadoop. Case studies show how companies have used Talend and HDP together to modernize their data architecture and improve product inventory and pricing forecasting.
Big Data Hoopla Simplified - TDWI Memphis 2014 – Rajan Kanitkar
The document provides an overview and quick reference guide to big data concepts including Hadoop, MapReduce, HDFS, YARN, Spark, Storm, Hive, Pig, HBase and NoSQL databases. It discusses the evolution of Hadoop from versions 1 to 2, and new frameworks like Tez and YARN that allow different types of processing beyond MapReduce. The document also summarizes common big data challenges around skills, integration and analytics.
Building a Turbo-fast Data Warehousing Platform with Databricks – Databricks
Traditionally, data warehouse platforms have been perceived as cost prohibitive, challenging to maintain and complex to scale. The combination of Apache Spark and Spark SQL – running on AWS – provides a fast, simple, and scalable way to build a new generation of data warehouses that revolutionizes how data scientists and engineers analyze their data sets.
In this webinar you will learn how Databricks - a fully managed Spark platform hosted on AWS - integrates with a variety of AWS services, including Amazon S3, Kinesis, and VPC. We'll also show you how to build your own data warehousing platform in a very short amount of time and how to integrate it with other tools such as Spark's machine learning library and Spark Streaming for real-time processing of your data.
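A minimal sketch of what that S3 integration looks like in practice, assuming a hypothetical bucket, path, and schema (none of these come from the webinar):

```python
# Minimal PySpark sketch: read CSV data from S3 and query it with Spark SQL.
# The bucket, path, and column names are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("s3-warehouse-sketch").getOrCreate()

orders = (spark.read
          .option("header", "true")
          .option("inferSchema", "true")
          .csv("s3a://example-bucket/warehouse/orders/"))

orders.createOrReplaceTempView("orders")

# Ad hoc analytics with plain SQL over the S3-backed table.
spark.sql("""
    SELECT customer_id, SUM(amount) AS total_spend
    FROM orders
    GROUP BY customer_id
    ORDER BY total_spend DESC
    LIMIT 10
""").show()
```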
Near real-time big data analytics is a reality via a new data pattern that avoids the latency and overhead of legacy ETL: the 3 T's of Hadoop – Transfer, Transform, and Translate.
- Transfer: Once a Hadoop infrastructure is in place, a mandate is needed to immediately and continuously transfer all enterprise data, from external and internal sources and through different existing systems, into Hadoop. Previously, enterprise data was isolated, disconnected, and monolithically segmented. Through this T, the various source data are consolidated and centralized in Hadoop almost as they are generated, in near real-time.
- Transform: Most of the enterprise data flowing into Hadoop is transactional in nature. Analytics requires that the data be transformed from a record-based OLTP form to a column-based OLAP form. This T is not the same T as in ETL, because the granularity of the data feeds must be retained. The key is to transform in place within Hadoop, without further data movement from Hadoop to other legacy systems.
- Translate: We pre-compute or provide on-the-fly views of analytical data, exposed for consumption. We facilitate analysis and reporting, for both scheduled and ad hoc needs, to be interactive with the data for analysts and end users, integrated in and on top of Hadoop.
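To make the Transform step concrete, here is a minimal PySpark sketch, assuming data has already landed in HDFS; the paths and column names are illustrative only and are not part of the original pattern description:

```python
# Illustrative PySpark sketch of the "Transform" T: convert row-oriented
# transactional records in HDFS into a columnar (Parquet) layout without
# moving the data out of Hadoop. Paths and columns are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("oltp-to-olap-sketch").getOrCreate()

# Raw OLTP-style feed as it landed in HDFS (e.g. JSON records).
txns = spark.read.json("hdfs:///landing/sales/transactions/")

# Keep full granularity; add a partition column so analytical scans prune well.
curated = (txns
           .withColumn("event_date", F.to_date("event_ts"))
           .repartition("event_date"))

# Write back into Hadoop as columnar Parquet, partitioned for OLAP access.
(curated.write
 .mode("append")
 .partitionBy("event_date")
 .parquet("hdfs:///curated/sales/transactions/"))
```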
3 Things to Learn About:
- How Kudu is able to fill the analytic gap between HDFS and Apache HBase
- The trade-offs between real-time transactional access and fast analytic performance
- How Kudu provides an option to achieve fast scans and random access from a single API
This document provides an overview of real-time processing capabilities on Hortonworks Data Platform (HDP). It discusses how a trucking company uses HDP to analyze sensor data from trucks in real-time to monitor for violations and integrate predictive analytics. The company collects data using Kafka and analyzes it using Storm, HBase and Hive on Tez. This provides real-time dashboards as well as querying of historical data to identify issues with routes, trucks or drivers. The document explains components like Kafka, Storm and HBase and how they enable a unified YARN-based architecture for multiple workloads on a single HDP cluster.
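As a small, hedged illustration of the ingestion side of that architecture (the broker address, topic name, and event fields are assumptions, not taken from the document), a truck-side producer publishing sensor events to Kafka might look like this with kafka-python:

```python
# Hedged sketch of publishing truck sensor events to Kafka with kafka-python.
# Broker address, topic name, and event fields are hypothetical placeholders.
import json
import time
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="broker1.example.com:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

event = {
    "truck_id": 42,
    "driver_id": 7,
    "event_type": "overspeed",      # e.g. a potential violation
    "latitude": 38.63,
    "longitude": -90.20,
    "ts": int(time.time() * 1000),
}

# Downstream, a Storm topology (or other consumer) reads this topic,
# updates HBase for live dashboards, and lands history for Hive-on-Tez queries.
producer.send("truck_events", value=event)
producer.flush()
```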
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat... – Hortonworks
How do you turn data from many different sources into actionable insights and manufacture those insights into innovative information-based products and services?
Industry leaders are accomplishing this by adding Hadoop as a critical component in their modern data architecture to build a data lake. A data lake collects and stores data across a wide variety of channels including social media, clickstream data, server logs, customer transactions and interactions, videos, and sensor data from equipment in the field. A data lake cost-effectively scales to collect and retain massive amounts of data over time, and convert all this data into actionable information that can transform your business.
Join Hortonworks and Informatica as we discuss:
- What is a data lake?
- The modern data architecture for a data lake
- How Hadoop fits into the modern data architecture
- Innovative use-cases for a data lake
The document discusses accelerating enterprise adoption of Apache Hadoop through a capability-driven approach. It outlines four core tenets for a Hadoop journey: having a capability-driven framework, using a heterogeneous set of technologies, choosing the right fit of open source and commercial solutions, and developing a flexible operating model. Case studies show how following these tenets can help reduce data processing times and give business users improved analytics capabilities.
Presentation from Data Science Conference 2.0 held in Belgrade, Serbia. The focus of the talk was to address the challenges of deploying a Data Lake infrastructure within the organization.
Predictive Analytics and Machine Learning…with SAS and Apache Hadoop – Hortonworks
In this interactive webinar, we'll walk through use cases on how you can use advanced analytics like SAS Visual Statistics and SAS In-Memory Statistics with the Hortonworks Data Platform (HDP) to reveal insights in your big data and redefine how your organization solves complex problems.
The document discusses Seagate's plans to integrate hard disk drives (HDDs) with flash storage, systems, services, and consumer devices to deliver unique hybrid solutions for customers. It notes Seagate's annual revenue, employees, manufacturing plants, and design centers. It also discusses Seagate exploring the use of big data analytics and Hadoop across various potential use cases and outlines Seagate's high-level plans for Hadoop implementation.
Today, when data is mushrooming and arriving in heterogeneous forms, there is a growing need for a flexible, adaptable, efficient, and cost-effective integration platform that requires minimal on-boarding time and can interact with any number of platforms. Talend fits perfectly in this space with a proven track record, so learning Talend makes a lot of sense for anybody associated with the data world.
If you understand how to manage, transform, and store your organisation's data (retail, banking, airlines, research, insurance, cards, etc.) and represent it effectively, which is the backbone of any successful MIS, reporting, or dashboard system, then you are the kind of key person organisations seek out most.
This webinar discusses tools for making big data easy to work with. It covers MetaScale Expertise, which provides Hadoop expertise and case studies. Kognitio Analytics is discussed as a way to accelerate Hadoop for organizations. The webinar agenda includes an introduction, presentations on MetaScale and Kognitio, and a question and answer session. Rethinking data strategies with Hadoop and using in-memory analytics are presented as ways to gain insights from large, diverse datasets.
Bring Your SAP and Enterprise Data to Hadoop, Kafka, and the Cloud – DataWorks Summit
This document discusses how organizations can leverage data and analytics to power their business models. It provides examples of Fortune 100 companies that are using Attunity products to build data lakes and ingest data from SAP and other sources into Hadoop, Apache Kafka, and the cloud in order to perform real-time analytics. The document outlines the benefits of Attunity's data replication tools for extracting, transforming, and loading SAP and other enterprise data into data lakes and data warehouses.
Open innovation and collaboration between IBM and other technology companies is fueling advances in cloud computing, big data analytics, and software development. This includes contributions to open source projects like Linux as well as partnerships through organizations like the OpenPOWER Foundation. New systems based on IBM's Power architecture and optimized for Linux are helping customers improve the performance and efficiency of their analytics, database, and application workloads.
This document provides an overview of installing and programming with Apache Spark on the Hortonworks Data Platform (HDP). It discusses how Spark fits within HDP and can be used for batch processing, streaming, SQL queries and machine learning. The document outlines how to install Spark on HDP using Ambari and describes Spark programming with Resilient Distributed Datasets (RDDs), transformations, actions and caching/persistence. It provides examples of Spark APIs and programming patterns.
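For reference, a tiny PySpark example of the RDD concepts mentioned above (transformations, actions, and caching), with a placeholder HDFS path:

```python
# Small PySpark RDD example: transformations, actions, and caching.
# The HDFS input path is a placeholder.
from pyspark import SparkContext

sc = SparkContext(appName="rdd-basics")

lines = sc.textFile("hdfs:///data/sample/logs.txt")      # RDD from HDFS

errors = (lines
          .filter(lambda line: "ERROR" in line)           # transformation (lazy)
          .map(lambda line: line.split("\t")[0]))         # transformation (lazy)

errors.cache()                                            # keep in memory for reuse

print("error count:", errors.count())                     # action: triggers execution
print("sample:", errors.take(5))                          # action: reuses cached data

sc.stop()
```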
Attunity Efficient ODR For Sql Server Using Attunity CDC Suite For SSIS Slide... – Melissa Kolodziej
This slidedeck focuses on how to leverage your SQL Server skills & software to reduce cost & accelerate SQL Server data replication, synchronization, & real-time integration while enabling operational reporting, business intelligence & data warehousing projects. It also highlights CDC concepts & benefits and how CDC can assist you with data replication projects. Screenshots are included to demonstrate Attunity's CDC Suite for SSIS.
Dynamic DDL: Adding structure to streaming IoT data on the fly – DataWorks Summit
At the end of the day, data scientists want only one thing: tabular data for their analysis. They do not want to spend hours or days preparing data. How does a data engineer handle the massive amount of data being streamed at them from IoT devices and apps, and at the same time add structure to it, so that data scientists can focus on finding insights and not on preparing data? By the way, you need to do this within minutes (sometimes seconds). Oh... and there are a bunch more data sources that you need to ingest, and the current providers of data are changing their structure.
At GoPro, we have massive amounts of heterogeneous data being streamed at us from our consumer devices and applications, and we have developed a concept of "dynamic DDL" to structure our streamed data on the fly using Spark Streaming, Kafka, HBase, Hive, and S3. The idea is simple: add structure (schema) to the data as soon as possible, allow the providers of the data to dictate the structure, and automatically create event-based and state-based tables (DDL) for all data sources so that data scientists can access the data via their lingua franca, SQL, within minutes.
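This is not GoPro's actual pipeline, but a minimal Structured Streaming sketch of the same idea: apply a provider-supplied schema to JSON events from Kafka as early as possible and land them as a partitioned, SQL-queryable table. The broker, topic, schema, and output paths are assumptions.

```python
# Hedged sketch of "add schema as soon as possible" using Spark Structured
# Streaming. Broker, topic, schema, and output paths are hypothetical
# placeholders, not GoPro's actual pipeline.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, LongType, DoubleType

spark = SparkSession.builder.appName("dynamic-ddl-sketch").getOrCreate()

# Schema dictated by the data provider (could itself be loaded from a registry).
event_schema = StructType([
    StructField("device_id", StringType()),
    StructField("event_type", StringType()),
    StructField("value", DoubleType()),
    StructField("ts", LongType()),
])

raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker1.example.com:9092")
       .option("subscribe", "device_events")
       .load())

events = (raw
          .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
          .select("e.*")
          .withColumn("event_date", F.to_date((F.col("ts") / 1000).cast("timestamp"))))

# Land an event-based table that analysts can query with SQL (e.g. via Hive).
query = (events.writeStream
         .format("parquet")
         .option("path", "s3a://example-bucket/events/")
         .option("checkpointLocation", "s3a://example-bucket/checkpoints/events/")
         .partitionBy("event_date")
         .start())

query.awaitTermination()
```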
Transforming Data Architecture Complexity at Sears - StampedeCon 2013 – StampedeCon
At the StampedeCon 2013 Big Data conference in St. Louis, Justin Sheppard discussed Transforming Data Architecture Complexity at Sears. High ETL complexity and costs, data latency and redundancy, and batch window limits are just some of the IT challenges caused by traditional data warehouses. Gain an understanding of big data tools through the use cases and technology that enables Sears to solve the problems of the traditional enterprise data warehouse approach. Learn how Sears uses Hadoop as a data hub to minimize data architecture complexity – resulting in a reduction of time to insight by 30-70% – and discover “quick wins” such as mainframe MIPS reduction.
Empowering you with Democratized Data Access, Data Science and Machine Learning – DataWorks Summit
Data science, with its specialized tools and knowledge, has been a forte of data scientists. However, it is not easy even for data scientists to get access to data that could be in different data stores in the organization. To unleash the power of data and gain valuable insights, machine learning needs to be made easily consumable by various stakeholders and access to data made simpler. As an organization's data volumes continue to grow, delivering these insights in real time is a complex challenge to solve.
This session will provide an overview of an approach to building a scalable solution where machine and deep learning and access to data are made much more consumable and simpler by the fastest SQL-on-Hadoop engine on the planet, a rich data scientist toolset, and an infrastructure that can deliver the responsiveness needed for production environments.
Speakers:
Pandit Prasad, Program Director, IBM
Ashutosh Mate, Global Senior Solutions Architect, IBM
The document discusses culture and empathy and includes prompts for reflection on personal cultural influences and experiences putting oneself in someone else's shoes. It then introduces a guessing game activity and mentions an upcoming personal project in a future well-being lesson. The rest of the document provides autobiographical details about the author, including their birthplace, family, childhood, education experiences, and career path.
This document discusses tools for rapidly implementing Oracle E-Business Suite, including Oracle Business Accelerators (OBA) for automated configuration, User Productivity Kit (UPK) for recording training sessions, and Data Load Professional for automated data conversion. It provides an example project for KU Children's Services where OBA configured the system in 2 weeks, UPK created training materials, and Data Load Professional streamlined data migration. The tools helped reduce the implementation time from several months to just 3 months for KU to go live on the new system.
Hexaware implemented Oracle E-Business Suite for a leading manufacturer to optimize processes across multiple locations. The client needed to consolidate infrastructure and standardize systems to improve efficiency. Hexaware implemented Oracle modules for financials, purchasing, manufacturing, inventory, order management, and HR using its methodology to accelerate the process. This helped reduce cycle times and support optimization goals. The client appreciated Hexaware's guidance and commitment to the project schedule.
This document discusses using a Work Breakdown Structure (WBS) to organize an ERP implementation project. It describes the typical phases of an ERP project based on the ASAP methodology, including project preparation, business blueprint, realization, final preparation, and go-live and support. It then proposes expanding these phases into 11 work packages in a WBS to further break down the project into independent tasks and address potential challenges at each stage. These include activities like system architecture design, configuration and development, data conversion, testing, training, and cutover. The WBS provides a framework for managing the project by assigning responsibilities, estimating costs, and tracking progress for each defined work package and task.
The document provides an overview and introduction to Oracle's Project Management Method (PJM). PJM is Oracle's standardized approach to project management for information technology projects. It focuses on defining client expectations, maintaining visibility, and implementing control mechanisms. PJM is organized around five core management processes: Control and Reporting, Work Management, Resource Management, Quality Management, and Configuration Management. It also structures tasks into Planning, Control, and Execution categories to support the project life cycle.
Large Complex Projects (PMI-MY presentation Sept 2012) – Jeremie Averous
Large, complex project management is fundamentally different from smaller, simpler project management. In this comprehensive presentation for the Project Management Institute, we delve into the details of what needs to be set up to be successful in these world-changing ventures.
This document outlines the key steps in planning an ERP system implementation project. It discusses defining the phases of implementation including pre-evaluation screening of software packages, package evaluation, and the project planning phase. The project planning phase involves designing the implementation process, establishing time schedules and deadlines, assigning roles and responsibilities, deciding on resources and the project team, and planning contingencies to monitor progress and make corrections if needed. The planning is carried out by a committee of team leaders headed by the ERP in-charge to chart the course of action.
The document discusses the benefits of exercise for mental health. Regular physical activity can help reduce anxiety and depression and improve mood and cognitive functioning. Exercise causes chemical changes in the brain that may help protect against mental illness and improve symptoms.
This document summarizes the upgrade path to Oracle E-Business Suite Release 12.1.3 and new enhancements in 12.1.3. It discusses the interoperable database versions and technology updates from R12.1.1 to R12.1.3, including AutoConfig, menu security, and in-context diagnostics enhancements. It also provides Apps DBA, developer, and end-user perspectives on changes in R12.1.3 such as lightweight MLS, case-sensitive passwords, and integrated concurrent processing with OBI Publisher.
Critical Success Factors in Implementation of ERP Systems – Stephen Coady
Project report published for a Master's degree course research on Critical Success Factors in the Implementation of ERP Systems. A literature review of journals was used to develop the research questionnaire answered by managers and executives involved in the process of selection of an ERP System.
This plan provides any parties interested in implementing Oracle Applications with a framework for doing so. It contains the detailed tasks involved and lists the associated resources that may be needed. The Work Breakdown Structure (WBS) codes tie back in to the Oracle AIM documents that should be prepared for each task and phase.
1) The document discusses critical success factors (CSFs) for successful enterprise resource planning (ERP) project implementation. It identifies top management support, end user involvement, user training, vendor selection, and vendor support as key CSFs from a case study of an ERP implementation at a large financial services company.
2) The case study involved implementing PeopleSoft ERP software on a big-bang basis across Asia Pacific and Japan regions. A phased approach was taken, starting with pilots in Singapore and Hong Kong.
3) The document proposes a six-factor model for ERP implementation success incorporating management, technology, operational, human resource, attitudinal, and communication factors.
ERP Implementation Failure at Hershey Food Corporation – Olivier Tisun
Hershey Foods implemented an ERP system in 1999 that led to major problems. The key issues were:
1) Modules were simultaneously implemented instead of sequentially, leading to integration problems.
2) Training and testing were inadequate.
3) Peak season was chosen for the implementation.
4) Top management lacked IT expertise and did not properly oversee the project.
5) Problems were exacerbated by the looming Y2K issue and demand from retailers.
The failed implementation resulted in delivery delays, lost revenue, and excess inventory. Hershey later took steps to stabilize the systems and implement improvements.
Hershey, a leading chocolate manufacturer, needed to replace its legacy systems to address Y2K issues and enable more efficient operations. It implemented SAP, Siebel, and Manugistics software in a big bang approach over 30 months. However, the system went live during their busiest season, and they were unable to fulfill Halloween orders, which significantly hurt sales and profits. Key lessons learned were that enterprise software requires business process change, adequate testing is needed, and careful timing of go-live is important. After upgrades and improvements, Hershey now has near 99.96% inventory accuracy and can fulfill orders within 24-48 hours.
This document provides instructions for setting up the inventory organization structure for Oracle Application R12. It includes steps for defining a primary ledger and operating unit, custom inventory responsibility, security profile, workday calendar, item master organization, locations, subinventories, and other foundational elements. The goal is to establish the necessary setup for Inbox Business Technologies to use Oracle Inventory functionality.
New features in Oracle Fusion Financial accounts receivables and account paya... – Jade Global
Learn about the latest features and benefits of Oracle Fusion Financials accounts receivable and accounts payable. For more details, please visit: http://www.jadeglobal.com
The document outlines the project preparation and planning process for implementing new CRM and ERP systems, including forming project teams, drafting plans and documentation, procuring vendors, and designing "as-is" and "to-be" business processes for finance, supply chain management, and HR modules. It also describes subsequent phases for system design, testing, data migration, user acceptance testing, and project go-live. A sample risk register is included to manage potential risks to the project.
1) A data warehouse is a collection of data from multiple sources used to enable informed decision making. It contains data, metadata, dimensions, facts and aggregates.
2) The typical processes in a data warehouse are extract and load, data cleaning and transformation, user queries, and data archiving.
3) The key components that manage these processes are the load manager, warehouse manager and query manager. The load manager extracts, loads and does simple transformations on the data. The warehouse manager performs more complex transformations, integrity checks and generates summaries. The query manager directs user queries to the appropriate data.
The document discusses the extraction, transformation, and loading (ETL) process used in data warehousing. It describes how ETL tools extract data from operational systems, transform the data through cleansing and formatting, and load it into the data warehouse. Metadata is generated during the ETL process to document the data flow and mappings. The roles of different types of metadata are also outlined. Common ETL tools and their strengths and limitations are reviewed.
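As a toy illustration of that extract, transform, and load flow (the file name, columns, and SQLite target are hypothetical stand-ins for real operational sources and a real warehouse):

```python
# Toy extract-transform-load sketch in Python; file names, columns, and the
# SQLite warehouse target are hypothetical placeholders.
import sqlite3
import pandas as pd

# Extract: pull raw records from an operational export.
raw = pd.read_csv("orders_export.csv")

# Transform: cleanse and standardise (drop incomplete rows, normalise codes,
# derive a conformed date column for the warehouse).
clean = (raw.dropna(subset=["order_id", "amount"])
            .assign(country=lambda df: df["country"].str.upper(),
                    order_date=lambda df: pd.to_datetime(df["order_date"]).dt.date))

# Load: write into the warehouse table; metadata about this run could be
# recorded alongside it to document the data flow and mappings.
with sqlite3.connect("warehouse.db") as conn:
    clean.to_sql("fact_orders", conn, if_exists="append", index=False)
```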
Equnix Business Solutions (Equnix) is an IT Solution provider in Indonesia, providing comprehensive solution services especially on the infrastructure side for corporate business needs based on research and Open Source. Equnix has 3 (three) main services known as the Trilogy of Services: Support (Maintenance/Managed), World class level of Software Development, and Expert Consulting and Assessment for High Performance Transactions System. Equnix is customer oriented, not product or principal. Equal opportunity based on merit is our credo in managing HR development.
Logical replication allows migration between different hardware, operating systems, and Oracle versions with minimal downtime. It works by reading the redo logs of the source database in real time and applying the changes to the target database. Some preparation is required, such as testing and validating the migration. If issues occur during cutover to the 12c target, the original production system remains intact with no data risk. Logical replication provides an effective method for migrating to Oracle 12c with zero or near-zero downtime.
Oracle 12c offers many new features, and upgrading the database can bring many advantages to an organization. There are various upgrade and migration methods available, and the best method to use for your upgrade/migration scenario depends on the source database version, the source and destination operating systems, your downtime requirements, and the personal preference of the DBA. Based upon these factors, there is a method available to best fit your organization's needs.
This document discusses common data management challenges and evaluates different tools for moving data between SAP systems. It analyzes tools like TDMS, client copy, custom programs, R3Trans and manual entry based on factors such as data volume, technical difficulty, disruption caused and skill requirements. The document recommends using lower tier tools like CATT, custom programs or manual entry for low volume data, and higher tier tools for larger amounts of frequently copied data.
Oracle GoldenGate provides real-time data integration and replication capabilities. It uses non-intrusive change data capture to replicate transactional changes in real-time across heterogeneous database environments with sub-second latency. GoldenGate has over 500 customers across various industries and supports workloads involving terabytes of data movement per day. It extends Oracle's data integration and high availability capabilities beyond Oracle databases to other platforms like SQL Server and MySQL.
Saurabh Kumar Gupta is presenting to the Special Selection Committee for a promotion. He has over 10 years of experience as a Project Engineer working with Oracle databases, Tuxedo, and WebLogic technologies. In his role, he has led installations, migrations, performance tuning, and support work. He is seeking a job profile as a core database and storage team member or team lead. He highlights past work optimizing the FOIS infrastructure and contributions to projects implementing industry best practices.
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010 – Bhupesh Bansal
Jan 22nd, 2010 Hadoop meetup presentation on Project Voldemort and how it plays well with Hadoop at LinkedIn. The talk focuses on the LinkedIn Hadoop ecosystem: how LinkedIn manages complex workflows, data ETL, data storage, and online serving of 100 GB to TB of data.
The document discusses Project Voldemort, a distributed key-value storage system developed at LinkedIn. It provides an overview of Voldemort's motivation and features, including high availability, horizontal scalability, and consistency guarantees. It also describes LinkedIn's use of Voldemort and Hadoop for applications like event logging, online lookups, and batch processing of large datasets.
This document discusses data migration in Oracle E-Business Suite. It covers migrating data to Oracle using open interfaces/APIs, Oracle utilities like FNDLOAD and iSetup, and third party tools like DataLoad and Mercury Object Migrator. It also discusses migrating data from Oracle by creating materialized views or using the Business Event System to define custom events. The document provides an overview of different data migration scenarios and options for loading both setup, master, and transactional data in Oracle E-Business Suite.
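As a schematic sketch of the open-interface pattern mentioned above: rows are staged into an interface table, and a seeded import concurrent program then validates and loads them into the base tables. The connection details, table, and column names below are hypothetical placeholders, not an actual EBS interface definition:

```python
# Schematic sketch of the open-interface pattern: stage rows into an interface
# table, then let the seeded import concurrent program validate and load them.
# Connection details, table, and column names are hypothetical placeholders.
import cx_Oracle

conn = cx_Oracle.connect("apps", "apps_password", "ebs-db.example.com/EBSDB")
cur = conn.cursor()

rows = [
    ("ITEM-1001", "Widget, small", "EA"),
    ("ITEM-1002", "Widget, large", "EA"),
]

cur.executemany(
    """
    INSERT INTO xx_item_interface_stg (item_number, description, uom_code, status)
    VALUES (:1, :2, :3, 'NEW')
    """,
    rows,
)
conn.commit()

# Next step (outside this script): submit the corresponding import concurrent
# program from EBS so the staged rows are validated and moved to base tables.
```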
Deri explains the process of migrating from on-premises SAP to the cloud. The presentation also covers the stages of migration, the priority activities when carrying out the migration, and the benefits gained once all stages of the cloud migration have been completed.
Double-Take Share allows for easy, automated data sharing between databases, applications, and platforms. It captures changes made to the source database, transforms the data as needed, and replicates it to the desired target. This eliminates the need for manual data integration and allows businesses to be more productive, profitable, and see an immediate ROI. Double-Take Share supports replication between various database types and platforms, and provides monitoring and alerting to ensure accurate and reliable data sharing.
The document discusses using data virtualization and masking to optimize database migrations to the cloud. It notes that traditional copying of data is inefficient for large environments and can incur high data transfer costs in the cloud. Using data virtualization allows creating virtual copies of production databases that only require a small storage footprint. Masking sensitive data before migrating non-production databases ensures security while reducing costs. Overall, data virtualization and masking enable simpler, more secure, and cost-effective migrations to cloud environments.
This document outlines the steps to execute a database platform migration using Zero Data Loss Recovery Appliance (ZDLRA). It discusses ZDLRA backup and restore strategies using incremental forever backups and virtual full backups for fast restore. The presentation covers both cross-endian and same-endian database migration processes using ZDLRA, including automating steps with the dbmigusera.pl tool. A customer case study shows how a semiconductor manufacturer consolidated databases to Exadata using ZDLRA for near-zero downtime migration.
- What is Data Warehousing?
- Who needs Data Warehousing?
- Why is a Data Warehouse required?
- Types of Systems: OLTP and OLAP
- Maintenance of the Data Warehouse
- Data Warehousing Life Cycle
What's new in Oracle Database 12c release 12.1.0.2 – Connor McDonald
This document provides an overview of new features in Oracle Database 12c Release 1 (12.1.0.2). It discusses Oracle Database In-Memory for accelerating analytics, improvements for developers like support for JSON and RESTful services, capabilities for accessing big data using SQL, enhancements to Oracle Multitenant for database consolidation, and other performance improvements. The document also briefly outlines features like Oracle Rapid Home Provisioning, Database Backup Logging Recovery Appliance, and Oracle Key Vault.
Data ingestion using NiFi - Quick Overview – Durga Gadiraju
* Overview of NiFi
* Understanding NiFi Layout as a service
* Key Concepts such as Flow Files, Attributes etc
* Understanding how to access the documentation
* Capabilities of NiFi as a Data Ingestion Tool
* NiFi vs. Traditional ETL Tools
* Role of NiFi in Data Engineering at Scale
* Simple pipeline to copy files from the Local File System to HDFS (see the sketch below)
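NiFi itself would build that last pipeline visually with a GetFile processor feeding a PutHDFS processor. Purely as a point of comparison (this is not NiFi), the same copy expressed directly against WebHDFS in Python, with the NameNode URL and paths as placeholders:

```python
# Not NiFi: a plain-Python comparison of what a GetFile -> PutHDFS flow does,
# using the WebHDFS client from the `hdfs` package. URL and paths are
# hypothetical placeholders.
import os
from hdfs import InsecureClient

client = InsecureClient("http://namenode.example.com:9870", user="etl")

local_dir = "/data/incoming"
hdfs_dir = "/landing/incoming"

for name in os.listdir(local_dir):
    local_path = os.path.join(local_dir, name)
    if os.path.isfile(local_path):
        # Upload each file; overwrite=True mirrors a simple "replace" policy.
        client.upload(f"{hdfs_dir}/{name}", local_path, overwrite=True)
        print("copied", name)
```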
As part of this session, I will be giving an introduction to Data Engineering and Big Data. It covers up-to-date trends.
* Introduction to Data Engineering
* Role of Big Data in Data Engineering
* Key Skills related to Data Engineering
* Overview of Data Engineering Certifications
* Free Content and ITVersity Paid Resources
Don't worry if you miss the video - you can click on the link below to go through the video after the schedule.
https://youtu.be/dj565kgP1Ss
* Upcoming Live Session - Overview of Big Data Certifications (Spark Based) - https://www.meetup.com/itversityin/events/271739702/
Relevant Playlists:
* Apache Spark using Python for Certifications - https://www.youtube.com/playlist?list=PLf0swTFhTI8rMmW7GZv1-z4iu_-TAv3bi
* Free Data Engineering Bootcamp - https://www.youtube.com/playlist?list=PLf0swTFhTI8pBe2Vr2neQV7shh9Rus8rl
* Join our Meetup group - https://www.meetup.com/itversityin/
* Enroll for our labs - https://labs.itversity.com/plans
* Subscribe to our YouTube Channel for Videos - http://youtube.com/itversityin/?sub_confirmation=1
* Access Content via our GitHub - https://github.com/dgadiraju/itversity-books
* Lab and Content Support using Slack
IT Versity was founded in 2015 by Durga Gadiraju as an online training portal and YouTube channel for emerging technologies like Big Data and Cloud Computing. It has since expanded its services to include engineering, infrastructure management, staffing, and partnerships with colleges and companies. ITVersity now has offices in India and the US, and provides online training, consulting services, and infrastructure management to help individuals and organizations obtain skills in areas such as data engineering, product engineering, and DevOps.
Big Data Certifications Workshop - 201711 - Introduction and Database Essentials – Durga Gadiraju
This document provides an agenda and details for a comprehensive developer workshop on Spark-based big data certifications. The workshop will cover topics like an introduction to big data, popular certifications, and a curriculum including Linux, SQL, Python, Spark, and more. It will run for 4 days a week for 8 weeks, with sessions at different times for India and the US. The course fee is $495 or 25,000 INR and includes recorded videos, pre-recorded courses, 3-4 months of lab access, and a certification simulator. A separate document provides details on a database essentials course covering SQL using Oracle and Application Express.
Big Data Certifications Workshop - 201711 - Introduction and Linux Essentials – Durga Gadiraju
This document provides an agenda for a comprehensive developer workshop on Spark-based big data certifications offered by ITVersity. The workshop will cover topics like Linux essentials, Spark, Spark SQL, streaming analytics using Spark Streaming and MLlib over 4-8 weeks. Students will learn shell scripting and have an exercise to monitor multiple servers using a shell script. The course fee is $495 or 25,000 INR and will include recorded videos, pre-recorded certification courses and a 3-4 month lab access period.
This presentation covers the curriculum for the HDPCD Spark certification using Python as the programming language. HDPCD stands for Hortonworks Data Platform Certified Developer. This scenario-based examination is one of the well-recognized Big Data developer certifications.
This document provides an overview of big data and the Spark framework. It discusses the big data ecosystem, including file systems, data ingestion tools, batch and real-time data processing frameworks, visualization tools, and support technologies. It outlines common big data job roles and their associated skills. The document then focuses on Spark, describing its core functionality, modules like DataFrames and MLlib, and execution modes. It provides guidance on learning Spark, emphasizing programming skills and Spark APIs. A demo of Spark fundamentals on a big data lab is also proposed.
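A compact PySpark DataFrame example of the kind of fundamentals such a demo would cover; the data is defined inline, so nothing external is assumed:

```python
# Compact PySpark DataFrame example of the fundamentals a demo like this
# would cover; the data is defined inline, so nothing external is assumed.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dataframe-basics").getOrCreate()

df = spark.createDataFrame(
    [("alice", "books", 12.5), ("bob", "games", 30.0), ("alice", "games", 5.0)],
    ["user", "category", "amount"],
)

(df.groupBy("category")
   .agg(F.sum("amount").alias("total"), F.count("*").alias("orders"))
   .orderBy(F.desc("total"))
   .show())

spark.stop()
```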
This presentation covers "Introduction to Big Data" for enterprises. It includes challenges and benefits of Big Data including transition plan based on few case studies.
When deliberating between CodeIgniter and CakePHP for web development, consider their respective strengths and your project requirements. CodeIgniter, known for its simplicity and speed, offers a lightweight framework ideal for rapid development of small to medium-sized projects. It's praised for its straightforward configuration and extensive documentation, making it beginner-friendly. Conversely, CakePHP provides a more structured approach with built-in features like scaffolding, authentication, and ORM. It suits larger projects requiring robust security and scalability. Ultimately, the choice hinges on your project's scale, complexity, and your team's familiarity with the frameworks.
Mobile App Development Services | Drona Infotech
Drona Infotech is one of the best mobile app development companies in Noida, offering maintenance and ongoing support. Mobile app development services can help you maintain and support your app after it has been launched. This includes fixing bugs, adding new features, and keeping your app up to date with the latest...
SOCRadar's Aviation Industry Q1 Incident Report is out now!
The aviation industry has always been a prime target for cybercriminals due to its critical infrastructure and high stakes. In the first quarter of 2024, the sector faced an alarming surge in cybersecurity threats, revealing its vulnerabilities and the relentless sophistication of cyber attackers.
SOCRadar’s Aviation Industry, Quarterly Incident Report, provides an in-depth analysis of these threats, detected and examined through our extensive monitoring of hacker forums, Telegram channels, and dark web platforms.
OpenMetadata Community Meeting - 5th June 2024 – OpenMetadata
The OpenMetadata Community Meeting was held on June 5th, 2024. In this meeting, we discussed the data quality capabilities that are integrated with the Incident Manager, providing a complete solution to handle your data observability needs. Watch the end-to-end demo of the data quality features.
* How to run your own data quality framework
* What is the performance impact of running data quality frameworks
* How to run the test cases in your own ETL pipelines
* How the Incident Manager is integrated
* Get notified with alerts when test cases fail
Watch the meeting recording here - https://www.youtube.com/watch?v=UbNOje0kf6E
Preparing Non-Technical Founders for Engaging a Tech Agency | ISH Technologies
Preparing non-technical founders before engaging a tech agency is crucial for the success of their projects. It starts with clearly defining their vision and goals, conducting thorough market research, and gaining a basic understanding of relevant technologies. Setting realistic expectations and preparing a detailed project brief are essential steps. Founders should select a tech agency with a proven track record and establish clear communication channels. Additionally, addressing legal and contractual considerations and planning for post-launch support are vital to ensure a smooth and successful collaboration. This preparation empowers non-technical founders to effectively communicate their needs and work seamlessly with their chosen tech agency. Visit www.ishtechnologies.com.au for more details.
Zoom is a comprehensive platform designed to connect individuals and teams efficiently. With its user-friendly interface and powerful features, Zoom has become a go-to solution for virtual communication and collaboration. It offers a range of tools, including virtual meetings, team chat, VoIP phone systems, online whiteboards, and AI companions, to streamline workflows and enhance productivity.
GraphSummit Paris - The art of the possible with Graph Technology | Neo4j
Sudhir Hasbe, Chief Product Officer, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
Top Features to Include in Your Winzo Clone App for Business Growth | rickgrimesss22
Discover the essential features to incorporate in your Winzo clone app to boost business growth, enhance user engagement, and drive revenue. Learn how to create a compelling gaming experience that stands out in the competitive market.
E-commerce Application Development Company | Hornet Dynamics
Your business can reach new heights with our assistance as we design solutions that are specifically appropriate for your goals and vision. Our eCommerce application solutions can digitally coordinate all retail operations processes to meet the demands of the marketplace while maintaining business continuity.
What is Augmented Reality Image Tracking | pavan998932
Augmented Reality (AR) Image Tracking is a technology that enables AR applications to recognize and track images in the real world, overlaying digital content onto them. This enhances the user's interaction with their environment by providing additional information and interactive elements directly tied to physical images.
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris | Neo4j
Dr. Jesús Barrasa, Head of Solutions Architecture for EMEA, Neo4j
Discover the latest innovations from Neo4j, including the latest cloud integrations and product improvements that make Neo4j an essential choice for developers building applications with interconnected data and generative AI.
Software Engineering, Software Consulting, Tech Lead, Spring Boot, Spring Cloud, Spring Core, Spring JDBC, Spring Transaction, Spring MVC, OpenShift Cloud Platform, Kafka, REST, SOAP, LLD & HLD.
Hand Rolled Applicative User Validation Code Kata | Philip Schwarz
Could you use a simple piece of Scala validation code (granted, a very simplistic one too!) that you can rewrite, now and again, to refresh your basic understanding of Applicative operators <*>, <*, *>?
The goal is not to write perfect code showcasing validation, but rather to provide a small, rough-and-ready exercise to reinforce your muscle memory.
Despite its grandiose-sounding title, this deck consists of just three slides showing the Scala 3 code to be rewritten whenever the details of the operators begin to fade away.
The code is my rough and ready translation of a Haskell user-validation program found in a book called Finding Success (and Failure) in Haskell - Fall in love with applicative functors.
AI Pilot Review: The World's First Virtual Assistant Marketing Suite | Google
https://sumonreview.com/ai-pilot-review/
AI Pilot Review: Key Features
✅Deploy AI expert bots in Any Niche With Just A Click
✅With one keyword, generate complete funnels, websites, landing pages, and more.
✅More than 85 AI features are included in the AI pilot.
✅No setup or configuration; use your voice (like Siri) to do whatever you want.
✅You Can Use AI Pilot To Create your version of AI Pilot And Charge People For It…
✅ZERO Manual Work With AI Pilot. Never write, Design, Or Code Again.
✅ZERO Limits On Features Or Usages
✅Use Our AI-powered Traffic To Get Hundreds Of Customers
✅No Complicated Setup: Get Up And Running In 2 Minutes
✅99.99% Up-Time Guaranteed
✅30 Days Money-Back Guarantee
✅ZERO Upfront Cost
2. About me
Technology leader and evangelist with deep expertise in databases, data warehousing, and data integration, using tools such as Oracle, GoldenGate, Informatica, the Hadoop ecosystem, HBase, Cassandra, and MongoDB.
Executed a zero-downtime, cross-platform migration and upgrade of a 10 TB MDM database for Citigroup using GoldenGate and custom code.
Executed a minimal-downtime, cross-data-center, cross-platform migration and upgrade of 150 TB of data warehouse databases from Mexico to the US using a custom tool built in PL/SQL.
3. Overview
Oracle is synonymous with relational databases and is used extensively for mission-critical online and transactional systems. It is the leading and most advanced relational database, and Oracle consistently ships both minor and major releases with new features. Enterprises need to upgrade their Oracle databases to leverage these features. Lately, most enterprises are also consolidating hardware to cut operational costs. Both upgrade and consolidation efforts require migrating databases.
4. Upgrade and Migration requirements
Upgrades
In-place upgrade
Out-of-place upgrade
Migrations
Zero-downtime migration
Minimal-downtime migration
Cross-platform migration (might include non-ASM to ASM)
Cross-datacenter migration
5. Tools and Techniques
Backup, Restore and Recover
Export and Import using Data Pump
ETL tools – not a very good approach, and hence out of scope for this discussion
Custom Tools
* GoldenGate needs to be used for zero-downtime migration
6. In-Place Upgrade
Steps
Bring down the applications and shut down the database
Perform the in-place upgrade
Start the database and bring the applications back up
Advantages
It is the most straightforward way of upgrading Oracle databases
It works very well for small to medium databases that can tolerate a few hours of downtime
Challenges
Not practical for multi-terabyte databases
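For reference, on a recent release the basic in-place flow above might look roughly like the sketch below. Commands and scripts vary by version (older releases use catctl.pl/catupgrd.sql instead of dbupgrade), so treat this as an illustrative assumption rather than the author's exact procedure.

-- From the old home (SQL*Plus): stop the applications, then the database
SQL> SHUTDOWN IMMEDIATE;
-- From the new Oracle home: open the database in upgrade mode
SQL> STARTUP UPGRADE;
-- Run the command-line upgrade utility shipped with 12.2 and later
$ $ORACLE_HOME/bin/dbupgrade
-- Restart normally and recompile invalid objects
SQL> STARTUP;
SQL> @?/rdbms/admin/utlrp.sql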
7. Out-of-Place Upgrade
Steps
Build the target database with the desired version
Migrate data from the source database to the target database
Redirect applications to the target database
Considerations
Reliable testing framework
Solid fall-back plan for any unforeseen issues
Pre-migrate as much data as possible
Performance testing
8. Migrations
Migrations as part of an upgrade
Zero-downtime migration
Minimal-downtime migration
Cross-platform migration (might include non-ASM to ASM)
Cross-datacenter migration
* At times we need to do several of these at once as part of a migration
9. Migrations – Zero Downtime
Build the target database
Migrate data to the target database
Set up GoldenGate replication to keep the databases in sync
Set up Veridata to validate that the data is in sync
Test applications against the target database; make sure the target database performs better than, or at a similar level to, the source database
Repeat the first four steps (if you use a snapshot for application testing, there is no need to rebuild)
Cut over applications to the target database
Set up reverse replication (from the new database back to the old one) for the desired amount of time
Delete the old database after confirming the migration/upgrade is successful
10. Migrations – Non-Zero Downtime
Build the target database
Migrate all the historic and static data to the target database
Develop a catch-up process to pick up the delta (sketched below)
Run the catch-up process at regular intervals
Test applications against the target database; make sure the target database performs better than, or at a similar level to, the source database
Cut over applications to the target database
Thoroughly test all the reports that require the latest data
Enable the ETL process on the target database
If possible, continue running ETL on both the source and target databases (as a fall-back plan)
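A minimal sketch of such a catch-up step, assuming the source tables carry a reliable last-modified timestamp (here hypothetically LAST_UPDATED), a key column ORDER_ID, and a database link named SRC_LINK; a real process would be generated per table rather than hand-written:

-- Hypothetical incremental catch-up for one table, driven by LAST_UPDATED
MERGE INTO app.orders tgt
USING (SELECT * FROM app.orders@src_link
       WHERE last_updated > (SELECT NVL(MAX(last_updated), DATE '1900-01-01')
                             FROM app.orders)) src
ON (tgt.order_id = src.order_id)
WHEN MATCHED THEN
  UPDATE SET tgt.status = src.status, tgt.last_updated = src.last_updated
WHEN NOT MATCHED THEN
  INSERT (order_id, status, last_updated)
  VALUES (src.order_id, src.status, src.last_updated);
COMMIT;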
11. Migrations – Challenges
Volume (if improperly planned)
The cost of the migration effort will grow exponentially with the volume of data
Requires additional storage on both source and target
Copying can take an enormous amount of time
Availability
Cutover time is critical for most databases; it should be zero or as short as possible
Fall-back plan
If something goes wrong there should be a way to fall back, especially for mission-critical transactional applications
Data integrity (a simple spot check is sketched below)
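For the data-integrity point, a quick spot check is to compare row counts (and, where practical, a numeric aggregate) between the target and the source over a database link. A minimal sketch, assuming a link named SRC_LINK and a hypothetical APP.ORDERS table:

-- Compare row counts between the target (local) and the source (remote)
SELECT (SELECT COUNT(*) FROM app.orders)          AS target_rows,
       (SELECT COUNT(*) FROM app.orders@src_link) AS source_rows
FROM dual;

-- Compare a numeric aggregate as a cheap sanity check
SELECT (SELECT SUM(order_total) FROM app.orders)          AS target_sum,
       (SELECT SUM(order_total) FROM app.orders@src_link) AS source_sum
FROM dual;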
12. Migrations – Challenges (RMAN)
Backup, restore, and recovery can take an enormous amount of storage and copy time over the network for very large databases
Without GoldenGate there is no feasible way to run catch-ups
There is no easy fall-back plan
13. Migrations – Challenges (Data Pump)
Export, copy, and import can take an enormous amount of time
Using export and import to catch up for the final cutover is not straightforward
Import is always serial for a given table (both partitioned and non-partitioned)
Even with parallel export and import, the overall data migration time is greater than or equal to the time to export and import the largest table and build its indexes
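For context, a parallel Data Pump run typically looks something like the following (the directory object, schema, and file names are placeholders); even with PARALLEL set, each individual table is still processed as described above:

# Parallel schema-level export on the source
expdp system DIRECTORY=dp_dir SCHEMAS=app PARALLEL=8 DUMPFILE=app_%U.dmp LOGFILE=app_exp.log

# Parallel import on the target after copying the dump files across
impdp system DIRECTORY=dp_dir SCHEMAS=app PARALLEL=8 DUMPFILE=app_%U.dmp LOGFILE=app_imp.log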
14. Do Yourself Parallelism
Given the challenges of migrating large databases that run into tens of terabytes, custom tools are required
Over time I have developed a tool that does this (DYPO – Do Yourself Parallel Oracle)
The idea is to achieve a true degree of parallelism while migrating the data
15. DYPO (Architecture)
Uses PL/SQL code
Runs on the target database
Computes rowid ranges for the tables or partitions, if required (sketched below)
Selects data over a database link
Inserts using direct-path (APPEND) or conventional insert, depending on the size of the table
Ability to run multiple jobs to load multiple tables, multiple partitions, or multiple rowid chunks
Ability to control the number of jobs that can run at a given point in time
Keeps track of successful and failed inserts, including row counts
Ability to catch up by dropping dropped partitions, adding new partitions, and loading data from only the new partitions
An extension of the tool can get counts from the source tables in parallel (which is key for validation)
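A rough, simplified sketch of the rowid-range idea described above (not the actual DYPO code): ranges are derived from the source table's extents over the database link, stored in a staging table, and each chunk is then copied with a direct-path insert. SRC_LINK, APP.BIG_TABLE, and DYPO_CHUNKS are placeholder names, and partitioned tables would also need the subobject name handled.

-- Hypothetical staging table for the computed chunks
CREATE TABLE dypo_chunks (table_name VARCHAR2(128), lo_rid ROWID, hi_rid ROWID);

-- Derive one rowid range per extent of the source table
INSERT INTO dypo_chunks (table_name, lo_rid, hi_rid)
SELECT e.segment_name,
       DBMS_ROWID.ROWID_CREATE(1, o.data_object_id, e.relative_fno, e.block_id, 0),
       DBMS_ROWID.ROWID_CREATE(1, o.data_object_id, e.relative_fno,
                               e.block_id + e.blocks - 1, 32767)
FROM   dba_extents@src_link e
JOIN   dba_objects@src_link o
       ON o.owner = e.owner AND o.object_name = e.segment_name
WHERE  e.owner = 'APP' AND e.segment_name = 'BIG_TABLE';

-- Each job then copies one chunk with a direct-path insert over the link
INSERT /*+ APPEND */ INTO app.big_table
SELECT * FROM app.big_table@src_link
WHERE  rowid BETWEEN :lo_rid AND :hi_rid;
COMMIT;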
16. DYPO Advantages
No additional storage required
No additional scripting required to copy data in parallel
Code is completely written in PL/SQL
Keeps track of what succeeded and what failed; if a migration fails part-way through a very large table, only the failed chunks need to be copied again
No need for a separate process to create indexes, as indexes can be pre-created while the data is being copied
It can be used as the baseline data migration before starting GoldenGate replication for a zero-downtime migration
It can be used effectively to pre-migrate most of the data for very large operational data stores and data warehouses, minimizing downtime at cutover
17. DYPO – Pre-requisites
Read-only access to the source database
SELECT_CATALOG_ROLE for the user, plus some system privileges such as SELECT ANY TABLE
Export and import the table definitions using Data Pump with no data (metadata only)
Increase INITRANS on all the tables in the target database
Disable logging on the target database
Constraints have to be disabled while the migration is running and enabled with NOVALIDATE after the migration is done (see the sketch after this list)
The source database can be 8i or later
The target database can be 10g or later
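Illustrative commands for a few of these prerequisites; the directory object, schema, table, and constraint names are placeholders:

# Metadata-only copy of the table definitions (no rows)
expdp system DIRECTORY=dp_dir SCHEMAS=app CONTENT=METADATA_ONLY DUMPFILE=app_meta.dmp

-- Target-side preparation for a hypothetical APP.BIG_TABLE
ALTER TABLE app.big_table INITRANS 16;
ALTER TABLE app.big_table NOLOGGING;
ALTER TABLE app.big_table DISABLE CONSTRAINT big_table_fk1;

-- After the migration completes, re-enable without revalidating existing rows
ALTER TABLE app.big_table ENABLE NOVALIDATE CONSTRAINT big_table_fk1;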
18. DYPO – Known Issues
Not tested with special data types (such as CLOB, BLOB, XML, etc.)
Not tested with clustered tables