Running E-Business Suite Database on Oracle Database Appliance - Maris Elsins
This is my Collaborate 13 presentation.
ODA is a pre-configured, simple-to-set-up, high-performance engineered system running an 11gR2 cluster. It is a great choice for small to medium-sized databases and, if you wish, it can be used for the Oracle EBS database too. This paper will show you how the standardized configuration of ODA can be adjusted to comply with the specific requirements of E-Business Suite without sacrificing ODA's flexibility and supportability. The paper will also share the author's experience migrating, running, and maintaining an R12 database tier on ODA.
Best Practices for the Most Impactful Oracle Database 18c and 19c Features - Markus Michalewicz
Oracle OpenWorld 2019 featured a presentation on best practices for high availability (HA) features in Oracle Database versions 12c, 18c, and 19c. The presentation covered key HA capabilities like Oracle Multitenant and Pluggable Databases, Data Guard, Hang Manager, and Real Application Clusters. It provided an overview of how each feature enables common lifecycle operations and maintenance tasks to be performed with minimal downtime.
You need to investigate some performance issues on an Oracle database, but you have no access to Oracle Enterprise Manager (OEM), or, what is worse, not even to SQL*Plus. Where do you start? What is the first query you ask the DBA to execute for you? What are your second and third? Do you request an AWR report? For which snapshots? This scenario is not uncommon or unheard of; it happens often to third-party consultants, or even to internal DBAs and developers, when the administration of the database has been subcontracted and access has been restricted.
Performance is not the only case where you may need to reach a database and struggle with access; a database health-check or collecting historical performance data for a capacity-planning exercise can run into the same issues. What you wish you had is access to a restricted SQL*Plus account, even if remote, and a toolset to collect as much information as possible from the database of interest; or to ask a DBA with access to this database to simply run this toolset and give you back all the output so you can find answers to most of your questions.
eDB360 is a free tool that installs nothing on the database, executes through a SQL*Plus connection, and produces a zip file with a comprehensive report providing a 360-degree view of an Oracle database. This session is about eDB360. It covers what is included in its output, how you execute this tool, and how it can be used to gain a fair understanding of an Oracle database. This session is for DBAs, developers, and consultants.
Benefits:
1. Learn how to get a fair 360-degree view of a database
2. Gather enough database information to start a health-check
3. Learn which performance metrics to collect for a sizing exercise
Customer Migration to Azure SQL Database from On-Premises SQL, for a SaaS App... - George Walters
Why would someone take a working on-premises SaaS infrastructure and migrate it to Azure? We review the technology decisions behind this conversion and the business choices behind migrating to Azure. The SQL Server 2012 infrastructure and application were migrated to PaaS services. Finally, we look at how we would build this architecture in 2019.
Fast and Furious: Handling Edge Computing Data With Oracle 19c Fast Ingest an... - Jim Czuprynski
The Internet of Things (IoT) has deep use cases - energy grids, communications, policing, security, and manufacturing. I’ll show how to use Oracle 19c’s Fast Ingest and Fast Lookup features to load IoT data from “edge” sources to take immediate advantage of that information in nearly real time.
Ben Prusinski is presenting on Oracle R12 E-Business Suite performance tuning. He will cover methodology, best practices, and techniques from basic to advanced. The presentation includes tuning at the infrastructure, application, and database levels with a focus on a holistic approach. Specific areas that will be discussed are concurrent manager tuning including queue size, sleep cycle, cache size, and number of processes.
The document provides an overview of the Oracle Exadata X10M Database Machine. Key points include:
- It features the latest 96-core AMD EPYC CPUs, up to 3TB of memory per database server, and 100Gb RDMA networking.
- Storage options include High Capacity servers with 264TB disk and 27.2TB flash, Extreme Flash servers with 122.88TB flash storage, and Extended servers with 264TB disk.
- The machines deliver extreme performance and scalability for all database workloads through automated management and database-optimized hardware and software.
Class lecture by Prof. Raj Jain on Storage Virtualization. The talk covers Disk Arrays, Data Access Methods, SCSI (Small Computer System Interface), Advanced Technology Attachment (ATA), ESCON and FICON, Fibre Channel, Fibre Channel Devices, Fibre Channel Protocol Layers, Fibre Channel Flow Control, Fibre Channel Classes of Service, What is Storage Virtualization?, Benefits of Storage Virtualization, Virtualizing Storage, RAID Levels, Nested RAIDs, Synchronous vs. Asynchronous Replication, Virtual Storage Area Network (VSAN), Physical Storage Network, Virtual Storage Network, SAN vs. NAS, iSCSI (Internet Small Computer System Interface), iFCP (Internet Fibre Channel Protocol), FCIP (Fibre Channel over IP), FCoE (Fibre Channel over Ethernet), and Virtual File Systems. A video recording is available on YouTube.
The document discusses Oracle Real Application Clusters (RAC) architecture and internals. A typical RAC configuration includes multiple nodes connected to a public network, interconnect, and shared storage. Oracle Grid Infrastructure manages the clusterware and Automatic Storage Management. It provides high availability of databases and other applications by enabling them to run on multiple nodes and utilize the shared storage. The document covers various RAC components like VIPs, listeners, SCAN, client connectivity, node membership, and the interconnect.
This document provides an overview of Red Hat JBoss Fuse, an open source integration platform. It discusses the history and components of JBoss Fuse, including Apache Camel, CXF, ActiveMQ, Karaf, and Fabric8. It describes how JBoss Fuse can enable integration everywhere in a real-time enterprise by integrating applications, services, devices, and partners through its lightweight footprint and deployment options both on-premises and in the cloud. The document also highlights key benefits of JBoss Fuse such as reducing costs, simplifying management, and enabling new business opportunities through greater connectivity and data sharing.
An introduction to the possibilities for integrating with the Dynamics 365 CE / PowerApps platform. Covers Flow, Logic Apps, and Azure Integration Services (Service Bus).
Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 1 - Tanel Poder
The document describes troubleshooting a complex performance issue in an Oracle database. Key details:
- The problem was sporadic extreme slowness of the Oracle database and server lasting 1-20 minutes.
- Initial AWR reports and OS metrics showed a spike at 18:10 with CPU usage at 66.89%, confirming a problem occurred then.
- Further investigation using additional metrics was needed to fully understand the root cause, as initial diagnostics did not provide enough context about this brief problem period.
London In-Memory Computing Meetup - A Change-Data-Capture use-case: designing... - Nicolas Fränkel
When one’s app is challenged with poor performance, it’s easy to set up a cache in front of one’s SQL database. It doesn’t fix the root cause (e.g. bad schema design, a bad SQL query, etc.), but it gets the job done. If the app is the only component that writes to the underlying database, it’s a no-brainer to update the cache accordingly, so the cache is always up to date with the data in the database.
Things start to go sour when the app is not the only component writing to the DB. Among other sources of writes, there are batches, other apps (shared databases exist, unfortunately), etc. One might think of a couple of ways to keep data in sync, e.g. polling the DB every now and then, or DB triggers. Unfortunately, they all have issues that make them unreliable and/or fragile.
In this talk, I will describe an easy-to-setup architecture that leverages CDC to have an evergreen cache.
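The idea behind the evergreen cache can be sketched in a few lines. Below is a minimal Python illustration (not the speaker's actual architecture, which would sit on a real CDC pipeline such as a log-based connector feeding a distributed cache): the cache is updated by applying change events emitted by the database, so it stays correct no matter which component performed the write. The event shape (`op`, `key`, `after`) is a simplifying assumption for the sketch.

```python
# Sketch: a cache kept "evergreen" by applying CDC events instead of
# relying on the app to update it on every write. The event dictionary
# format here is hypothetical, loosely modeled on log-based CDC output.

class EvergreenCache:
    """In-memory cache updated from a CDC event stream."""

    def __init__(self):
        self._data = {}

    def apply_cdc_event(self, event):
        # A CDC event carries the operation type and the row's after-image.
        op = event["op"]
        key = event["key"]
        if op in ("insert", "update"):
            self._data[key] = event["after"]
        elif op == "delete":
            self._data.pop(key, None)

    def get(self, key):
        return self._data.get(key)


# Any writer (the app, a batch job, another app) hits the database
# directly; the CDC pipeline emits events that keep the cache in sync
# without polling or triggers.
events = [
    {"op": "insert", "key": 1, "after": {"name": "alice"}},
    {"op": "update", "key": 1, "after": {"name": "alicia"}},
    {"op": "delete", "key": 1, "after": None},
]
cache = EvergreenCache()
for e in events:
    cache.apply_cdc_event(e)
```

The key design point is that the cache consumes the same ordered stream of changes the database itself produced, so it cannot diverge the way polling or app-side updates can.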
On the Application of AI for Failure Management: Problems, Solutions and Algo... - Jorge Cardoso
Artificial Intelligence for IT Operations (AIOps) is a class of software which targets the automation of operational tasks through machine learning technologies. ML algorithms are typically used to support tasks such as anomaly detection, root-cause analysis, failure prevention, failure prediction, and system remediation. AIOps is gaining increasing interest from industry due to the exponential growth of IT operations and the complexity of new technology. Modern applications are assembled from hundreds of dependent microservices distributed across many cloud platforms, leading to extremely complex software systems. Studies show that cloud environments are now too complex to be managed solely by humans. This talk discusses various AIOps problems we have addressed over the years and gives a sketch of the solutions and algorithms we have implemented. Interesting problems include hypervisor anomaly detection, root-cause analysis of software service failures using application logs, multi-modal anomaly detection, root-cause analysis using distributed traces, and verification of virtual private cloud networks.
The Top 5 Reasons to Deploy Your Applications on Oracle RAC - Markus Michalewicz
This document discusses the top 5 reasons to deploy applications on Oracle Real Application Clusters (RAC). It discusses how RAC provides:
1. Developer productivity through transparency that allows developers to focus on application code without worrying about high availability or scalability.
2. Integrated scalability for both applications and database features through techniques like parallel execution and cache fusion that allow linear scaling.
3. Seamless high availability for the entire application stack through capabilities like fast reconfiguration times and zero data loss that prevent application outages.
4. Isolated consolidation for converged use cases through features like pluggable database isolation that allow secure sharing of hardware resources.
5. Full flexibility to choose deployment options.
Automation Patterns for Scalable Secret Management - Mary Racter
So you’ve scaled your app up to 1000 instances. Do they all share the same credentials for access to stateful resources? Then the attack surface for your stateful resources just got scaled up too. Automated secret management lets you focus on scaling up your app, not your risk of data compromise.
This talk aims to introduce some important considerations in attack surface management at scale, and provide some patterns and tips on integrating secret management workflows into Continuous Deployment infrastructure.
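One pattern in this space is for each instance to hold only short-lived credentials fetched from a secret store, rather than baking a shared static credential into every deployment. Below is a hedged Python sketch of that idea; `fetch_secret` is a stand-in for a real secret-manager client (e.g. a Vault or cloud secrets API call), and the interface shown here is hypothetical, not from the talk.

```python
import time

# Sketch: instances cache a credential locally and refetch it from the
# secret store once its time-to-live expires. A leaked credential then
# stops working after one TTL instead of compromising everything forever.

class RotatingSecret:
    """Caches a secret locally and refetches it once its TTL expires."""

    def __init__(self, fetch_secret, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch_secret        # injected secret-store client call
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._expires_at = float("-inf")  # force a fetch on first use

    def get(self):
        now = self._clock()
        if now >= self._expires_at:
            self._value = self._fetch()   # fresh, short-lived credential
            self._expires_at = now + self._ttl
        return self._value


# Usage with a fake fetcher that issues a new token on every call:
counter = {"n": 0}

def fake_fetch():
    counter["n"] += 1
    return f"token-{counter['n']}"

secret = RotatingSecret(fake_fetch, ttl_seconds=0.0)  # ttl 0: refetch each call
```

In a Continuous Deployment setup, the fetcher would be wired to the pipeline's secret backend, so credential rotation needs no code change in the app itself.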
Oracle RAC is a clustered version of the Oracle database that uses a shared disk architecture. It allows multiple instances of the database to run concurrently on multiple nodes, providing high availability and scalability. The document discusses how clients can connect to Oracle RAC using SCAN, which provides a single virtual IP address and listener for the entire cluster, making client connections easier to manage. It also covers how SCAN works with load balancing and provides failover between instances in the cluster.
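As an illustration of the single-address idea, a client-side connect descriptor pointing at the SCAN might look like the fragment below (host and service names are hypothetical). The client resolves one name; the SCAN listener then redirects the connection to a node listener in the cluster.

```
PRODDB =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = prod-scan.example.com)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = proddb.example.com)
    )
  )
```

Because clients reference only the SCAN name, nodes can be added to or removed from the cluster without touching client configuration.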
Towards Digital Twin standards following an open source approach - FIWARE
Digital Twins are gaining momentum when designing smart solutions in different application domains. However, there is a lack of open standards that guarantee interoperability and portability of solutions and avoid vendor lock-in.
During the presentation, we will review major developments in this area, focusing on the adoption of a standard API for accessing Digital Twin data and Smart Data Models. We will review how a Digital Twin approach enables data integration at different levels: architecting vertical smart solutions, within smart organizations, and across organizations. At all levels, this means interfacing with IoT, Big Data, AI/ML, Blockchain, or Robotics technologies.
DOAG Oracle Unified Audit in Multitenant Environments - Stefan Oehrli
Oracle Audit is a well-known and proven database functionality. Or maybe not? What does auditing look like in combination with Oracle Multitenant databases? Do database auditing and Unified Audit work analogously to existing configurations? In this presentation, auditing in container database environments will be examined more closely. It will be shown what has to be considered and how an auditing concept has to be adapted to the new architecture. Focusing on current versions of the Oracle database, specific problems and workarounds in the area of Unified Audit will be shown. The presentation will be complemented by corresponding examples and live demos.
Building Data Pipelines with Spark and StreamSets - Pat Patterson
Big data tools such as Hadoop and Spark allow you to process data at unprecedented scale, but keeping your processing engine fed can be a challenge. Metadata in upstream sources can ‘drift’ due to infrastructure, OS and application changes, causing ETL tools and hand-coded solutions to fail. StreamSets Data Collector (SDC) is an Apache 2.0 licensed open source platform for building big data ingest pipelines that allows you to design, execute and monitor robust data flows. In this session we’ll look at how SDC’s “intent-driven” approach keeps the data flowing, with a particular focus on clustered deployment with Spark and other exciting Spark integrations in the works.
The document discusses Dell EMC VxRail, a hyper-converged appliance that combines servers, storage, and networking into a single system. It is presented as the standard in hyper-converged infrastructure and focuses on enabling business innovation through consumption-based buying which allows customers to focus resources on differentiating their business instead of IT integration. VxRail offers various configurations and scale options to match different use cases from small to large environments.
Deploying CloudStack and Ceph with flexible VXLAN and BGP networking - ShapeBlue
1) The document discusses using VXLAN, BGP and EVPN to implement a layer 3 network for a cloud deployment using Ceph and CloudStack. This allows scaling beyond the limits of layer 2 networks and VLANs.
2) Key infrastructure components discussed include Dell S5232F-ON switches running Cumulus Linux, SuperMicro hypervisors and Ceph storage servers using NVMe SSDs.
3) The deployment provides high performance private and public cloud infrastructure with scalable networking and over 650TB of reliable Ceph storage per rack.
Oracle Cloud is Best for Oracle Database - High Availability - Markus Michalewicz
This presentation looks behind the covers and evaluates the offerings provided by various cloud vendors and compares them to the Oracle Database offerings available in the Oracle Cloud. The comparison includes Oracle Database in general, focusing on High Availability (HA) and Disaster Recovery (DR), as those areas have historically distinguished the Oracle Database from other databases and will likely continue to be some of the most distinguishing features when it comes to operating the Oracle Database in the cloud.
What to Expect From Oracle Database 19c - Maria Colgan
The Oracle Database has recently switched to an annual release model, and Oracle Database 19c is only the second release in this new model. So what can you expect from the latest version of the Oracle Database? This presentation explains how Oracle Database 19c is really 12.2.0.3, the terminal release of the 12.2 family, and covers the new features you can find in this release.
Cause 2013: A Flexible Approach to Creating an Enterprise Directory - rwgorrel
Leveraging Microsoft Active Directory LDS to create a flexible enterprise directory.
As UNCG sought to replace Novell Directory Services with next-generation enterprise authentication and directory services (LDAP), we examined OpenLDAP, Active Directory, and Active Directory Lightweight Directory Services. Hear why we picked a somewhat uncommon approach in the lesser-known AD LDS product and the flexibility it afforded us as a middle ground between OpenLDAP and the urge to use the existing Active Directory domain. We will also discuss the ADAMSync tool used to populate this environment, as well as the MSUserProxy object used to centralize authentication.
Why Wait? Realtime Ingestion With Chen Qin and Heng Zhang | Current 2022 - HostedbyConfluent
Historically, Pinterest data warehouse ingestion and indexing services were implemented on batch ETL and Kafka streaming, respectively. As the product side leans more toward real-time and near-real-time data to innovate and compete, teams worked together to revamp the ingestion and processing stack at Pinterest.
In this talk, we plan to share our near-real-time ingestion system built on top of Apache Kafka, Apache Flink, and Apache Iceberg. We picked ANSI SQL as the common currency to minimize the "lambda architecture" learning curve for teams adopting fresh, near-real-time data.
Class lecture by Prof. Raj Jain on Storage Virtualization. The talk covers Disk Arrays, Data Access Methods, SCSI (Small Computer System Interface), Advanced Technology Attachment (ATA), ESCON and FICON, Fibre Chanel, Fibre Channel Devices, Fibre Channel Protocol Layers, Fibre Channel Flow Control, Fibre Channel Classes of Service, What is Storage Virtualization?, Benefits of Storage Virtualization, Virtualizing Storage, RAID Levels, Nested RAIDs, Synchronous vs. Asynchronous Replication, Virtual Storage Area Network (VSAN), Physical Storage Network, Virtual Storage Network, SAN vs. NAS, iSCSI (Internet Small Computer System Interface), iFCP (Internet Fiber Channel Protocol), FCIP (Fibre Channel over IP), FCoE (Fibre Channel over Ethernet), Virtual File Systems. Video recording available in YouTube.
The document discusses Oracle Real Application Clusters (RAC) architecture and internals. A typical RAC configuration includes multiple nodes connected to a public network, interconnect, and shared storage. Oracle Grid Infrastructure manages the clusterware and Automatic Storage Management. It provides high availability of databases and other applications by enabling them to run on multiple nodes and utilize the shared storage. The document covers various RAC components like VIPs, listeners, SCAN, client connectivity, node membership, and the interconnect.
This document provides an overview of Red Hat JBoss Fuse, an open source integration platform. It discusses the history and components of JBoss Fuse, including Apache Camel, CXF, ActiveMQ, Karaf and Fabric8. It describes how JBoss Fuse can enable integration everywhere in a real-time enterprise by integrating applications, services, devices and partners through its lightweight footprint and deployment options both on-premise and in the cloud. The document also highlights key benefits of JBoss Fuse such as reducing costs, simplifying management and enabling new business opportunities through greater connectivity and data sharing.
Introduction of the possibilities to integrate with Dynamics 365 CE / PowerApps Platform. Talks about FLow, LogicApp and Azure Integration Services (Service Bus).
Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 1Tanel Poder
The document describes troubleshooting a complex performance issue in an Oracle database. Key details:
- The problem was sporadic extreme slowness of the Oracle database and server lasting 1-20 minutes.
- Initial AWR reports and OS metrics showed a spike at 18:10 with CPU usage at 66.89%, confirming a problem occurred then.
- Further investigation using additional metrics was needed to fully understand the root cause, as initial diagnostics did not provide enough context about this brief problem period.
London In-Memory Computing Meetup - A Change-Data-Capture use-case: designing...Nicolas Fränkel
When one’s app is challenged with poor performances, it’s easy to set up a cache in front of one’s SQL database. It doesn’t fix the root cause (e.g. bad schema design, bad SQL query, etc.) but it gets the job done. If the app is the only component that writes to the underlying database, it’s a no-brainer to update the cache accordingly, so the cache is always up-to-date with the data in the database.
Things start to go sour when the app is not the only component writing to the DB. Among other sources of writes, there are batches, other apps (shared databases exist, unfortunately), etc. One might think about a couple of ways to keep data in sync i.e. polling the DB every now and then, DB triggers, etc. Unfortunately, they all have issues that make them unreliable and/or fragile.
In this talk, I will describe an easy-to-setup architecture that leverages CDC to have an evergreen cache.
On the Application of AI for Failure Management: Problems, Solutions and Algo...Jorge Cardoso
Artificial Intelligence for IT Operations (AIOps) is a class of software which targets the automation of operational tasks through machine learning technologies. ML algorithms are typically used to support tasks such as anomaly detection, root-causes analysis, failure prevention, failure prediction, and system remediation. AIOps is gaining an increasing interest from the industry due to the exponential growth of IT operations and the complexity of new technology. Modern applications are assembled from hundreds of dependent microservices distributed across many cloud platforms, leading to extremely complex software systems. Studies show that cloud environments are now too complex to be managed solely by humans. This talk discusses various AIOps problems we have addressed over the years and gives a sketch of the solutions and algorithms we have implemented. Interesting problems include hypervisor anomaly detection, root-cause analysis of software service failures using application logs, multi-modal anomaly detection, root-cause analysis using distributed traces, and verification of virtual private cloud networks.
The Top 5 Reasons to Deploy Your Applications on Oracle RACMarkus Michalewicz
This document discusses the top 5 reasons to deploy applications on Oracle Real Application Clusters (RAC). It discusses how RAC provides:
1. Developer productivity through transparency that allows developers to focus on application code without worrying about high availability or scalability.
2. Integrated scalability for both applications and database features through techniques like parallel execution and cache fusion that allow linear scaling.
3. Seamless high availability for the entire application stack through capabilities like fast reconfiguration times and zero data loss that prevent application outages.
4. Isolated consolidation for converged use cases through features like pluggable database isolation that allow secure sharing of hardware resources.
5. Full flexibility to choose deployment options
Automation Patterns for Scalable Secret ManagementMary Racter
So you’ve scaled your app up to 1000 instances. Do they all share the same credentials for access to stateful resources? Then the attack surface for your stateful resources just got scaled up too. Automated secret management lets you focus on scaling up your app, not your risk of data compromise.
This talk aims to introduce some important considerations in attack surface management at scale, and provide some patterns and tips on integrating secret management workflows into Continuous Deployment infrastructure.
Oracle RAC is a clustered version of the Oracle database that uses a shared disk architecture. It allows multiple instances of the database to run concurrently on multiple nodes, providing high availability and scalability. The document discusses how clients can connect to Oracle RAC using SCAN, which provides a single virtual IP address and listener for the entire cluster, making client connections easier to manage. It also covers how SCAN works with load balancing and provides failover between instances in the cluster.
Towards Digital Twin standards following an open source approachFIWARE
Digital Twins are gaining momentum when designing smart solutions in different application domains. However, there is a lack of open standards that warrant interoperability and portability of solutions, avoiding vendor lock-in.
During the presentation, we will review major developments in this area, focused on the adoption of a standard API for accessing Digital Twin Data and Smart Data Models. We will review how a Digital Twin approach enables data integration at different levels: architecting vertical smart solutions, within smart organizations and across organizations. At all levels interfacing with IoT, BigData, AI/ML, Blockchain, or Robotics technologies.
DOAG Oracle Unified Audit in Multitenant EnvironmentsStefan Oehrli
Oracle Audit is a well-known and proven database functionality. Or maybe not? What does auditing look like in combination with Oracle Multitenant Databases? Does database and Unified Audit work analogous to existing configurations? In the context of this presentation the auditing in the environment of container databases will be examined more closely. It will be shown what has to be considered and how an auditing concept has to be adapted to the new architecture. With focus on the current versions of the Oracle database, specific problems and workarounds in the area of Unified Audit will be shown. The presentation will be complemented by corresponding examples and live demos.
Building Data Pipelines with Spark and StreamSetsPat Patterson
Big data tools such as Hadoop and Spark allow you to process data at unprecedented scale, but keeping your processing engine fed can be a challenge. Metadata in upstream sources can ‘drift’ due to infrastructure, OS and application changes, causing ETL tools and hand-coded solutions to fail. StreamSets Data Collector (SDC) is an Apache 2.0 licensed open source platform for building big data ingest pipelines that allows you to design, execute and monitor robust data flows. In this session we’ll look at how SDC’s “intent-driven” approach keeps the data flowing, with a particular focus on clustered deployment with Spark and other exciting Spark integrations in the works.
The document discusses Dell EMC VxRail, a hyper-converged appliance that combines servers, storage, and networking into a single system. It is presented as the standard in hyper-converged infrastructure and focuses on enabling business innovation through consumption-based buying which allows customers to focus resources on differentiating their business instead of IT integration. VxRail offers various configurations and scale options to match different use cases from small to large environments.
Deploying CloudStack and Ceph with flexible VXLAN and BGP networking ShapeBlue
1) The document discusses using VXLAN, BGP and EVPN to implement a layer 3 network for a cloud deployment using Ceph and CloudStack. This allows scaling beyond the limits of layer 2 networks and VLANs.
2) Key infrastructure components discussed include Dell S5232F-ON switches running Cumulus Linux, SuperMicro hypervisors and Ceph storage servers using NVMe SSDs.
3) The deployment provides high performance private and public cloud infrastructure with scalable networking and over 650TB of reliable Ceph storage per rack.
Oracle Cloud is Best for Oracle Database - High AvailabilityMarkus Michalewicz
This presentation looks behind the covers and evaluates the offerings provided by various cloud vendors and compares them to the Oracle Database offerings available in the Oracle Cloud. The comparison includes Oracle Database in general, focusing on High Availability (HA) and Disaster Recovery (DR), as those areas have historically distinguished the Oracle Database from other databases and will likely continue to be some of the most distinguishing features when it comes to operating the Oracle Database in the cloud.
What to Expect From Oracle database 19cMaria Colgan
The Oracle Database has recently switched to an annual release model, and Oracle Database 19c is only the second release in this new model. So what can you expect from the latest version of the Oracle Database? This presentation explains how Oracle Database 19c is really 12.2.0.3, the terminal release of the 12.2 family, and describes the new features you can find in this release.
Cause 2013: A Flexible Approach to Creating an Enterprise Directoryrwgorrel
Leveraging Microsoft Active Directory LDS to create a flexible enterprise directory.
As UNCG sought to replace Novell Directory Services with the next generation of enterprise authentication and directory services (LDAP), we examined OpenLDAP, Active Directory, and Active Directory Lightweight Directory Services (AD LDS). Hear why we picked a somewhat uncommon approach in the lesser-known AD LDS product, and the flexibility it afforded us as a middle ground between OpenLDAP and the urge to use the existing Active Directory domain. We will also discuss the ADAMSync tool used to populate this environment, as well as the MSUserProxy object used to centralize authentication.
Why Wait? Realtime Ingestion With Chen Qin and Heng Zhang | Current 2022HostedbyConfluent
Historically, Pinterest data warehouse ingestion and indexing services were implemented on batch ETL and Kafka streaming respectively. As the product side leans more toward real-time and near-realtime data to innovate and compete, teams work together to revamp the ingestion and processing stack in Pinterest.
In this talk, we plan to share our near-real-time ingestion system built on top of Apache Kafka, Apache Flink, and Apache Iceberg. We picked ANSI SQL as the common currency to minimize the "lambda architecture" learning curve for teams adopting near-real-time data.
This document provides a summary of Boobalan Muthuku's experience as an Informatica Developer. He has 8 years of experience developing data warehouses using Informatica PowerCenter. He has extensive experience extracting data from sources like Oracle, SQL Server, and flat files, transforming the data using mappings, and loading it into Oracle databases. He has also implemented slowly changing dimensions, incremental loads, and performance tuning.
Venkata Ramana Reddy has over 5 years of experience as an IT analyst specializing in Informatica administration, Oracle databases, and SAP Data Services. He has extensive expertise in tasks like server maintenance, job scheduling, security configuration, and issue resolution. Ramana has worked on projects in data warehousing and ETL for clients such as Apple and Bell Helicopter, developing jobs to load and transform data between source and target systems. He is proficient in technologies including Informatica 9.1, Oracle 11g, Unix shell scripting, and SAP Data Services 4.0.
Treating operational aspects of software as 'non-functional requirements' and 'an Ops problem' rather than a core part of the software product leads to poor live service and unexplained errors in Production.
Traceability, deployability, recoverability, diagnosability, monitorability, and high quality logging are key features of a software system, along with user-visible features surfaced via the UI, or a capability of an API endpoint.
However, many Product Owners understandably feel uneasy about taking on the (necessary) responsibility for prioritising operational features alongside user-visible and API features.
This session brings Scrum Masters and Product Owners up to speed on operational features and covers proven practices for improving operability in an Agile context, empowering Product Owners to make effective prioritisation choices about all kinds of product features, whether user-visible or operational.
Transforming Data Architecture Complexity at Sears - StampedeCon 2013StampedeCon
At the StampedeCon 2013 Big Data conference in St. Louis, Justin Sheppard discussed Transforming Data Architecture Complexity at Sears. High ETL complexity and costs, data latency and redundancy, and batch window limits are just some of the IT challenges caused by traditional data warehouses. Gain an understanding of big data tools through the use cases and technology that enables Sears to solve the problems of the traditional enterprise data warehouse approach. Learn how Sears uses Hadoop as a data hub to minimize data architecture complexity – resulting in a reduction of time to insight by 30-70% – and discover “quick wins” such as mainframe MIPS reduction.
Airflow is a platform for authoring, scheduling, and monitoring workflows or data pipelines. It uses a directed acyclic graph (DAG) to define dependencies between tasks and schedule their execution. The UI provides dashboards to monitor task status and view workflow histories. Hands-on exercises demonstrate installing Airflow and creating sample DAGs.
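The core idea the summary above describes — tasks as a directed acyclic graph, executed in dependency order — can be sketched in pure Python with the standard library's `graphlib`. This is an illustrative sketch with hypothetical task names, not an actual Airflow DAG file:

```python
from graphlib import TopologicalSorter

# A hypothetical extract/transform/load pipeline with one side branch.
# Keys are tasks; values are the tasks they depend on (their "upstream" tasks).
dag = {
    "extract": set(),
    "validate": {"extract"},
    "transform": {"extract"},
    "load": {"validate", "transform"},
}

def run_order(dag):
    """Return one valid execution order (what a scheduler would dispatch)."""
    return list(TopologicalSorter(dag).static_order())

order = run_order(dag)
print(order)  # 'extract' comes first, 'load' comes last
```

In Airflow itself the same dependencies would be declared with operators and the `>>` operator, and the scheduler resolves the ordering for you.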
Maintenance plans provide a way to automate database maintenance tasks such as integrity checks, index maintenance, and backups. They can be created using the Maintenance Plan Wizard or Maintenance Plan Designer. Common tasks include checking database integrity with DBCC CHECKDB, reorganizing or rebuilding indexes, updating statistics, and performing full, differential or transaction log backups. Care must be taken to choose the right tasks and schedule to maintain performance and protect the database.
SQLSaturday is a training event for SQL Server professionals and those wanting to learn about SQL Server. This event will be held Jun 13 2015 at Hochschule Bonn-Rhein-Sieg, Grantham-Allee 20, St. Augustin, Rheinland, 53757, Germany. Admittance to this event is free, all costs are covered by donations and sponsorships. Please register soon as seating is limited, and let friends and colleagues know about the event.
Maintenance Plans for Beginners (but not only) | Every experienced administrator has used, to some extent, what are called Maintenance Plans. During this session, I'd like to discuss what useful functionality they provide when we use them, and what to look out for. Session level 200, at times approaching 300, with an open discussion.
In the session, we discussed the end-to-end working of Apache Airflow, mainly focused on the "why, what, and how" factors. It includes DAG creation and implementation, the architecture, and pros & cons. It also covers how a DAG is created for scheduling a job, and all the steps required to create the DAG using a Python script, finishing with a working demo.
Venkateswarareddy has over 3 years of experience as a System Administrator working with Unix systems like Red Hat Linux and Solaris. He is proficient with monitoring tools like Nagios, application monitoring, running SQL queries, assisting with user acceptance testing, and user/group management. He currently works as a Senior System Administrator at NTT DATA Global Service Delivery where he is responsible for providing Linux server support, maintenance, job scheduling, backup/restores, and issue resolution.
C19013010 the tutorial to build shared ai services session 2Bill Liu
This document provides an agenda and overview for a tutorial on building shared AI services. The session will cover AI engineering platforms, data pipelines, traditional AI roles and their challenges, skills required for AI engineers, and benchmarking machine learning and deep learning approaches. It includes a live demo of building an end-to-end AI pipeline with Kafka, NiFi, Spark Streaming and Keras on Spark.
Navigating SAP’s Integration Options (Mastering SAP Technologies 2013)Sascha Wenninger
Provides an overview of popular integration approaches, maps them to SAP's integration tools and concludes with some lessons learnt in their application.
Database failover from client perspectivePriit Piipuu
In this presentation we will look deep into high availability technologies Oracle RAC provides for database clients, what actually happens during database instance failover or planned maintenance and how to configure database services so that Java applications experience no or minimal disruption during planned maintenance or unplanned downtime. This presentation will mainly focus on JDBC and UCP clients.
The document discusses three important things for IT leaders to know about SQL Server: database performance and speed matter; backups and disaster recovery plans are not all equal; and high availability/disaster recovery (HA/DR) tools provide proactive disaster protection. It provides tips on optimizing database performance through query tuning instead of hardware upgrades. It explains the importance of backing up transaction logs and having comprehensive disaster recovery plans, including solutions like AlwaysOn availability groups. The document promotes the services of SQLWatchmen for database diagnostics, tuning, disaster planning and recovery support.
Building Apps with Distributed In-Memory Computing Using Apache GeodePivotalOpenSourceHub
Slides from the Meetup Monday March 7, 2016 just before the beginning of #GeodeSummit, where we cover an introduction of the technology and community that is Apache Geode, the in-memory data grid.
Neha Sharma has over 4.5 years of experience as an Oracle DBA. She has extensive hands-on experience installing, configuring, and managing Oracle databases from versions 9i to 12c. She is proficient in backup and recovery strategies using RMAN, performance tuning, and high availability solutions. Currently she works as an Oracle Trainer and has managed database administration projects.
Hierarchical Digital Twin of a Naval Power SystemKerry Sado
A hierarchical digital twin of a Naval DC power system has been developed and experimentally verified. Similar to other state-of-the-art digital twins, this technology creates a digital replica of the physical system executed in real-time or faster, which can modify hardware controls. However, its advantage stems from distributing computational efforts by utilizing a hierarchical structure composed of lower-level digital twin blocks and a higher-level system digital twin. Each digital twin block is associated with a physical subsystem of the hardware and communicates with a singular system digital twin, which creates a system-level response. By extracting information from each level of the hierarchy, power system controls of the hardware were reconfigured autonomously. This hierarchical digital twin development offers several advantages over other digital twins, particularly in the field of naval power systems. The hierarchical structure allows for greater computational efficiency and scalability while the ability to autonomously reconfigure hardware controls offers increased flexibility and responsiveness. The hierarchical decomposition and models utilized were well aligned with the physical twin, as indicated by the maximum deviations between the developed digital twin hierarchy and the hardware.
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressionsVictor Morales
K8sGPT is a tool that analyzes and diagnoses Kubernetes clusters. This presentation was used to share the requirements and dependencies to deploy K8sGPT in a local environment.
ACEP Magazine edition 4th launched on 05.06.2024Rahul
This document provides information about the third edition of the magazine "Sthapatya" published by the Association of Civil Engineers (Practicing) Aurangabad. It includes messages from current and past presidents of ACEP, memories and photos from past ACEP events, information on life time achievement awards given by ACEP, and a technical article on concrete maintenance, repairs and strengthening. The document highlights activities of ACEP and provides a technical educational article for members.
Adaptive synchronous sliding control for a robot manipulator based on neural ...IJECEIAES
Robot manipulators have become important equipment in production lines, medical fields, and transportation. Improving the quality of trajectory tracking for robot hands is always an attractive topic in the research community. This is a challenging problem because robot manipulators are complex nonlinear systems and are often subject to fluctuations in loads and external disturbances. This article proposes an adaptive synchronous sliding control scheme to improve trajectory tracking performance for a robot manipulator. The proposed controller ensures that the positions of the joints track the desired trajectory, synchronizes the errors, and significantly reduces chattering. First, the synchronous tracking errors and synchronous sliding surfaces are presented. Second, the synchronous tracking error dynamics are determined. Third, a robust adaptive control law is designed; the unknown components of the model are estimated online by a neural network, and the parameters of the switching elements are selected by fuzzy logic. The built algorithm ensures that the tracking and approximation errors are ultimately uniformly bounded (UUB). Finally, the effectiveness of the constructed algorithm is demonstrated through simulation and experimental results, which show that the proposed controller is effective, with small synchronous tracking errors and significantly reduced chattering.
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTjpsjournal1
The rivalry between prominent international actors for dominance over Central Asia's hydrocarbon reserves and the ancient Silk Road trade route, along with China's diplomatic endeavours in the area, has been referred to as the "New Great Game." This research centres on that power struggle, considering geopolitical, geostrategic, and geoeconomic variables. Topics including trade, political hegemony, oil politics, and traditional and non-traditional security are explored and explained. Using Mackinder's Heartland, Spykman's Rimland, and hegemonic stability theories, it examines China's role in Central Asia. The study adheres to an empirical epistemological method, takes care to remain objective, and critically analyzes primary and secondary research documents to elaborate the role of China's geo-economic outreach in Central Asian countries and its future prospects. It finds that China is seeing significant success in trade, pipeline politics, and gaining influence over other governments, a success that may be attributed to the effective use of key tools such as the Shanghai Cooperation Organisation and the Belt and Road Economic Initiative.
Understanding Inductive Bias in Machine LearningSUTEJAS
This presentation explores the concept of inductive bias in machine learning. It explains how algorithms come with built-in assumptions and preferences that guide the learning process. You'll learn about the different types of inductive bias and how they can impact the performance and generalizability of machine learning models.
The presentation also covers the positive and negative aspects of inductive bias, along with strategies for mitigating potential drawbacks. We'll explore examples of how bias manifests in algorithms like neural networks and decision trees.
By understanding inductive bias, you can gain valuable insights into how machine learning models work and make informed decisions when building and deploying them.
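To make the idea concrete, here is a minimal sketch (with made-up data) contrasting two learners whose inductive biases agree on the training set but diverge when extrapolating, in the spirit of the neural-network vs. decision-tree contrast mentioned above:

```python
# Two learners with different inductive biases, fit to the same data (y = 2x).
train = [(1, 2), (2, 4), (3, 6)]

def nearest_neighbor(x):
    # Bias: "the world is locally constant" -- predict the label of the closest point.
    return min(train, key=lambda p: abs(p[0] - x))[1]

def linear_fit(x):
    # Bias: "the relationship is linear through the origin" (least-squares slope).
    slope = sum(xi * yi for xi, yi in train) / sum(xi * xi for xi, _ in train)
    return slope * x

# Both fit the training data perfectly, but their biases diverge off the data:
print(nearest_neighbor(10))  # 6    -> sticks to the nearest seen label
print(linear_fit(10))        # 20.0 -> extrapolates the linear trend
```

Neither prediction is "wrong" a priori; which bias generalizes better depends on how well its assumption matches the true data-generating process.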
4. Apache DolphinScheduler Introduction
1. Established at Analysys in 2017.
2. Open-sourced in March 2019; joined the Apache Incubator in August 2019.
3. Dedicated to solving complex dependencies in data processing: it assembles tasks into a DAG, monitors the status of tasks in real time, and supports operations such as retrying, resuming from a specified task, suspending, and terminating tasks.
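The "retry a failed task" behavior described above can be sketched in a few lines of plain Python. This is illustrative only (hypothetical helper, not DolphinScheduler's implementation):

```python
# Minimal sketch of "retry a failed task up to N times, then give up".
def run_with_retry(task, max_retries=3):
    attempts = 0
    while True:
        attempts += 1
        try:
            return attempts, task()      # success: report attempts and result
        except Exception:
            if attempts > max_retries:   # exhausted the retry budget
                raise

calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:                   # fail twice, then succeed
        raise RuntimeError("transient failure")
    return "done"

attempts, result = run_with_retry(flaky)
print(attempts, result)  # 3 done
```

A real scheduler adds an interval between attempts and persists the attempt count so a resumed workflow picks up where it left off.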
6. Pain points addressed
01 Visual DAG dependency: simple and easy to operate; view task status in real time; visual task logs
02 High availability: workflow fault tolerance; failed-task retry, rollback, and transfer
03 Easy maintenance: task self-dependency, workflow dependency, and so on
04 Alert mechanism: alert plugins (mail/SMS/WeChat…); warnings
05 Multi task types: cross-language; custom task plugins; easy to extend
06 Complement: re-run (refresh) historical data
11. Advantages
Simple and easy: process definitions are visualized through drag and drop; open API; one-click deployment
High reliability: decentralized multi-Master and multi-Worker design, self-supporting HA; task queue to avoid overload; fault-tolerant capability
Rich usage scenarios: supports pause and resume operations; supports multi-tenancy; supports many task types such as Spark, Hive, MR, Python, sub_process, and shell; supports custom task types
High scalability: scheduling capacity grows linearly with the cluster; Masters and Workers support dynamic online and offline
12. Main capabilities
• Workflows can be timed, dependent, manual, and paused/stopped/resumed
• Tasks are associated in DAG form
• Real-time monitoring of task status
• Supports more than 10 task types, such as Shell, MR, Spark, SQL, and the dependency task type
• Workflow priority, task priority, global parameters, and local parameters
• Complete system monitoring, with task timeout and failure alarms
• Supports multi-tenancy, online log viewing, and online resource management
• Supports stable operation of 100,000 data tasks per day
• The decentralized design ensures the stability and high availability of the system
13. Process definition: visualized drag-and-drop configuration
1. Visualized drag-and-drop
2. Supports multiple task types, including Shell, DataSource, Spark, Flink, MR, Python, HTTP, Child Process, and Task Dependency
3. Child Process: enables workflow reuse and avoids repeated configuration
16. Task management: multi-granularity monitoring of task status
• Tracking of task execution status
• Task status statistics and process instance status views
• Online task execution logs
17. Data source management: visually configure multiple data sources
1. Visualized data sources, including MySQL, PostgreSQL, Hive, Impala, Spark, ClickHouse, Oracle, SQL Server, DB2, and MongoDB
2. Pluggable data source extensions
3. Visual data source management: configure once, use everywhere
18. Workflow startup management
• Task failure strategy: 1. continue 2. end
• Notification strategy: 1. success 2. failure 3. all 4. none
• Workflow priority
• Complement (backfill) data: 1. serial execution 2. parallel execution
24. DolphinScheduler 1.3 feature: Kubernetes support
Advantages: 1. elastic scaling 2. full use of server resources 3. environment isolation
Disadvantage: requires Kubernetes maintenance experience
Cloud native is the trend.
25. DolphinScheduler 1.3: other features
• Batch export and import of workflows
• Process definition copy
• Deleting a process instance cascade-deletes its task logs
• Simplified configuration and an optimized deployment experience
27. DolphinScheduler 1.3: new architecture
Reduces the pressure on the database:
• Workers no longer perform DB operations (single responsibility)
• Master and Worker communicate directly to reduce latency
• The Master distributes tasks using multiple strategies: random, round-robin, and linear weighting based on CPU and memory
29. Experience: priority
Problem: with no priority design and fair scheduling,
• a task submitted first may only complete at the same time as a task submitted later
• low-priority jobs may run first, occupying resources without releasing them
Solution: order by process priority > process instance sequence > task instance priority > task instance sequence; the default is FIFO.
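The four-level ordering above maps naturally onto a priority queue keyed by a tuple, since Python compares tuples element by element. A minimal sketch (hypothetical values, smaller number = higher priority):

```python
import heapq

# Key: (process_priority, process_instance_seq, task_priority, task_seq).
# Ties at every level fall back to the next one; equal priorities reduce to FIFO.
queue = []
heapq.heappush(queue, (1, 10, 2, 101, "A"))
heapq.heappush(queue, (0, 11, 5, 102, "B"))   # higher process priority wins outright
heapq.heappush(queue, (1, 10, 2, 100, "C"))   # same priorities as A -> earlier seq first

order = [heapq.heappop(queue)[-1] for _ in range(len(queue))]
print(order)  # ['B', 'C', 'A']
```

This is why the slide's "default: FIFO" falls out for free: when all priority fields are equal, the submission sequence number decides.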
31. Experience: data component integration
Problem: the current 10+ task types may not meet business demands, e.g. data sync tasks, Kettle tasks, data quality tasks, SQL tasks, procedure tasks, and other business-specific tasks.
Solution: task plugins that are hot-pluggable.
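A task-plugin mechanism is essentially a registry that maps a task-type name to an implementation, so new types can be added without touching the scheduler core. A minimal sketch (hypothetical API, not DolphinScheduler's actual plugin SPI):

```python
# A tiny hot-pluggable task-type registry using a decorator.
REGISTRY = {}

def task_plugin(name):
    def register(cls):
        REGISTRY[name] = cls        # new task types register themselves
        return cls
    return register

@task_plugin("sql")
class SqlTask:
    def run(self):
        return "running sql"

@task_plugin("data_quality")      # a plugin added later, no core changes needed
class DataQualityTask:
    def run(self):
        return "running data_quality checks"

def execute(task_type):
    # The scheduler dispatches by name; no if/else chain over task types.
    return REGISTRY[task_type]().run()

print(execute("data_quality"))
```

"Hot" pluggability then reduces to loading such modules at runtime (e.g. from a plugin directory) so registration happens on import.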
33. Practice of DolphinScheduler in Analysys
• Analysys Qianfan is an app benchmarking analysis product.
• Qianfan is a SaaS service that processes tens of billions of records every day (620 million monthly active users, a 6.8 PB big data cluster) through tens of thousands of daily ETL tasks.
• In 2018, we started using DolphinScheduler to schedule the entire ETL process.
• The picture on the right shows one of the workflows.
34. Practice of DolphinScheduler in Baosight
Extensions implemented by Baosight:
Extensions implemented by Baosight:
• Plugin type task
• Resource cache
• SQL function extension
• Message triggered scheduling
• Multiple data source access
• Workflow concurrency control
• Operation audit
• Alert optimization
• Configuration management
• Access control
• Operational data archiving
35. Practice of DolphinScheduler in Qianxin: why DolphinScheduler?
1. Online resource file management: no need to worry about losing jars
2. Cluster high availability: decentralized, no single point of failure
3. Multi-tenancy: we can't all use the same account
4. Privilege management: users can only access authorized projects and resources
5. Complex scheduling: cron, dependent, and manual scheduling
6. Multi task types: Spark, Shell, MR, Hive, Python, …
7. Visualization: drag and drop to generate workflow DAGs
8. Distributed and easy to extend: scale out when resources are insufficient
9. Task failure retry/alarm: configurable retry count, interval, and email
37. DolphinScheduler Roadmap (draft)
• Master refactor: api communicate with master, event-driven, etc.
• Task parameter transfer
• Task Plugin (doing)
• Concurrency control of tasks
• Workflow trigger
• Data quality
• List dependency (upstream dependency)
• Support multi-cluster online release
• Workflow version management
• Permission redesign
• Easy to use
If you have more suggestions for the roadmap, please discuss them on the dev mailing list.
38. Open source history
• 2017.12: architecture design; internal use (the Qianfan product uses DolphinScheduler)
• 2018.05: external seed users
• 2019.02: decided to open source; two months of refactoring
• 2019.03: officially open sourced on March 30th (1.0.0)
• 2019.05: releases 1.0.1, 1.0.2, 1.0.3
• 2019.08: entered the Apache Incubator
• 2019.12: first Apache version, 1.2.0
• …: 1.3.2
41. DolphinScheduler Resources
Online demo: http://106.75.43.194:8888/
Website: https://dolphinscheduler.apache.org
GitHub: https://github.com/apache/incubator-dolphinscheduler
Get help: submit an issue, or mail dev-subscribe@dolphinscheduler.apache.org and follow the reply to subscribe to the mailing list.
Welcome to join the community. Joining open source starts with submitting your first PR:
• look for the "easy to fix" label or another very easy issue, and submit a PR