Welcome!
This presentation is a collection of topics around the challenges of sharing and
moving data between different types of relational databases.
Often there is a need to move or share data between different repositories and
companies often find this difficult because they underestimate the effort required
as it seems a simple process on the surface, or they don’t fully understand the
capabilities and limitations of the platforms and tools they are using.
I want to talk a bit about the SQL language and, in particular, the SQL standards,
which have made interoperability between RDBMSs easier but are also a source of
confusion in how those standards are interpreted and implemented.
I will discuss some of the options available when moving or sharing data between
different types of databases: migration tools, replication tools, and access tools.
And I want to spend a bit of time talking about Oracle’s Heterogeneous Services
which is an area of Oracle functionality not often discussed and, I suspect,
relatively unfamiliar to a lot of you.
Here’s the “me” slide.
I started life in Oracle Pre-Sales before becoming an independent contractor doing
mostly operational DBA work.
Then I spent a few years designing and building RAC systems but for the last 3 or 4
years I’ve been specialising in logical replication within the Oracle area.
And as you can see, I like fishing from a kayak.
Cloudification is our way of cutting through the hype and confusion that businesses
commonly associate with moving their infrastructure and services into the Cloud.
We do this by focusing on the key issues that they face with their projects, and by
putting things in plain English for them.
Let’s have a quick look at the what, why, and who questions around heterogeneous
data.
So, the dictionary definition of the word heterogeneous, which is something like
“not of the same kind or type”,
doesn’t quite match the word as it’s commonly used in computing.
In computing, it normally refers to a difference in architecture of the same type of
thing, be it hardware like processor, memory bus, etc,
or software like different types of relational database management systems.
A brief definition of Heterogeneous Data for the purposes of this presentation.
As I said, this topic brings together a number of different areas of interest but as a
one-liner, what I’ll be talking about is the accessing, moving, copying, or
synchronising of data between Oracle and non-Oracle SQL databases. I will
mention briefly some of the reasons for, and challenges of, re-platforming an
application’s data repository.
Accessing Data – Ability to remotely read, and possibly change, data in a non-
Oracle repository.
Moving Data – So Migrations, re-platforming. These are essentially “one-off” style
operations where, at the end of it, there is still only one copy of the data.
Copying Data – So this is essentially a static copy with batch updates for reasons of
accessibility, reliability, and availability.
Potentially a “one-off” style operation but more likely an on-going batch style
refresh process.
Synchronising Data – So this is essentially a dynamic copy. Keeping a copy of the
data in-sync with the original by applying any changes to the copy. This is typically a
continuous, or near continuous, process.
So basically, it’s “Oracle to non-Oracle RDBMS” or “non-Oracle RDBMS to Oracle”
What I won’t be talking about is moving data between other data stores like
NoSQL or Hadoop-style systems. That’s a whole other topic and too much to cover in one
presentation.
Why is Heterogeneous Data relevant?
Hands up who has more than one type of data repository in your organisation?
If you work for a medium to large business, there is a very good chance you will
have more than one, and probably more than one type of RDBMS.
• One of the use-cases I’m seeing is businesses wanting to replicate their
data from their operational data store, typically Oracle, to Microsoft SQL Server
to use its Reporting and/or Analysis Services for BI reporting.
• Another one is the re-platforming of an existing data repository by migrating it
to another database platform for various reasons. This may require Change
Data Capture (CDC) techniques to minimise the disruption during the exercise.
• And often, replicating data to a less expensive platform, like Oracle to MySQL
for example, and then directing the reporting and/or query requirements of the
application to the read-only MySQL copy is seen as a way to reduce the load on
the production system and extend its useful lifetime. However, as I’ll try to
point out, this is often not as easy as it seems.
I think everyone could benefit from at least a little knowledge of the capabilities
and limitations of heterogeneous data access.
• Business Owners need to know they have options. Having data trapped in a
proprietary or “one of a kind” repository could be hurting your business. “Yeah,
we have that information, but it’s part of the ABC software package we
purchased a few years ago and it runs on a XYZ database, all our other stuff is
Oracle.”
• Architects/Developers need to know the options. If you don’t know all the
alternatives, how can you recommend the best option? Is GoldenGate really the
best data replication solution for your requirements? Maybe Dbvisit Replicate
would have been a much more cost effective option, or maybe Active Data
Guard would be a better fit?
• DBA/Operators need to know capabilities and limitations of the chosen
solution. “The third party ODBC driver cost additional money but it was 3 times
faster than the included driver.”
If you’re anything like me who has spent most of their career working with only
one type of database and looking down my nose at anyone using anything else, it
was a bit of an attitude adjustment to find companies making successful use of
something other than Oracle!
These figures come from db-engines.com, a site that keeps track of DBMSs and
publishes a “popularity” ranking.
This is not a scientific measurement; rather, it’s a ranking based on current activity
on social media and the internet:
• Number of mentions of the system on websites, using Google and Bing.
• General interest in the system, using Google Trends.
• Frequency of technical discussions about the system on Stack Overflow and DBA
Stack Exchange.
• Number of job offers in which the system is mentioned, on international job sites
like Indeed and Simply Hired.
• Number of profiles in professional networks like LinkedIn in which the system is
mentioned.
• Relevance in social networks like Twitter.
As you can see the top three clearly distinguish themselves.
Note that vertical scale is logarithmic so the top three players, namely Oracle,
MySQL, and SQL Server are streets ahead of the others.
So it's not a count of the installed base of each RDBMS, but it's probably better
than that, it offers an early indicator as to the trending direction of these products.
So if you're looking for your next RDBMS, this is the info you need.
These rankings do include non-relational database management systems.
I think MongoDB is classified as a NoSQL database,
although NoSQL doesn’t mean “No SQL”, it means “Not Only SQL”.
Most of the NoSQL databases out there today are more accurately “No Relational”
than “No SQL”.
The split between commercial and open source stands at about 1/3 open source
and 2/3 commercial.
So, what’s the big deal? The Relational Database Management Systems all use
standard SQL as their language, so how hard can it be to move data and/or applications
between them?
Quick show of hands. Who has written SQL? And who has written SQL with
consideration given to the SQL standard? Not many, if any.
SQL became an ANSI standard in 1986 and an ISO standard in 1987. Since then, the
standard has been enhanced several times with added features.
Despite these standards, code is not completely portable among different database
systems.
The different makers do not perfectly adhere to the standard, for instance by
adding extensions, and the standard itself is sometimes ambiguous.
There have been 7 revisions to the SQL standard since the SQL-86 ANSI standard.
The most significant was the second revision in 1992 (SQL-92), whose Entry Level
was adopted as FIPS 127-2, one of the Federal Information Processing Standards
(FIPS).
This was significant because up until 1996, there was an independent body,
the National Institute of Standards and Technology (NIST) that used to certify SQL
DBMS compliance with the current SQL standard, but they stopped doing this in
1996.
So the next release of SQL in 1999 (SQL-99) was the first release of the SQL
standard where the database vendors self-certified their compliance against the
standard, and for the last 18 years, through 4 more revisions to the SQL standard,
database vendors have been self-certifying the compliance of their products with
the latest SQL standard.
If you really want your own copy of the latest SQL standard, SQL-2011, it’s
available from standards.co.nz but it comes in something like 13 parts and is not
light reading. Each part costs about $250 so you’re looking at over $3,000 for
your own copy.
So, how different are the SQL based RDBMSs? Let’s take a look at a few
comparisons and I think you’ll get the idea.
First off, each RDBMS typically has very different internals. Database vendors are
free to do whatever they like with things that aren’t covered by any standard, or
indeed, interpret and implement any part of a standard as they see fit.
So what we get are SQL based RDBMS that are very different under the hood.
Concurrency (locking) models are very different and will affect the application in
periods of high concurrency.
Some databases may not flag a certain condition as an error while others will. In
fact you may have noticed during upgrade testing of Oracle that certain conditions
that weren’t considered an error are now raised as one due to Oracle tightening up
on its error checking from version to version.
I should note that there’s not a single database that follows the SQL standard
100%. Oracle, SQL Server, MySQL, DB2 and others, each claim certain levels of
support for the standard, but as you have seen with multiple versions of the
standard and self certification, even that statement is open to interpretation.
Not all databases implement all the standard SQL datatypes, and if they do, they
are often not the same.
By way of example, I’d like to look at a very simple datatype, that is, the CHAR
datatype.
The CHAR datatype, as you probably know, is a fixed-length string datatype and is a
core SQL standard datatype.
I want to look at the two CHAR requirements, as specified by the SQL standard and
see how three of the leading RDBMS have implemented them.
So as you can see, even simple requirements for a basic datatype are not always
implemented consistently.
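The slide's comparison isn't reproduced in these notes, so here is a rough sketch of the kind of difference involved. It assumes the two requirements are the usual fixed-length ones (short values are padded with spaces to the declared length, and over-length values are rejected), and it uses SQLite through Python's built-in sqlite3 module purely because it runs anywhere. SQLite is not one of the three databases compared in the talk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite accepts the CHAR(5) declaration but treats the column as plain text.
conn.execute("CREATE TABLE t (code CHAR(5))")
conn.execute("INSERT INTO t VALUES ('ab')")        # shorter than the declared length
conn.execute("INSERT INTO t VALUES ('abcdefgh')")  # longer than the declared length

stored = [row[0] for row in conn.execute("SELECT code FROM t ORDER BY code")]
lengths = [len(value) for value in stored]

print(stored)   # ['ab', 'abcdefgh']
print(lengths)  # [2, 8]
conn.close()
```

Neither value is padded or rejected here, so an application that relies on fixed-length behaviour would see different data on this engine.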
I don’t want to bore you with endless examples, so I’ve chosen one simple function
from the SQL standard to highlight the point.
String concatenation is a core function of the SQL standard and is done using the ||
operator with one of the rules that if any argument string is NULL then the
resulting concatenated string is NULL.
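As a minimal sketch of that rule, the following runs the || operator on SQLite through Python's built-in sqlite3 module, chosen only because it needs no server. SQLite happens to follow the standard here; that is no guarantee for any other engine, so it is worth running the same statement on your own source and target.

```python
import sqlite3

def concat_results():
    """Run || with two strings, and with one NULL operand, and return both results."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute("SELECT 'abc' || 'def', 'abc' || NULL").fetchone()
    finally:
        conn.close()

both_strings, one_null = concat_results()
print(both_strings)  # abcdef
print(one_null)      # None, which is how Python shows SQL NULL
```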
And speaking of NULLs…
So we all know what NULLs are, right?
NULLs support the representation of "missing or inapplicable information".
In SQL, NULL is a state (unknown) and not a value.
Misunderstanding of how NULLs work is the cause of a great number of errors in
SQL code.
These mistakes are usually the result of confusion between NULL and either 0
(zero) or an empty string, which is a string value with a length of zero.
NULL is defined by the ISO SQL standard as different from both an empty string and
the numerical value 0. While NULL indicates the absence of a value,
the empty string and numerical zero both represent actual values. And I think
that’s the source of most of the confusion.
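Here is a small sketch of that distinction, again on SQLite via Python's sqlite3 module as a stand-in engine. The table and column names are made up for the illustration, and the results are only what this one engine does; check how your own platforms treat the empty string before relying on it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (label TEXT, val TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("null", None), ("empty", ""), ("zero", "0")],
)

def labels(where):
    """Return the labels of the rows matching a WHERE condition (trusted input only)."""
    query = f"SELECT label FROM t WHERE {where} ORDER BY label"
    return [row[0] for row in conn.execute(query)]

is_null = labels("val IS NULL")  # only the row holding a real NULL
eq_empty = labels("val = ''")    # only the row holding the empty string
eq_null = labels("val = NULL")   # no rows: a comparison with NULL is never true

print(is_null, eq_empty, eq_null)  # ['null'] ['empty'] []
```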
Let’s look at how some of the RDBMS handle just one aspect of NULL processing.
That is, where NULLs sit when you sort a column containing NULLs.
But first, let’s look at what the SQL standard has to say about this.
The core standard doesn’t explicitly define a default sort order for NULLs but in a
2003 optional extension, NULLs can be sorted using the NULLS FIRST or NULLS
LAST addition to the ORDER BY clause, but not all vendors have implemented this.
NULLs are ordered differently in Oracle compared with SQL Server or MySQL.
So, depending on how your SQL statements are written, they could produce a
different output if you executed the same (valid) SQL on Oracle or SQL Server.
PostgreSQL, by the way, orders NULLs higher than non-NULL
values and allows the standard NULLS FIRST or NULLS LAST clauses.
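One way to stop depending on the engine default is to make the NULL placement explicit in the ORDER BY itself. The sketch below is my own illustration rather than something from the slides: it uses SQLite via Python's sqlite3 module and a made-up scores table, and it sorts on a CASE expression first so it does not need the optional NULLS LAST clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (n INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?)", [(2,), (None,), (1,)])

# Engine default: SQLite treats NULL as smaller than any value, so it sorts first.
default_order = [row[0] for row in conn.execute("SELECT n FROM scores ORDER BY n")]

# Explicit placement: sort on "is it NULL?" (0 before 1), then on the value.
nulls_last = [
    row[0]
    for row in conn.execute(
        "SELECT n FROM scores "
        "ORDER BY CASE WHEN n IS NULL THEN 1 ELSE 0 END, n"
    )
]

print(default_order)  # [None, 1, 2]
print(nulls_last)     # [1, 2, None]
```

Because the CASE expression is ordinary SQL, the second query should give the same order regardless of where an engine puts NULLs by default.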
So, how did a supposed standard come to be so different across the vendors
implementing and supporting it?
• The complexity and size of the SQL standard means that most implementers do
not support the entire standard.
• The standard doesn’t specify database behaviour in several important areas
(e.g. indexes, file storage...), leaving the database vendors to decide how it
should behave.
• The SQL standard precisely specifies the syntax that a conforming database
system must implement. However, the standard's specification of the
semantics of language constructs is less well-defined, leading to ambiguity.
• Many database vendors have large existing customer bases; where the newer
version of the SQL standard conflicts with the prior behaviour of the vendor's
database, the vendor may be unwilling to break this backward compatibility.
• There is little commercial incentive for vendors to make it easier for users to
change database suppliers.
• Users evaluating database software tend to place other factors such as
performance higher in their priorities rather than compliance with standards.
So, how can you guard against your application issuing “non-standard” SQL?
I want to introduce you to what I am confidently calling “The most useless piece of
functionality in Oracle”.
Trouble is, Oracle supports numerous features that extend beyond what they call
standard SQL.
According to the Oracle manual, and this is a quote, “If you are concerned with the
portability of your applications to other implementations of SQL, then use Oracle's
FIPS Flagger to help identify the use of Oracle extensions to SQL92.”
FIPS, by the way, stands for Federal Information Processing Standard. These are
American standards developed by the US federal government, and they are usually
the same as, or slightly modified versions of, ANSI, IEEE, or ISO standards.
The FLAGGER parameter specifies FIPS flagging, which causes an error message to
be generated when a SQL statement issued is an extension of the Entry Level of
SQL-92, which is a standard that has been superseded by SQL2008 (but there is no
FIPS certification for SQL2008).
FLAGGER is a session level parameter only. You can’t set it at the database level,
and why would you want to anyway?
So, what happens when you set the FIPS FLAGGER?
Here’s a simple test with a very basic table.
So I create the table,
then set the session level FLAGGER
and try a very simple SQL statement.
Now I’ve tried to make sense of the error message but the answer must be buried
in the SQL standard and I’m not stumping up $3k to find out.
But it gets even weirder.
With the FIPS FLAGGER set, here’s a select using a NUMBER column and a numeric
digit, and it works!
But try an inequality match with != and it errors, telling you in the error message to
try <> instead.
But when you try that form of inequality, it still says that function is not part of the
ANSI standard!
Oh, and I have to show you this.
Here’s what happens when you set the FIPS FLAGGER before creating the table.
So it seems that the FIPS FLAGGER is either broken or so restrictive that it appears to
be broken.
In the end, FIPS 127-2 is 22 years old and is based on a version of SQL that is 5
versions old.
The last version of Oracle that complied with FIPS 127-2 was (probably) Oracle 7.
The standards body that certified compliance with the standard stopped doing so 18 years
ago.
While SQL-92 has been superseded by other releases, there has been no
conformance testing authority for any version of SQL since SQL-92; hence, Entry
SQL-92 offers you the most assurance of portability. But the FLAGGER that checks for it
appears to be broken and is practically useless.
To be fair, Oracle had to include the FIPS FLAGGER in the code as part of their
compliance with FIPS.
(You’ve paid good money for all those neat Oracle features. Use them!)
But an RDBMS is more than just datatypes and functions and its ability to execute
SQL.
There are a raft of other considerations if you are considering re-platforming your
application to another database.
In fact, depending on the application, often the data migration is one of the easier
tasks.
Much more difficult is the migration of things like stored code (PL/SQL) and security
and access (users, privileges).
There are tools available to help with re-platforming.
SSMA does a reasonable job if you’re moving from Oracle to SQL Server.
Oracle’s SQL Developer (apparently) does a reasonable job at migrating a selection
of common RDBMS to Oracle, and it’s free.
So, just to wrap up this whole SQL standard thing.
Don’t you love it when some consultant answers your question with the “It
depends” answer?
• No, because it frequently changes, is ambiguous in places, contains many
optional parts and no database vendor follows the standard 100%
• Yes, because it gives us, at the very least, a framework or common ground.
Standards promote a common skill set amongst IT professionals.
SQL’s a standard, but it’s a loose one at best. It’s useful for what it is, but don’t
make the assumption that it’s a paved highway between different types of RDBMS that
will let you flip between vendors with ease.
It’s more like a gravel road that provides a path but you may get a bit dusty if you
travel it.
Ok, now that we have looked at some of the challenges with heterogeneous data,
let’s take a look at some of the technology solutions currently available to assist
with moving, replicating, or accessing data across different types of RDBMS.
Before I start, I’ll note again that this is not a complete list of solutions, even for the
top RDBMS’s mentioned at the start of the presentation.
These are the ones that are most obvious as a solution or those that I’ve had some
experience with so I feel I’m qualified to comment.
Most databases have much better tools and utilities for getting data into their
database compared with ways of transferring data to other types of databases.
Here are some of the common ones, but there are also plenty of 3rd party utilities
available ranging from free to very expensive and, in my experience, you definitely
get what you pay for in this area.
So if you’re looking to migrate your data from A to B, look at the tools and utilities
available from B, they will usually be better than those from A.
I guess this makes sense from a competitive point of view. Let’s make it easy for
customers to move data into our database but don’t give them any help moving
data out of our database.
I’m going to give a special mention to MySQL and its migration tool, mainly
because of its relationship to the Oracle RDBMS and what Oracle did to MySQL’s
migration tool.
A little bit of the interesting history behind MySQL:
MySQL was created in 1995.
In 2000 a company called Innobase developed the InnoDB storage engine for
MySQL. This is what made MySQL a “real” RDBMS as it included things like
transactions, row level locking, and foreign keys, etc.
In 2005, Oracle acquired Innobase saying it wanted to increase support for Open
Source software. (yeah, right). It was really a strategic move by Oracle to squeeze
the life out of MySQL.
Also in 2005, MySQL released a utility called the MySQL Migration Toolkit as part of
MySQL GUI Tools Bundle that offered Oracle to MySQL schema and data transfer.
In 2008, Sun acquired MySQL.
In 2010, Oracle purchased Sun and acquired MySQL in the process.
Now, I thought Oracle would kill MySQL but I'm happy to see they have continued
to support and enhance the platform. Oracle OpenWorld this year had over 70
sessions around MySQL content. Although, the cynic in me thinks Oracle is still
trying to keep MySQL from being a serious competitor with Oracle’s database.
In 2010 MySQL added migration functionality to their MySQL WorkBench utility
which replaced the Migration Toolkit.
And when they did that, Oracle de-supported the Oracle database as a source for
migration. So you couldn’t do Oracle to MySQL anymore.
I can't find anything official except some forum comments to the effect that
"Migration from Oracle DB's is not supported."
So, if you want to migrate data from Oracle to MySQL, you can’t do it with the MySQL
Workbench as Oracle has removed that functionality.
There are other third party solutions for Oracle -> MySQL migrations, e.g.
http://www.ispirer.com/products/oracle-to-mysql-migration
Going the other way, as I’ve mentioned, using Oracle’s SQL Developer can migrate
a selection of common RDBMS to Oracle, and it’s free.
Also, a very quick mention of some of the products that enable you to capture
changes to data in one type of database and apply those changes into another type
of database.
That is, replicating data between heterogeneous databases, and by this I mean,
synchronised copies in near real time.
In heterogeneous environments, this typically means the logical replication of the
data, where the SQL statements that change data on the source database (I’m
talking about the DML statements of the SQL language like insert, update, delete)
are extracted as they occur, again, typically from the database’s transaction logs, and
converted to the native SQL of the target database. This process is known as
Change Data Capture, or CDC.
Logical replication using Change Data Capture is often a viable solution in
heterogeneous environments because these products have the ability to translate the
changes into the native SQL of the target database, so once the bulk of the data
has been migrated to the target, a heterogeneous CDC product can keep the two
data sources in sync.
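To make the "translate and apply" half of that concrete, here is a toy sketch. It is not how any of the products below work internally: the change-record layout, the to_target_sql helper, and the customer table are all made up for the illustration, and the captured changes are hard-coded rather than mined from a transaction log. The target is an in-memory SQLite database via Python's sqlite3 module.

```python
import sqlite3

def to_target_sql(change):
    """Turn one captured change record into a parameterised statement for the target.

    Table and column names are interpolated directly, so they must come from a
    trusted capture process; only the data values are bound as parameters.
    """
    table, key, row = change["table"], change["key"], change.get("row", {})
    if change["op"] == "insert":
        cols = ", ".join(row)
        marks = ", ".join("?" for _ in row)
        return f"INSERT INTO {table} ({cols}) VALUES ({marks})", list(row.values())
    if change["op"] == "update":
        sets = ", ".join(f"{col} = ?" for col in row)
        return f"UPDATE {table} SET {sets} WHERE id = ?", list(row.values()) + [key]
    if change["op"] == "delete":
        return f"DELETE FROM {table} WHERE id = ?", [key]
    raise ValueError(f"unknown operation: {change['op']}")

# Hypothetical change stream, in commit order, as a capture process might emit it.
changes = [
    {"op": "insert", "table": "customer", "key": 1, "row": {"id": 1, "name": "Ann"}},
    {"op": "insert", "table": "customer", "key": 2, "row": {"id": 2, "name": "Bob"}},
    {"op": "update", "table": "customer", "key": 1, "row": {"name": "Anne"}},
    {"op": "delete", "table": "customer", "key": 2},
]

target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
for change in changes:
    sql, params = to_target_sql(change)
    target.execute(sql, params)
target.commit()

rows = target.execute("SELECT id, name FROM customer ORDER BY id").fetchall()
print(rows)  # [(1, 'Anne')]
```

A real product also has to deal with datatype mapping, transaction boundaries, and conflict handling, which is where most of the effort goes.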
There are many companies offering heterogeneous change data capture with
Oracle being at least one of the source and/or target databases.
• Oracle GoldenGate http://www.oracle.com/us/products/middleware/data-integration/goldengate/overview/index.html
• Dbvisit Replicate http://www.dbvisit.com/products/dbvisit_replicate_real_time_oracle_database_replication/
• Dell SharePlex http://www.quest.com/shareplex-for-oracle/
• Attunity Replicate http://www.attunity.com/products/attunity-replicate
• Informatica Data Replication http://www.informatica.com/uk/products/data-replication/data-replication/
• HVR Software http://www.hvr-software.com/product/real-time-database-replication
• Astera Change Data Capture http://www.astera.com/solutions/technology-solutions/change-data-capture
• Gravic Shadowbase http://www.gravic.com/shadowbase/solutions/overview.html
• IBM InfoSphere http://www-03.ibm.com/software/products/en/ibminfochandatacapt
• Hit Software DBMoto http://www.hitsw.com/products_services/dbmoto/DBMoto_for_Oracle.html
In the final section of this presentation, I’d like to talk briefly about Oracle’s
solution to heterogeneous data access from other relational data sources.
Oracle’s had this functionality for many years but it’s gone through a
number of name changes. It started off being something called SQL*Connect, then
Transparent Gateways, but the latest name under 12c is Oracle Database
Gateways. But it’s essentially part of what Oracle called Heterogeneous Services
under 11g.
Oracle Gateways allow heterogeneous data access from other relational data
sources to an Oracle application.
Gateways are available for RDBMSs like DB2 and SQL Server but also non-relational
data sources like Excel and transaction managers like IBM’s CICS and message
queuing systems like IBM’s MQ.
The Gateways are a separate purchased option but are available, with a couple of
exceptions, for both Standard and Enterprise database editions.
These gateways handle some of the issues I have been talking about like SQL
translations, dictionary translations, datatype mappings.
The gateways for specific databases aren’t cheap. About the same per processor
license cost as GoldenGate. However, the Database Gateway for ODBC, which is a
generic gateway for any ODBC compliant non-Oracle system, is free with the
database, although it is more functionally restricted than the specific Database
Gateways and you typically still need to purchase an ODBC driver.
The way they work is that SQL statements are translated into the SQL of the non-
Oracle database.
With SQL statements, if the functionality is missing on the non-Oracle system, then
either a simpler query is issued, or the statement is broken up into multiple queries
and the results are obtained by post-processing in the Oracle database.
Remember, most of these features come with a list of restrictions and limitations
to capability so it’s not as simple as I’ve described it. For example, the
Heterogeneous Connectivity User’s Guide lists 10 rules restricting the use of SQL
statements in a heterogeneous distributed environment, so it’s not 100%
transparent.
But here’s a couple of examples of what I’m talking about.
All RDBMSs store metadata, that is, data about the data. Trouble is, they all store
this information in different ways.
One of the facilities that the Gateway provides is data dictionary translations.
So the example shows Oracle executing a select from the ALL_CATALOG data
dictionary but through a link to a SQL Server database.
The Gateway intercepts the query and translates it into the dictionary objects of
the SQL Server database.
The results of the new query are then returned to the user as if the information
came from the ALL_CATALOG view within Oracle.
There’s a package that’s part of Oracle’s heterogeneous services that deserves
special mention.
Using the DBMS_HS_PASSTHROUGH package allows you to execute SQL
statements directly on the non-Oracle system without them being interpreted by
the Oracle database.
What’s special about DBMS_HS_PASSTHROUGH is that it’s a virtual package. It
doesn’t exist in the Oracle or non-Oracle system, yet it still works! Conceptually it
resides on the non-Oracle system but in reality, calls to the package are intercepted
by the Heterogeneous Services component of Oracle and mapped to one of the
Gateway calls.
And so, just to wrap up before I take any questions, here’s a few key points from
the session.
• Know that there are options out there for accessing and moving data between
different types of data stores.
Depending on your position within your company, you may not need to be
aware of them all but at least know someone who is and can select the
one that’s right for you.
• SQL databases are not the same, but with the SQL language, and some careful
consideration, they can work together, often seamlessly.
• Don’t sacrifice performance and features for conformity. Use what you have
been given, and paid for, to the best of its ability.
I want to leave a bit of time for questions so I’ve skipped a few topics like database
abstraction layers and ODBC.
Also, the combination of lots of RDBMSs and business use-case requirements
results in dozens of different functional specifications and it would be impossible to
cover all the options in this type of session, but knowing the capabilities of the
available options will help you select the best fit for your requirements.
Ok, I hope you found that interesting and learnt a few things along the way.
Well, thank you for your attendance, and please enjoy the rest of the conference.
Thanks!
34

More Related Content

What's hot

Asp.net interview questions
Asp.net interview questionsAsp.net interview questions
Asp.net interview questionsAkhil Mittal
 
Making the Conceptual Layer Real via HTTP based Linked Data
Making the Conceptual Layer Real via HTTP based Linked DataMaking the Conceptual Layer Real via HTTP based Linked Data
Making the Conceptual Layer Real via HTTP based Linked DataKingsley Uyi Idehen
 
Oracle Exadata Interview Questions and Answers
Oracle Exadata Interview Questions and AnswersOracle Exadata Interview Questions and Answers
Oracle Exadata Interview Questions and AnswersExadatadba
 
Karen's Favourite Features of SQL Server 2016
Karen's Favourite Features of  SQL Server 2016Karen's Favourite Features of  SQL Server 2016
Karen's Favourite Features of SQL Server 2016Karen Lopez
 
Redis Cashe is an open-source distributed in-memory data store.
Redis Cashe is an open-source distributed in-memory data store.Redis Cashe is an open-source distributed in-memory data store.
Redis Cashe is an open-source distributed in-memory data store.Artan Ajredini
 
call for paper 2012, hard copy of journal, research paper publishing, where t...
call for paper 2012, hard copy of journal, research paper publishing, where t...call for paper 2012, hard copy of journal, research paper publishing, where t...
call for paper 2012, hard copy of journal, research paper publishing, where t...IJERD Editor
 
Why no sql_ibm_cloudant
Why no sql_ibm_cloudantWhy no sql_ibm_cloudant
Why no sql_ibm_cloudantPeter Tutty
 
Data warehouse 2.0 and sql server architecture and vision
Data warehouse 2.0 and sql server architecture and visionData warehouse 2.0 and sql server architecture and vision
Data warehouse 2.0 and sql server architecture and visionKlaudiia Jacome
 
Key aspects of big data storage and its architecture
Key aspects of big data storage and its architectureKey aspects of big data storage and its architecture
Key aspects of big data storage and its architectureRahul Chaturvedi
 
PLSQL Standards and Best Practices
PLSQL Standards and Best PracticesPLSQL Standards and Best Practices
PLSQL Standards and Best PracticesAlwyn D'Souza
 
Trends in Computer Science and Information Technology
Trends in Computer Science and Information TechnologyTrends in Computer Science and Information Technology
Trends in Computer Science and Information Technologypeertechzpublication
 
Deploying data tier applications sql saturday dc
Deploying data tier applications sql saturday dcDeploying data tier applications sql saturday dc
Deploying data tier applications sql saturday dcJoseph D'Antoni
 
The Rise of Nosql Databases
The Rise of Nosql DatabasesThe Rise of Nosql Databases
The Rise of Nosql DatabasesJAMES NGONDO
 




Heterogeneous Data - Published

  • 1. Welcome! This presentation is a collection of topics around the challenges of sharing and moving data between different types of relational databases. Often there is a need to move or share data between different repositories, and companies often find this difficult because they underestimate the effort required, as it seems a simple process on the surface, or they don’t fully understand the capabilities and limitations of the platforms and tools they are using. I want to talk a bit about the SQL language, in particular the SQL standards that have made interoperability between RDBMSs easier but are also a source of confusion in their interpretation and implementation. I will discuss some of the options available when moving or sharing data between different types of databases: migration tools, replication tools, and access tools. And I want to spend a bit of time talking about Oracle’s Heterogeneous Services, which is an area of Oracle functionality not often discussed and, I suspect, relatively unfamiliar to a lot of you. 1
  • 2. Here’s the “me” slide. I started life in Oracle Pre-Sales before becoming an independent contractor doing mostly operational DBA work. Then I spent a few years designing and building RAC systems but for the last 3 or 4 years I’ve been specialising in logical replication within the Oracle area. And as you can see, I like fishing from a kayak. 2
  • 3. Cloudification is our way of cutting through the hype and confusion that businesses commonly associate with moving their infrastructure and services into the Cloud. We do this by focusing on the key issues that they face with their projects, and by putting things in plain English for them. Let’s have a quick look at the what, why, and who questions around heterogeneous data. 3
  • 4. So, the dictionary definition of the word heterogeneous which is something like “not of the same kind or type” doesn’t quite match with the word as it’s commonly used in computing. In computing, it normally refers to a difference in architecture of the same type of thing, be it hardware like processor, memory bus, etc, or software like different types of relational database management systems. 4
  • 5. A brief definition of Heterogeneous Data for the purposes of this presentation. As I said, this topic brings together a number of different areas of interest, but as a one-liner, what I’ll be talking about is the accessing, moving, copying, or synchronising of data between Oracle and non-Oracle SQL databases. I will mention briefly some of the reasons for, and challenges of, re-platforming an application’s data repository. Accessing Data – The ability to remotely read, and possibly change, data in a non-Oracle repository. Moving Data – Migrations and re-platforming. These are essentially “one-off” style operations where, at the end of it, there is still only one copy of the data. Copying Data – Essentially a static copy with batch updates, for reasons of accessibility, reliability, and availability. Potentially a “one-off” style operation but more likely an ongoing batch-style refresh process. Synchronising Data – Essentially a dynamic copy. Keeping a copy of the data in sync with the original by applying any changes to the copy. This is typically a continuous, or near continuous, process. 5
  • 6. So basically, it’s “Oracle to non-Oracle RDBMS” or “non-Oracle RDBMS to Oracle”. What I won’t be talking about is moving data between other data stores like NoSQL or Hadoop-style systems. That’s a whole other topic and too much to cover in one presentation. 5
  • 7. Why is Heterogeneous Data relevant? Hands up who has more than one type of data repository in your organisation? If you work for a medium to large business, there is a very good chance you will have more than one, and probably more than one type of RDBMS. • One of the use-cases I’m seeing is businesses wanting to replicate their data from their operational data store, typically Oracle, to Microsoft SQL Server to use its Reporting and/or Analysis Services for BI reporting. • Another one is the re-platforming of an existing data repository by migrating it to another database platform for various reasons. This may require Change Data Capture (CDC) techniques to minimise the disruption during the exercise. • And often, replicating data to a less expensive platform, like Oracle to MySQL for example, and then directing the reporting and/or query requirements of the application to the read-only MySQL copy is seen as a way to reduce the load on the production system and extend its useful lifetime. However, as I’ll try to point out, this is often not as easy as it seems. 6
  • 8. I think everyone could benefit from at least a little knowledge of the capabilities and limitations of heterogeneous data access. • Business Owners need to know they have options. Having data trapped in a proprietary or “one of a kind” repository could be hurting your business. “Yeah, we have that information, but it’s part of the ABC software package we purchased a few years ago and it runs on an XYZ database; all our other stuff is Oracle.” • Architects/Developers need to know the options. If you don’t know all the alternatives, how can you recommend the best option? Is GoldenGate really the best data replication solution for your requirements? Maybe Dbvisit Replicate would have been a much more cost-effective option, or maybe Active Data Guard would be a better fit? • DBA/Operators need to know the capabilities and limitations of the chosen solution. “The third party ODBC driver cost additional money but it was 3 times faster than the included driver.” 7
  • 9. If you’re anything like me, having spent most of my career working with only one type of database and looking down my nose at anyone using anything else, it was a bit of an attitude adjustment to find companies making successful use of something other than Oracle! These figures come from db-engines.com, a site that keeps track of DBMSs and publishes a “popularity” ranking. This is not a scientific measurement; rather, it’s a ranking based on current activity in social media and on the internet. Number of mentions of the systems on websites, using Google and Bing. General interest in the system, using Google Trends. Frequency of technical discussions about the system on Stack Overflow and DBA Stack Exchange. Number of job offers in which the system is mentioned, on international job sites like Indeed and Simply Hired. Number of profiles in professional networks like LinkedIn in which the system is mentioned. Relevance in social networks like Twitter. As you can see, the top three clearly distinguish themselves. Note that the vertical scale is logarithmic, so the top three players, namely Oracle, MySQL, and SQL Server, are streets ahead of the others. 8
  • 10. So it's not a count of the installed base of each RDBMS, but it's probably better than that, it offers an early indicator as to the trending direction of these products. So if you're looking for your next RDBMS, this is the info you need. 8
  • 11. These rankings do include non-relational database management systems. I think MongoDB is classified as a NoSQL database, although NoSQL doesn’t mean “No SQL”, it means “Not Only SQL”. Most of the NoSQL databases out there today are more accurately “No Relational” than “No SQL”. The split between commercial and open source stands at about 1/3 open source and 2/3 commercial. So, what’s the big deal? The Relational Database Management Systems all use the SQL standard as their language, so how hard can it be to move data and/or applications between them? 9
  • 12. Quick show of hands. Who has written SQL? And who has written SQL with consideration given to the SQL standard? Not many, if any. SQL became an ANSI standard in 1986 and an ISO standard in 1987. Since then, the standard has been enhanced several times with added features. Despite these standards, code is not completely portable among different database systems. The different makers do not perfectly adhere to the standard, for instance by adding extensions, and the standard itself is sometimes ambiguous. There have been 7 revisions to the SQL standard since the SQL-86 ANSI standard. The most significant was the second revision in 1992 (SQL-92), whose entry-level standard was adopted as FIPS 127-2 (FIPS stands for Federal Information Processing Standards). This was significant because up until 1996 there was an independent body, the National Institute of Standards and Technology (NIST), that used to certify SQL DBMS compliance with the current SQL standard, but they stopped doing this in 1996. So the next release of SQL in 1999 (SQL-99) was the first release of the SQL standard where the database vendors self-certified their compliance against the standard, and for the last 18 years, through 4 more revisions to the SQL standard, 10
  • 13. database vendors have been self-certifying the compliance of their products with the latest SQL standard. If you really want your own copy of the latest SQL standard, SQL-2011, it’s available from standards.co.nz, but it comes in something like 13 parts and is not light reading. Each part costs about $250, so you’re looking at over $3000 for your own copy. So, how different are the SQL-based RDBMSs? Let’s take a look at a few comparisons and I think you’ll get the idea. 10
  • 14. First off, each RDBMS typically has very different internals. Database vendors are free to do whatever they like with things that aren’t covered by any standard, or indeed to interpret and implement any part of a standard as they see fit. So what we get are SQL-based RDBMSs that are very different under the hood. Concurrency (locking) models are very different and will affect the application in periods of high concurrency. Some databases may not flag a certain condition as an error while others will. In fact, you may have noticed during upgrade testing of Oracle that certain conditions that weren’t considered an error are now raised as one, due to Oracle tightening up on its error checking from version to version. I should note that there’s not a single database that follows the SQL standard 100%. Oracle, SQL Server, MySQL, DB2 and others each claim certain levels of support for the standard, but as you have seen with multiple versions of the standard and self-certification, even that statement is open to interpretation. 11
  • 15. Not all databases implement all the standard SQL datatypes, and if they do, they are often not the same. By way of example, I’d like to look at a very simple datatype, that is, the CHAR datatype. The CHAR datatype, as you probably know, is a fixed-length string datatype and is a core SQL standard datatype. I want to look at the two CHAR requirements, as specified by the SQL standard, and see how three of the leading RDBMSs have implemented them. 12
  • 16. So as you can see, even simple requirements for a basic datatype are not always implemented consistently. 13
  • 17. I don’t want to bore you with endless examples, so I’ve chosen one simple function from the SQL standard to highlight the point. String concatenation is a core function of the SQL standard and is done using the || operator with one of the rules that if any argument string is NULL then the resulting concatenated string is NULL. And speaking of NULLs… 14
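The NULL rule for || can be checked against any engine you have to hand. The following is a minimal sketch using Python's standard-library sqlite3 module; SQLite is only a convenient stand-in here (it is not one of the databases compared in this talk) and the helper name `concat` is made up for the illustration. SQLite follows the standard rule, whereas Oracle treats a NULL argument as an empty string and returns the other argument unchanged.

```python
import sqlite3


def concat(left, right):
    """Evaluate `left || right` inside SQLite and return the result.

    Python's None is bound as SQL NULL.
    """
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute("SELECT ? || ?", (left, right)).fetchone()[0]
    finally:
        conn.close()


if __name__ == "__main__":
    print(concat("Hetero", "geneous"))  # both arguments non-NULL
    print(concat("Hetero", None))       # one argument is NULL
```

Running the same two expressions on each platform you plan to support is a cheap way to find out which behaviour your application has been relying on.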
  • 18. So we all know what NULLs are, right? NULLs support the representation of "missing or inapplicable information". In SQL, NULL is a state (unknown) and not a value. Misunderstanding of how NULLs work is the cause of a great number of errors in SQL code. These mistakes are usually the result of confusion between NULL and either 0 (zero) or an empty string, which is a string value with a length of zero. NULL is defined by the ISO SQL standard as different from both an empty string and the numerical value 0. While NULL indicates the absence of a value, the empty string and numerical zero both represent actual values, and I think that’s the source of most of the confusion. Let’s look at how some of the RDBMSs handle just one aspect of NULL processing, that is, where NULLs sit when you sort a column containing NULLs. But first, what does the SQL standard have to say about this? The core standard doesn’t explicitly define a default sort order for NULLs, but in a 2003 optional extension, NULLs can be sorted using the NULLS FIRST or NULLS LAST addition to the ORDER BY clause, although not all vendors have implemented this. NULLs are ordered differently in Oracle compared with SQL Server or MySQL. So, depending on how your SQL statements are written, they could produce a 15
  • 19. different output if you executed the same (valid) SQL on Oracle or SQL Server. PostgreSQL is different again by the way. (orders NULLs higher than non-NULL values and allows the standard NULLS FIRST or NULLS LAST clauses) 15
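One way to stop NULL placement depending on the engine is to sort on an explicit "is it NULL" key first, which needs neither NULLS FIRST nor NULLS LAST. The sketch below uses Python's standard-library sqlite3 module purely as a test bed; the table `t`, the column `n`, and the helper name are hypothetical. SQLite, like SQL Server and MySQL, places NULLs first in an ascending sort, so the default result would differ on Oracle.

```python
import sqlite3

# Plain ORDER BY: where the NULLs land is decided by the engine.
ENGINE_DEFAULT = "n"

# An explicit CASE key pins the NULLs regardless of the engine default.
NULLS_LAST = "CASE WHEN n IS NULL THEN 1 ELSE 0 END, n"
NULLS_FIRST = "CASE WHEN n IS NULL THEN 0 ELSE 1 END, n"


def sorted_values(order_by):
    """Load 2, NULL, 1 into a scratch table and return them sorted."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (n INTEGER)")
        conn.executemany("INSERT INTO t (n) VALUES (?)", [(2,), (None,), (1,)])
        rows = conn.execute(f"SELECT n FROM t ORDER BY {order_by}").fetchall()
        return [row[0] for row in rows]
    finally:
        conn.close()


if __name__ == "__main__":
    for label, clause in [("default", ENGINE_DEFAULT),
                          ("nulls last", NULLS_LAST),
                          ("nulls first", NULLS_FIRST)]:
        print(label, sorted_values(clause))
```

The CASE form is more verbose than NULLS LAST, but it only uses constructs that every RDBMS mentioned here accepts.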
  • 20. So, how did a supposed standard come to be so different across the vendors implementing and supporting it? • The complexity and size of the SQL standard means that most implementers do not support the entire standard. • The standard doesn’t specify database behaviour in several important areas (e.g. indexes, file storage...), leaving the database vendors to decide how it should behave. • The SQL standard precisely specifies the syntax that a conforming database system must implement. However, the standard's specification of the semantics of language constructs is less well-defined, leading to ambiguity. • Many database vendors have large existing customer bases; where the newer version of the SQL standard conflicts with the prior behaviour of the vendor's database, the vendor may be unwilling to break this backward compatibility. • There is little commercial incentive for vendors to make it easier for users to change database suppliers. • Users evaluating database software tend to place other factors, such as performance, higher in their priorities than compliance with standards. 16
  • 21. So, how can you guard against your application issuing “non-standard” SQL? I want to introduce you to what I am confidently calling “The most useless piece of functionality in Oracle”. 16
  • 22. Trouble is, Oracle supports numerous features that extend beyond what they call standard SQL. According to the Oracle manual, and this is a quote, “If you are concerned with the portability of your applications to other implementations of SQL, then use Oracle's FIPS Flagger to help identify the use of Oracle extensions to SQL92.” FIPS, by the way, stands for Federal Information Processing Standard. It’s an American standard developed by the US Federal government and they are usually the same or slightly modified versions of ANSI, IEEE, or ISO standards. The FLAGGER parameter specifies FIPS flagging, which causes an error message to be generated when a SQL statement issued is an extension of the Entry Level of SQL-92, which is a standard that has been superseded by SQL2008 (but there is no FIPS certification for SQL2008). FLAGGER is a session level parameter only. You can’t set it at the database level, and why would you want to anyway? 17
  • 23. So, what happens when you set the fips FLAGGER? Here’s a simple test with a very basic table. So I create the table, then set the session level FLAGGER and try a very simple SQL statement. Now I’ve tried to make sense of the error message but the answer must be buried in the SQL standard and I’m not stumping up $3k to find out. 18
  • 24. But it gets even weirder. With the fips FLAGGER set, here’s a select using a NUMBER column and a numeric digit, and it works! But try an inequality match with != and it errors, telling you in the error message to try <> instead. But when you try that form of inequality, it still says that function is not part of the ANSI standard! 19
  • 25. Oh, and I have to show you this. Here’s what happens when you set the fips FLAGGER before creating the table. So it seems that the fips FLAGGER is either broken or so restrictive that it appears to be broken. In the end, FIPS 127-2 is 22 years old and is based on a version of SQL that is 5 versions old. The last version of Oracle that complied with FIPS 127-2 was (probably) Oracle 7. The standards body that certified compliance with the standard stopped 18 years ago. While SQL-92 has been superseded by other releases, there has been no conformance testing authority for any version of SQL since SQL-92; hence, Entry SQL-92 offers you the most assurance of portability. But the flagger appears to be broken and is practically useless. To be fair, Oracle had to include the fips FLAGGER in the code as part of their compliance with FIPS. (You’ve paid good money for all those neat Oracle features. Use them!) 20
  • 26. But an RDBMS is more than just datatypes and functions and its ability to execute SQL. There are a raft of other considerations if you are considering re-platforming your application to another database. In fact, depending on the application, often the data migration is one of the easier tasks. Much more difficult is the migration of things like stored code (PL/SQL) and security and access (users, privileges). There are tools available to help with re-platforming. SSMA does a reasonable job if you’re moving from Oracle to SQL Server. Oracle’s SQL Developer (apparently) does a reasonable job of migrating a selection of common RDBMSs to Oracle, and it’s free. 21
  • 27. So, just to wrap up this whole SQL standard thing. Don’t you love it when some consultant answers your question with the “It depends” answer? • No, because it frequently changes, is ambiguous in places, contains many optional parts, and no database vendor follows the standard 100%. • Yes, because it gives us, at the very least, a framework or common ground. Standards promote a common skill set amongst IT professionals. SQL’s a standard, but it’s a loose one at best. It’s useful for what it is, but don’t make assumptions about it. It’s not a paved highway between different types of RDBMS that will let you flip between vendors with ease. It’s more like a gravel road that provides a path, but you may get a bit dusty if you travel it. 22
  • 28. Ok, now that we have looked at some of the challenges with heterogeneous data, let’s take a look at some of the technology solutions currently available to assist with moving, replicating, or accessing data across different types of RDBMS. Before I start, I’ll note again that this is not a complete list of solutions, even for the top RDBMS’s mentioned at the start of the presentation. These are the ones that are most obvious as a solution or those that I’ve had some experience with so I feel I’m qualified to comment. 23
  • 29. Most databases have much better tools and utilities for getting data into their database compared with ways of transferring data to other types of databases. Here are some of the common ones, but there are also plenty of 3rd party utilities available, ranging from free to very expensive, and in my experience you definitely get what you pay for in this area. So if you’re looking to migrate your data from A to B, look at the tools and utilities available from B; they will usually be better than those from A. I guess this makes sense from a competitive point of view. Let’s make it easy for customers to move data into our database but don’t give them any help moving data out of our database. 24
  • 30. I’m going to give a special mention to MySQL and its migration tool, mainly because of its relationship to the Oracle RDBMS and what Oracle did to MySQL’s migration tool. A little bit of the interesting history behind MySQL: MySQL was created in 1995. In 2000 a company called Innobase developed the InnoDB storage engine for MySQL. This is what made MySQL a “real” RDBMS, as it included things like transactions, row-level locking, and foreign keys. In 2005, Oracle acquired Innobase, saying it wanted to increase support for Open Source software. (Yeah, right.) It was really a strategic move by Oracle to squeeze the life out of MySQL. Also in 2005, MySQL released a utility called the MySQL Migration Toolkit as part of the MySQL GUI Tools Bundle that offered Oracle to MySQL schema and data transfer. In 2008, Sun acquired MySQL. In 2010, Oracle purchased Sun and acquired MySQL in the process. Now, I thought Oracle would kill MySQL, but I'm happy to see they have continued to support and enhance the platform. Oracle OpenWorld this year had over 70 sessions around MySQL content. Although, the cynic in me thinks Oracle is still trying to keep MySQL from being a serious competitor with Oracle’s database. In 2010 MySQL added migration functionality to their MySQL Workbench utility, which replaced the Migration Toolkit. And when they did that, Oracle de-supported the Oracle database as a source for 25
  • 31. migration. So you couldn’t do Oracle to MySQL anymore. I can't find anything official except some forum comments to the effect that "Migration from Oracle DB's is not supported." So, if you want to migrate data from Oracle to MySQL, you can’t do it with MySQL Workbench, as Oracle has removed that functionality. There are other third party solutions for Oracle -> MySQL migrations, e.g. http://www.ispirer.com/products/oracle-to-mysql-migration Going the other way, as I’ve mentioned, Oracle’s SQL Developer can migrate a selection of common RDBMSs to Oracle, and it’s free. 25
  • 32. Also, a very quick mention of some of the products that enable you to capture changes to data in one type of database and apply those changes to another type of database. That is, replicating data between heterogeneous databases, and by this I mean synchronised copies in near real time. In heterogeneous environments, this typically means logical replication of the data, where the SQL executed on the source database that changes data (I’m talking about the DML statements of the SQL language like insert, update, delete) is extracted as it occurs, typically from the database’s transaction logs, and converted to the native SQL of the target database. This process is known as Change Data Capture, or CDC. Logical replication using Change Data Capture is often a viable solution in heterogeneous environments because these products have the ability to translate the changes into the native SQL of the target database, so once the bulk of the data has been migrated to the target, a heterogeneous CDC product can keep the two data sources in sync. There are many companies offering heterogeneous change data capture with Oracle being at least one of the source and/or target databases. • Oracle GoldenGate http://www.oracle.com/us/products/middleware/data-integration/goldengate/overview/index.html 26
  • 33. • Dbvisit Replicate http://www.dbvisit.com/products/dbvisit_replicate_real_time_oracle_database_replication/ • Dell SharePlex http://www.quest.com/shareplex-for-oracle/ • Attunity Replicate http://www.attunity.com/products/attunity-replicate • Informatica Data Replication http://www.informatica.com/uk/products/data-replication/data-replication/ • HVR Software http://www.hvr-software.com/product/real-time-database-replication • Astera Change Data Capture http://www.astera.com/solutions/technology-solutions/change-data-capture • Gravic Shadowbase http://www.gravic.com/shadowbase/solutions/overview.html • IBM InfoSphere http://www-03.ibm.com/software/products/en/ibminfochandatacapt • Hit Software DBMoto http://www.hitsw.com/products_services/dbmoto/DBMoto_for_Oracle.html 26
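The capture-and-apply idea behind these products can be sketched in a few lines. This is a toy illustration only, not how any of the products listed above work internally: the change-record layout (`op`, `table`, `key`, `values`) and the function names are assumptions made up for the example, the "captured" changes are hard-coded rather than mined from a transaction log, and an in-memory SQLite database (Python's standard sqlite3 module) stands in for the target.

```python
import sqlite3


def to_target_sql(change):
    """Translate one captured change record into (sql, params) for the target."""
    table, op = change["table"], change["op"]
    if op == "insert":
        cols = list(change["values"])
        marks = ", ".join("?" for _ in cols)
        sql = f"INSERT INTO {table} ({', '.join(cols)}) VALUES ({marks})"
        return sql, [change["values"][c] for c in cols]
    if op == "update":
        cols, keys = list(change["values"]), list(change["key"])
        sql = (f"UPDATE {table} SET " + ", ".join(f"{c} = ?" for c in cols)
               + " WHERE " + " AND ".join(f"{k} = ?" for k in keys))
        params = [change["values"][c] for c in cols]
        return sql, params + [change["key"][k] for k in keys]
    if op == "delete":
        keys = list(change["key"])
        sql = f"DELETE FROM {table} WHERE " + " AND ".join(f"{k} = ?" for k in keys)
        return sql, [change["key"][k] for k in keys]
    raise ValueError(f"unsupported operation: {op}")


def apply_changes(conn, changes):
    """Apply the changes in source order, inside one target transaction."""
    with conn:  # commits on success, rolls back on error
        for change in changes:
            sql, params = to_target_sql(change)
            conn.execute(sql, params)


def demo():
    target = sqlite3.connect(":memory:")
    target.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    changes = [  # stand-in for records captured on the source
        {"op": "insert", "table": "customers", "values": {"id": 1, "name": "Ann"}},
        {"op": "insert", "table": "customers", "values": {"id": 2, "name": "Bob"}},
        {"op": "update", "table": "customers", "key": {"id": 1}, "values": {"name": "Anne"}},
        {"op": "delete", "table": "customers", "key": {"id": 2}},
    ]
    apply_changes(target, changes)
    rows = target.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
    target.close()
    return rows


if __name__ == "__main__":
    print(demo())
```

Identifying rows by key rather than replaying the original statement text is what lets the apply side generate SQL in the target's own dialect. A real product also has to deal with datatype mapping, conflict handling, and DDL, none of which is attempted here.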
  • 34. In the final section of this presentation, I’d like to talk briefly about Oracle’s solution to heterogeneous data access from other relational data sources. Oracle’s had this functionality for many years but it’s gone through a number of name changes. It started off being something called SQL*Connect, then Transparent Gateways, but the latest name under 12c is Oracle Database Gateways. But it’s essentially part of what Oracle called Heterogeneous Services under 11g. Oracle Gateways allow heterogeneous data access from other relational data sources to an Oracle application. Gateways are available for RDBMSs like DB2 and SQL Server but also non-relational data sources like Excel, transaction managers like IBM’s CICS, and message queuing systems like IBM’s MQ. The Gateways are a separately purchased option but are available, with a couple of exceptions, for both Standard and Enterprise database editions. These gateways handle some of the issues I have been talking about, like SQL translations, dictionary translations, and datatype mappings. The gateways for specific databases aren’t cheap: about the same per-processor license cost as GoldenGate. However, the Database Gateway for ODBC, which is a 27
  • 35. generic gateway for any ODBC-compliant non-Oracle system, is free with the database, although it is more functionally restricted than the specific Database Gateways and you typically still need to purchase an ODBC driver. 27
  • 36. The way they work is that SQL statements are translated into the SQL of the non-Oracle database. With SQL statements, if the functionality is missing on the non-Oracle system, then either a simpler query is issued, or the statement is broken up into multiple queries and the results are obtained by post-processing in the Oracle database. Remember, most of these features come with a list of restrictions and limitations to capability, so it’s not as simple as I’ve described it. For example, the Heterogeneous Connectivity User’s Guide lists 10 rules restricting the use of SQL statements in a heterogeneous distributed environment, so it’s not 100% transparent. But here are a couple of examples of what I’m talking about. 28
  • 37. All RDBMSs store metadata, that is, data about the data. Trouble is, they all store this information in different ways. One of the facilities that the Gateway provides is data dictionary translation. So the example shows Oracle executing a select from the ALL_CATALOG data dictionary view, but through a link to a SQL Server database. The Gateway intercepts the query and translates it into the dictionary objects of the SQL Server database. The results of the new query are then returned to the user as if the information came from the ALL_CATALOG view within Oracle. 29
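The "simpler query plus post-processing" behaviour can be pictured with a small sketch. It is an analogy only and makes no claim about how Oracle's gateways are implemented: the capability set `REMOTE_FUNCTIONS`, the local function table, and `select_with_function` are all hypothetical names, and an in-memory SQLite database (Python's standard sqlite3 module) plays the part of the non-Oracle system. SQLite has no INITCAP function, so that call cannot be pushed down and is finished locally.

```python
import sqlite3

# Assumed capability list for the remote engine (hypothetical).
REMOTE_FUNCTIONS = {"UPPER", "LOWER", "LENGTH"}

# Local equivalents used when the remote side cannot evaluate a function.
LOCAL_FUNCTIONS = {
    "UPPER": str.upper,
    "LOWER": str.lower,
    "LENGTH": len,
    "INITCAP": str.title,
}


def select_with_function(remote, func, column, table):
    """Return (values, strategy) for SELECT func(column) FROM table."""
    if func in REMOTE_FUNCTIONS:
        # The remote engine understands the function: push the whole query down.
        sql = f"SELECT {func}({column}) FROM {table} ORDER BY rowid"
        return [row[0] for row in remote.execute(sql)], "pushed down"
    # Otherwise issue a simpler query and finish the work locally.
    sql = f"SELECT {column} FROM {table} ORDER BY rowid"
    local = LOCAL_FUNCTIONS[func]
    return [local(row[0]) for row in remote.execute(sql)], "post-processed"


def demo(func):
    remote = sqlite3.connect(":memory:")
    remote.execute("CREATE TABLE emp (name TEXT)")
    remote.executemany("INSERT INTO emp (name) VALUES (?)",
                       [("john smith",), ("ANNE LEE",)])
    result = select_with_function(remote, func, "name", "emp")
    remote.close()
    return result


if __name__ == "__main__":
    print(demo("UPPER"))
    print(demo("INITCAP"))
```

The cost of the fallback is that unprocessed rows have to travel across the link before the local step runs, which is one reason the same statement can perform very differently depending on what the remote system supports.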
  • 38. There’s a package that’s part of Oracle’s Heterogeneous Services that deserves special mention. Using the DBMS_HS_PASSTHROUGH package allows you to execute SQL statements directly on the non-Oracle system without them being interpreted by the Oracle database. What’s special about DBMS_HS_PASSTHROUGH is that it’s a virtual package: it doesn’t exist in the Oracle or non-Oracle system, yet it still works! Conceptually it resides on the non-Oracle system, but in reality, calls to the package are intercepted by the Heterogeneous Services component of Oracle and mapped to one of the Gateway calls. 30
  • 39. And so, just to wrap up before I take any questions, here are a few key points from the session. • Know that there are options out there for accessing and moving data between different types of data stores. Depending on your position within your company, you may not need to be aware of them all, but at least know someone who is and who can select the one that’s right for you. • SQL databases are not the same, but with the SQL language, and some careful consideration, they can work together, often seamlessly. • Don’t sacrifice performance and features for conformity. Use what you have been given, and paid for, to the best of its ability. 32
  • 40. I want to leave a bit of time for questions so I’ve skipped a few topics like database abstraction layers and ODBC. Also, the combinations of lots of RDBMS’s and business use-case requirements results in dozens of different functional specifications and it would be impossible to cover all the options in this type of session but knowing the capabilities of the available options will help you select the best fit to your requirements. 33
  • 41. Ok, I hope you found that interesting and learnt a few things along the way. Well, thank you for your attendance, and please enjoy the rest of the conference. Thanks! 34