E&P data management: Implementing data standards



Implementing corporate data standards in the field, using metadata-centric tools.

Published in: Technology


  1. Raising data management standards (www.etlsolutions.com)
     E&P Data Standardisation: Implementing corporate data standards in the field using metadata-centric tools
  2. The challenge (1)
     • Many companies are today engaged in the standardisation of the data they hold in corporate data stores. In the Geoscience sector, where ETL Solutions is active, a great variety of data stores are used, such as OpenWorks, Corporate Data Store, Seabed, Finder and PPDM.
     • These data stores usually define formal catalogue-controlled values to assist with enforcing standards. However, this data, usually held in database tables (which themselves contain catalogue-controlled values), has often been extended or adapted for regional circumstances.
     • This causes two day-to-day issues:
       1. Corporate reports assembled from such data can easily be misleading. An attempt to discover something as simple as the relative number of onshore and offshore wells may fail if one of the regional staff has decided to classify wells as tidal, shallow water and deep water, or just as on and off.
       2. Engineers trained in one region, or working for one subcontractor, can find it difficult to work in another region where standard reference values are chosen from a different set.
     • While these are awkward issues, it is typically difficult to assign a direct cost to any accounting centre and hence calculate the benefit of fixing them. A more pressing reason often arises when stores are to be upgraded to newer (and more consistent) versions, or when data held in legacy stores needs to be cleaned sufficiently for migration into corporate standard stores. Often, the day-to-day issues are tolerated until a migration is required by the business.
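The reporting issue in point 1 above can be sketched in a few lines of Python. The well classifications and the mapping below are illustrative values only, not taken from any real data store:

```python
# Hypothetical sketch: inconsistent catalogue values make corporate roll-ups misleading.

# One region classifies wells as onshore/offshore; another uses a finer (or terser) scheme.
region_a_wells = ["ONSHORE", "OFFSHORE", "ONSHORE"]
region_b_wells = ["TIDAL", "SHALLOW WATER", "DEEP WATER", "ON"]

def count_offshore(wells):
    """Naive corporate report: count wells classified exactly as OFFSHORE."""
    return sum(1 for w in wells if w == "OFFSHORE")

# The naive report sees only region A's single offshore well and misses region B entirely.
print(count_offshore(region_a_wells + region_b_wells))  # 1

# A mapping from regional values to the corporate standard fixes the report.
TO_STANDARD = {
    "ONSHORE": "ONSHORE", "ON": "ONSHORE",
    "OFFSHORE": "OFFSHORE", "OFF": "OFFSHORE",
    "TIDAL": "OFFSHORE", "SHALLOW WATER": "OFFSHORE", "DEEP WATER": "OFFSHORE",
}
standardised = [TO_STANDARD[w] for w in region_a_wells + region_b_wells]
print(standardised.count("OFFSHORE"))  # 4
```

The same mapping idea, scaled up to hundreds of tables, is what the process described in the following slides automates.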
  3. The challenge (2)
     • What makes this a tricky problem is the mismatch between the tools available to the average engineer (usually not much more than Excel and SQL*Plus) and the extreme complexity of a modern data store: PPDM 3.8, for example, contains over 1,700 tables.
     • A hand-tool approach to such a problem is going to be very expensive, tying up valuable staff for weeks on end and quite possibly failing.
     • It is in these circumstances that metadata tools, in the right domain specialist's hands, come into their own. ETL Solutions' Transformation Manager, for example, is capable of introspecting and interpreting the metadata from any one of these stores in less than a minute.
     • With Transformation Manager, it is possible to see the structure of the data in an easily understood graphical user interface (GUI) and to navigate the data (or instance values) within the same interface. Much more usefully than merely viewing the data, it is possible to use the metadata to directly discover, and then manipulate, all the data that will be affected by changes in standards values.
  4. Process
     • The client’s pilot project uses a metadata-driven method to support the data standardisation and migration of legacy databases to the client’s standards.
     • Transformation Manager has been used in a once-only ‘Prepare’ phase to automatically create transforms that will be used during every project migration.
     • These generic transforms will be run during the Extract, Verify and Update phases of the migration process (see the diagram opposite).
     [Diagram: the migration process (Extract, Review, Verify, Update), with mapping spreadsheets, error reports, OW data dictionary changes and the standards as its inputs and outputs.]
  5. Standard values
     • The client’s standards are defined using two spreadsheets; this is the starting point for each migration of a legacy project.
     • The standard values spreadsheet (below) contains a worksheet for each table that holds standard values. These tables comprise a number of verification code tables (_VC), a number of reference data tables (_R), and other tables, e.g. OW_DATA_SOURCE, that contain standards data. Each worksheet consists of a header row specifying the fields defined by the company’s standards, followed by rows which provide the set of permitted values.
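A worksheet of this shape can be loaded into simple lookup sets. A minimal sketch, assuming each table's worksheet has been exported to a CSV file; the function and field names are illustrative, not the actual tooling:

```python
import csv

def load_standard_values(path):
    """Read one table's standards worksheet (as CSV): a header row naming the
    standardised fields, followed by rows of permitted values.
    Returns {field_name: set_of_permitted_values}."""
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
    standards = {}
    for row in rows:
        for field, value in row.items():
            if value:  # columns may have different numbers of permitted values
                standards.setdefault(field, set()).add(value)
    return standards
```

Usage: `load_standard_values("well_env_vc.csv")` might return `{"ENVIRONMENT": {"ONSHORE", "OFFSHORE"}, ...}`, which the later Extract and Verify steps can test membership against.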
  6. Data dictionary
     • The data dictionary spreadsheet, shown above, defines data to be entered into the OW Data_Dict table to specify default values or data ranges covered by the company standards.
     • It is used to set default values, which may be null, and a range of values that each field must conform to, expressed either as a numeric value range, e.g. ‘0..10e+4’ for CASING_SIZE, or as a set of value strings.
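Checking a field value against such a data dictionary entry is straightforward to sketch. The ‘0..10e+4’ range syntax is taken from the slide; the parsing logic and function name below are illustrative assumptions, not the actual implementation:

```python
def conforms(value, spec):
    """True if value satisfies spec: either a 'lo..hi' numeric range
    or a comma-separated set of permitted value strings."""
    if ".." in spec:
        lo, hi = spec.split("..")
        try:
            return float(lo) <= float(value) <= float(hi)
        except ValueError:
            return False  # non-numeric value cannot satisfy a numeric range
    return value in {s.strip() for s in spec.split(",")}

print(conforms("9.625", "0..10e+4"))           # True  (e.g. a CASING_SIZE value)
print(conforms("-1", "0..10e+4"))              # False (below the range)
print(conforms("TIDAL", "ONSHORE, OFFSHORE"))  # False (not a permitted string)
```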
  7. Mapping files
     • Any project can be updated by running a command-line script during an Extract phase to determine any non-standard values used by the project. This automatic process reviews the project tables, determining any values currently in use which require updating to conform to the client’s standards. A set of mapping CSV files is created containing the current values which are not to standard.
     • A mapping file (.CSV) is created for each table and its natural key field, as shown opposite, containing the values that are not present in the client’s standards.
     • An absent mapping spreadsheet indicates that all current values are consistent with the standard values. A mapping file initially contains the set of existing values used in the project and forms the interface for a user to enter a standards value to be used instead. When mapping spreadsheets have already been created for a previous project, these can be provided and will be updated during the extract process; in this case, the spreadsheet will contain some new standard values as recommendations for the project migration.
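The Extract step described above can be sketched as follows; the column names and file layout are illustrative assumptions, not the actual mapping-file format:

```python
import csv
import os

def extract_mapping(table, values_in_use, standard_values, out_dir="."):
    """For one table, write <table>_mapping.csv listing values in use that are
    absent from the standards, with an empty column for the user to fill in
    during Review. Returns the non-standard values found (no file is written
    when the project already conforms)."""
    non_standard = sorted(set(values_in_use) - set(standard_values))
    if not non_standard:
        return []  # absent mapping file: all current values are standard
    path = os.path.join(out_dir, f"{table}_mapping.csv")
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["current_value", "standard_value"])  # user fills 2nd column
        for v in non_standard:
            writer.writerow([v, ""])
    return non_standard
```

The empty second column is the interface the slide describes: during Review, a user with knowledge of the project enters the standard value to substitute for each non-standard one.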
  8. Review reports
     • One of the principal activities the user undertakes, referred to as the Review phase, is to check the mapping spreadsheets and provide any standard values that are required in place of the non-standard values presently used.
     • Once this has been completed, the user again runs a command-line script to verify that, once the client’s standard values are applied, the project will be consistent with the standards. Transformation Manager transforms are run to verify that this is the case.
     • The output is a set of reports; one of these, as viewed in Excel, is shown below.
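The verification logic behind such a report can be sketched as follows; the data structures and report columns are illustrative assumptions, not the actual report format:

```python
def verify(values_in_use, mapping, standard_values):
    """Apply the user's Review-phase mappings and return error-report rows for
    any value that would still be non-standard. An empty list means the
    migration can proceed."""
    errors = []
    for v in values_in_use:
        target = mapping.get(v, v)  # unmapped values pass through unchanged
        if target not in standard_values:
            errors.append({"current_value": v,
                           "mapped_to": target,
                           "error": "not a standard value"})
    return errors

mapping = {"TIDAL": "OFFSHORE", "DEEP WATER": "OFFSHORE"}
standards = {"ONSHORE", "OFFSHORE"}
# "SHALLOW WATER" has no mapping yet, so it appears in the report:
print(verify(["TIDAL", "ONSHORE", "SHALLOW WATER"], mapping, standards))
```

The Verify stage iterates this check until the report comes back empty, at which point the Update phase can safely proceed.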
  9. Process to completion
     • During the Verify stage, the user corrects all non-standard values and data dictionary errors until the reports indicate that a completely consistent project will be produced if the migration proceeds.
     • Up to this stage, the project data has been accessed in read-only mode and the project has been available to all other users. It is not until the final Update phase of the project migration that exclusive access to the project is required.
     • Once verification has been completed, the final Update phase is again run from the command line and completes the migration process.
     • The Extract, Verify and Update phases implemented using Transformation Manager are completely automatic, and provide warnings if the migration process is not completely compatible with the client’s standards. The main user input is in the Review phase, where a user with knowledge of the project specifies which standard values shall be used to replace any non-standard values presently in use.
     • Throughout the migration phases, the user operates in a simple user environment, with minimal involvement of system or database administrators. The project remains online until the final Update phase, which requires exclusive access to the project database for a relatively short duration.
  10. Conclusions
     • If the business wants to leverage the benefits of corporate data which conforms to a defined set of standards, metadata-driven standardisation tools will enable the business to truly implement and maintain those standards.
     • Such a programme can also be delivered with a fraction of the effort, time and risk that often adversely affect even well-supported initiatives.
     • Transformation Manager delivers corporate data standards, ensuring they are implemented, enforced and inherently more flexible to meet changing needs.
     • Finally, in the future, the migration process can easily be extended to other Geoscience areas of interest, and to other types of database stores in virtually any sector.
  11. Our offerings: E&P data management
     • Transformation Manager software
     • Transformation Manager data loader developer kits
     • Support, training and mentoring services
     • Data loader and connector development
     • Data migration packaged services
  12. Why Transformation Manager? For the user:
     • Everything under one roof
     • Greater control and transparency
     • Identify and test against errors iteratively
     • Greater understanding of the transformation requirement
     • Automatic documentation
     • Re-use and change management
     • Uses domain-specific terminology in the mapping
  13. Why Transformation Manager? For the business:
     • Reduces cost and effort
     • Reduces risk in the project
     • Delivers higher quality and reduces errors
     • Increases control and transparency in the development
     • Single product
     • Reduces time to market
  14. Contact information
     Raising data management standards (www.etlsolutions.com)
     Karl Glenn
     kg@etlsolutions.com
     +44 (0) 1912 894040