The need for Interoperability in Office and GIS formats


    1. Free GIS and Interoperability GIS Open Source, interoperabilità e cultura del dato nei SIAT della Pubblica Amministrazione [GIS Open Source, interoperability and the 'culture of data' in the spatial data warehouses of the Public Administration] GFOSS'04 ITC-irst, 16 Nov 2004 (last revised 10 2005) M. Neteler neteler at itc it http://mpa.itc.it ITC-irst, Povo (Trento), Italy
    2. The need for Interoperability The problem
      • nowadays data have to be exchanged across often very heterogeneous groups
      • the personal choice of application software/operating system should not affect the data exchange
      • data exchange standards are available
      • limited awareness for the need of interoperability
      • limited implementation of interoperability in processes and software
      • commonly used file formats lead users to believe in interoperability that is not there: “false friends”
    3. What are Standardization & Interoperability? Standardization versus Interoperability
      • Standardization: a written, published document describing data formats, models etc. Example Office standards: ASCII, HTML, XML, ... Example GIS standards: GML, ISO 8211, ISO/IEC 15444-1, WMS etc. Only published standards are acceptable.
      • Interoperability: more than the application of a standard; it also comprises the interpretation of the standard (sometimes definitions are incomplete).
    4. Interoperability? The two dimensions of Interoperability
      • Longitudinal Interoperability: time, long-term storage. Data shall be readable over time (years, decades, ...). This is of particular interest for data of public administration and long-term projects.
      • Transversal Interoperability: sharing data between users. Data shall be readable across user communities, independent of the software or operating system used (freedom of software choice). Again, this is of particular interest for data of public administration and long-term projects.
    5. Part I: Office Interoperability
    6. Example: MS-Word .DOC format Are WORD.doc files suitable for data exchange?
      • the format is undocumented; to some extent it was reverse-engineered -> does not support transversal interoperability
      • the format is regularly changed (Word 1, 2, 95, 97, NT, 2000, XP, ... also named WinWORD 6, 8, 10, ...) -> does not support longitudinal interoperability
      • prone to MS-Windows macro viruses
      • severe security/privacy issues (example next slide) - DOC files contain sensitive information about the user (unrelated to the contents) - deleted text may still be legible outside of MS-Word -> contents cannot be completely verified
    7. Example: MS-Word .DOC format - security/privacy issues Descrambling a WORD.doc file
      What you may find:
      • Your unique MS-Windows user ID (or similar): PID_GUIDäAN{714738E3-FF4C-11D3-ZD7C-00E0281D67A7} This makes your (anonymous) document traceable.
      • Sometimes deleted text is still visible (think of re-using an existing WORD file). A famous example: In February 2003, the British government of Tony Blair published a dossier on Iraq's security and intelligence organizations. This dossier was cited by Colin Powell in his address to the United Nations the same month. Dr. Glen Rangwala, a lecturer in politics at Cambridge University, quickly discovered that much of the material in the dossier was actually plagiarized from a U.S. researcher on Iraq. http://www.computerbytesman.com/privacy/blair.htm
    8. Example: MS-Word .DOC format - security/privacy issues Descrambling a WORD.doc file: The British Iraq dossier 2003 1/2 http://nytimes.com
    9. Example: MS-Word .DOC format - security/privacy issues Descrambling a WORD.doc file: The British Iraq dossier 2003 2/2
      [neteler@dandre2 gfoss04]$ tr -d '[:cntrl:]' < blair.doc
      ÐÏࡱá>þÿz|þÿÿÿyÿ [...]
      -xxxxí-o#o#{'?^,k6®äí-* RûuËÂG (É-$IRAQ ITS INFRASTRUCTURE OF CONCEALMENT, DECEPTION AND INTIMIDATIONThis report draws upon a number of sources, including intelligence material, and shows how the Iraqi regime is constructed to have, and to keep, WMD, and is now engaged in a campaign of obstruction of the United Nations Weapons Inspectors. [...]
      [`azbhh§h»h?h-i/isjÿÿ cic22 JC:DOCUME~1 phamill LOCALS~1TempAutoRecovery save of Iraq - security.asd cic22 JC:DOCUME~1 phamill LOCALS~1TempAutoRecovery save of Iraq - security.asd cic22 JC:DOCUME~1 phamill LOCALS~1TempAutoRecovery save of Iraq - security.asd JPratt C:TEMPIraq - security.doc JPratt A:Iraq - security.doc ablackshaw!C: ABlackshaw Iraq - security.docablackshaw#C: ABlackshaw A;Iraq - security.doc ablackshaw A:Iraq - security.doc MKhan C:TEMPIraq - security.doc MKhan (C:WINNTProfilesmkhanDesktopIraq.docþÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ PjÿzXVÿ*uzLl_ÿbêzLl_ [...]
      jP@GTimes New Roman5SymbolG&ArialHelveticaA&Arial Narrow?&Arial Black"qÐh_r&Òr&aõq#JV,?RVW,º!¥À??20døi?fÿÿCIraq- ITS INFRASTRUCTURE OF CONCEALMENT, DECEPTION AND INTIMIDATIONdefaultMKhanþÿàòùOh«+'³Ù0? ìø 4DPlx?¬?äDIraq- ITS INFRASTRUCTURE OF CONCEALMENT, DECEPTION AND INTIMIDATIONraqdefaultefaefaNormal.dotN MKhan .d4ha Microsoft Word 8.0 C@ÒIk@n)§ÈÂ@"ZöfËÂ@døèuËÂ#JVþÿÕÍÕ [...]
      WMD: Weapons of mass destruction
      http://www.computerbytesman.com/privacy/blair.htm
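The `tr -d [:cntrl:]` command on this slide strips control characters so that the readable residue of the binary file shows up. A minimal Python sketch of the same idea follows, assuming that runs of printable ASCII are what we want to see. The function name `printable_runs` and the sample bytes are hypothetical, made up for illustration; text stored in 16-bit encodings would be missed by this simple scan.

```python
import re


def printable_runs(data, min_len=6):
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# Synthetic stand-in for a binary document (an assumption, not the real dossier).
sample = (
    b"\xd0\xcf\x11\xe0\x00\x01IRAQ - ITS INFRASTRUCTURE\x00\x02\xff"
    b"JPratt C:\\TEMP\\Iraq - security.doc\x00\x03ab\x00"
)

runs = printable_runs(sample)
for run in runs:
    print(run)
```

Short fragments such as `ab` are dropped by the `min_len` threshold, which keeps most binary noise out of the listing.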
    10. Example: MS-Excel .XLS format Are EXCEL.xls files suitable for data exchange?
      • the format is undocumented; to some extent it was reverse-engineered -> does not support transversal interoperability
      • the format is regularly changed (Excel 95, 97, NT, 2000, ...) -> does not support longitudinal interoperability
      • prone to MS-Windows viruses
      • Limitation: max. 65536 rows in a table (2^16)
      • Auto-conversion feature is risky: some fields/columns are automatically changed to date-time format (see example on the next slides) -> high risk of accidental data damage
    11. Example: MS-Excel .XLS format – accidental data damage The “Human Genome Project” case 1/3
      • In 2004 scientists discovered that some gene names were being changed inadvertently to non-gene names. Citation: “A little detective work traced the problem to default date format conversions and floating-point format conversions in the very useful Excel program package. The date conversions affect at least 30 gene names; the floating-point conversions affect at least 2,000 if Riken identifiers are included. These conversions are irreversible; the original gene names cannot be recovered. A default date conversion feature in Excel (Microsoft Corp., Redmond, WA) was altering gene names that it considered to look like dates. For example, the tumor suppressor DEC1 [Deleted in Esophageal Cancer 1] [3] was being converted to '1-DEC.'”
      Cited after: B.R. Zeeberg, J. Riss, D.W. Kane, K.J. Bussey, E. Uchio, W.M. Linehan, J.C. Barrett and J.N. Weinstein, BMC Bioinformatics 2004, 5:80 http://dx.doi.org/10.1186/1471-2105-5-80
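One way to guard against this kind of damage is to flag at-risk identifiers before a table is opened in a spreadsheet. The sketch below is an assumption-laden heuristic and does not reproduce Excel's actual conversion rules: it marks names that look like a month plus a day number (such as DEC1 from the citation) or like scientific notation (a digits-E-digits identifier, used here as an illustrative Riken-style example). The names `conversion_risk`, `DATE_LIKE` and `FLOAT_LIKE` are hypothetical.

```python
import re

# Month spellings a spreadsheet might read as the start of a date (assumed list).
MONTHS = (
    "JAN", "FEB", "MAR", "APR", "MAY", "JUN", "JUL", "AUG",
    "SEP", "SEPT", "OCT", "NOV", "DEC", "MARCH", "APRIL", "JUNE", "JULY",
)

DATE_LIKE = re.compile(r"^(?:%s)-?\d{1,2}$" % "|".join(MONTHS), re.IGNORECASE)
FLOAT_LIKE = re.compile(r"^\d+(?:\.\d+)?E[+-]?\d+$", re.IGNORECASE)


def conversion_risk(name):
    """Classify an identifier as 'date', 'float' or None (no risk detected)."""
    if DATE_LIKE.match(name):
        return "date"
    if FLOAT_LIKE.match(name):
        return "float"
    return None


for identifier in ["DEC1", "SEPT2", "2310009E13", "TP53"]:
    print(identifier, conversion_risk(identifier))
```

Identifiers that are flagged can then be imported explicitly as text columns, or kept out of spreadsheet tools altogether.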
    12. Example: MS-Excel .XLS format – accidental data damage The “Human Genome Project” case 2/3 http://dx.doi.org/10.1186/1471-2105-5-80
    13. Example: MS-Excel .XLS format – accidental data damage The “Human Genome Project” case 3/3 http://dx.doi.org/10.1186/1471-2105-5-80
    14. Suggestions for “Office” data interoperability
      • Text files: ASCII, HTML, RTF, XML, LaTeX; PostScript/PDF for read-only documents
      • Tables: CSV, xBase (dBase), XML
      • Databases: SQL92-ASCII
      • Bibliography: BibTeX
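As an illustration of the CSV suggestion for tables, here is a minimal sketch that writes a small table as CSV with every field quoted and reads it back unchanged. The column names and values are hypothetical. Quoting keeps identifiers as plain text for this reader; it is not claimed to stop every spreadsheet program from reinterpreting them on import.

```python
import csv
import io

# Hypothetical table; all values are kept as strings on purpose.
rows = [
    ["sample_id", "gene", "expression"],
    ["0042", "DEC1", "1203.5"],
    ["0107", "SEPT2", "987.0"],
]

# Write the table with every field quoted.
buffer = io.StringIO()
writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
writer.writerows(rows)
text = buffer.getvalue()
print(text)

# Read it back: the csv module returns every field as a string.
restored = list(csv.reader(io.StringIO(text)))
```

Because the reader returns strings only, leading zeros and date-like names survive the round trip.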
    15. Suggestions for “Office” data interoperability Automated conversion tools can be used to provide all formats
      Formats and converters (examples):
      • Text files: ASCII, HTML, RTF, XML, PostScript/PDF. Converters: OpenOffice.org [1], wvWare [2]
      • Tables: CSV, xBase (dBase), XML. Converters: OpenOffice.org, xbase2pg [3]
      • Databases: SQL92-ASCII. Converters: ODBC, xbase2pg
      • Bibliography: BibTeX. Converters: Bibutils [4], Bibtex2html [5], (Endnote)
      [1] http://OpenOffice.org (itself uses XML as its own standard format)
      [2] http://wvware.sourceforge.net/
      [3] http://www.klaban.torun.pl/prog/pg2xbase/
      [4] http://www.scripps.edu/~cdputnam/software/bibutils/bibutils.html
      [5] http://www.lri.fr/~filliatr/bibtex2html/
    16. OASIS: “Office” data interoperability Promotion of Open Document Exchange Format
      • Proposed and implemented new open standard format: OASIS OpenDocument XML format
      • The OASIS OpenDocument format [1] is a vendor- and implementation-independent file format which guarantees freedom and independence
      • E.g., OpenOffice.org uses the OASIS OpenDocument format as its default format from version 2.0 onwards, as do KOffice, StarOffice and software from other vendors
      The OASIS OpenDocument file format is one of the file formats recommended by the European Commission [2]
      [1] http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=office
      [2] http://europa.eu.int/idabc/en/document/3439
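Because OpenDocument is a published XML format packaged in a ZIP archive, its text can be read with generic tools and no office suite. The sketch below builds a deliberately minimal package in memory (only `mimetype` and `content.xml`, which is an assumption for illustration and not a complete, valid document) and extracts the paragraphs with the Python standard library. The helper names `build_package` and `read_paragraphs` are hypothetical.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Minimal content.xml with two paragraphs (illustrative only).
CONTENT_XML = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<office:document-content xmlns:office="%s" xmlns:text="%s">'
    "<office:body><office:text>"
    "<text:p>Interoperable formats stay readable.</text:p>"
    "<text:p>No vendor tool is required.</text:p>"
    "</office:text></office:body>"
    "</office:document-content>" % (OFFICE_NS, TEXT_NS)
)


def build_package():
    """Create a tiny OpenDocument-style ZIP package in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/vnd.oasis.opendocument.text",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("content.xml", CONTENT_XML,
                    compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


def read_paragraphs(package):
    """Return the text of every text:p element found in content.xml."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        root = ET.fromstring(zf.read("content.xml"))
    return ["".join(p.itertext()) for p in root.iter("{%s}p" % TEXT_NS)]


paragraphs = read_paragraphs(build_package())
print(paragraphs)
```

The same `read_paragraphs` approach should work on a real .odt file read from disk as bytes, although real documents carry more parts and nested markup than this sketch handles.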
    17.  
    18. GIS Standards and Organizations GIS data sets are more than geometry: Metadata - geographic reference - colors, display attributes etc. - history of data modifications [Timeline figure: 1990, 1992, 1994, 1997, 2004] http://www.opengeospatial.org
    19. GIS Interoperability: GDAL and OGR libraries Data abstraction: GDAL http://www.gdal.org
      • Abstraction layer over 40 formats, e.g. ENVI, GeoTIFF, SAR, GRASS, ECW, HDF4, JPEG2000, MrSID, ArcGRID
      • Metadata: number of bands, color table, ..., coordinate system, projection (EPSG codes, PROJ.4)
    20. GIS Interoperability: GDAL and OGR libraries Data abstraction: OGR http://www.gdal.org/ogr/
      • Abstraction layer over 20 formats, e.g. ArcCover, MITAB, Oracle, SHAPE, PostGIS, Geodatabase, DGN
      • Metadata: coordinate system, projection (EPSG codes)
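The abstraction-layer idea on these two slides can be sketched in a few lines: several format-specific readers sit behind one common data model, and the application only ever sees that model. The code below is a toy illustration and is not the GDAL or OGR API. The `Raster` class, the reader functions and `open_raster` are hypothetical names, and the two readers handle only simplified forms of the ESRI ASCII grid and plain-text PGM layouts.

```python
from dataclasses import dataclass


@dataclass
class Raster:
    """Common in-memory model shared by all toy drivers."""
    driver: str
    width: int
    height: int
    cells: list  # list of rows, each a list of floats


def read_ascii_grid(text):
    """Toy reader for the ESRI ASCII grid layout (header keys, then rows)."""
    lines = text.strip().splitlines()
    header = {}
    i = 0
    while i < len(lines):
        parts = lines[i].split()
        if len(parts) != 2 or not parts[0][0].isalpha():
            break  # first non-header line: the data rows start here
        header[parts[0].lower()] = parts[1]
        i += 1
    if "ncols" not in header or "nrows" not in header:
        return None  # not this format
    cells = [[float(v) for v in line.split()] for line in lines[i:]]
    return Raster("ascii-grid", int(header["ncols"]), int(header["nrows"]), cells)


def read_pgm(text):
    """Toy reader for plain-text PGM (magic number P2); comments unsupported."""
    tokens = text.split()
    if not tokens or tokens[0] != "P2":
        return None  # not this format
    width, height = int(tokens[1]), int(tokens[2])
    values = [float(v) for v in tokens[4:]]  # tokens[3] is the maximum gray value
    cells = [values[r * width:(r + 1) * width] for r in range(height)]
    return Raster("pgm", width, height, cells)


DRIVERS = [read_ascii_grid, read_pgm]


def open_raster(text):
    """Try each registered driver until one recognises the data."""
    for driver in DRIVERS:
        raster = driver(text)
        if raster is not None:
            return raster
    raise ValueError("no driver recognises this data")


GRID_TEXT = ("ncols 3\nnrows 2\nxllcorner 0\nyllcorner 0\n"
             "cellsize 10\nNODATA_value -9999\n1 2 3\n4 5 6\n")
PGM_TEXT = "P2\n3 2\n255\n1 2 3\n4 5 6\n"

grid = open_raster(GRID_TEXT)
pgm = open_raster(PGM_TEXT)
print(grid.driver, pgm.driver, grid.cells == pgm.cells)
```

Adding support for another format then means appending one reader to `DRIVERS`, while code that consumes `Raster` objects stays unchanged.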
    21. GIS Data formats and support question GDAL Development: Raster formats
      Direct funding:
      • Atlantis (ENVISAT, MFF, HKV Blobs)
      • eCognition Germany (FUJI BAS Format)
      • Los Alamos Nat. Labs (FITS)
      • OPeNDAP Inc. (OPeNDAP/DODS)
      • PeopleSoft (ERDAS LAN)
      • Safe Software (USGS SDTS, ISO8211 support)
      • Yukon Department of Environment (USGS DEM)
      Public formats/Open documents/Reverse engineered:
      • ERDAS Imagine (IMG)
      • ERMAPPER (ECW)
      • ESRI formats (ArcGrid)
      • GDAL Virtual Format
      • JasPer (JPEG2000); Kakadu (GeoJP2 interface for JPEG2000 = ISO/IEC 15444-1)
      • LizardTech (MrSID, JPEG2000)
      • NOAA (AVHRR data)
    22. GIS Data formats and support question OGR Development: Vector formats
      Direct funding:
      • DM Solutions Group and GoMOOS (SQLite RDBMS, Comma Sep. Values CSV)
      • OPeNDAP Inc. (OPeNDAP/DODS)
      • Safe Software (FMEObjects)
      • SRC, LLC (Oracle Spatial)
      Public formats/Open documents/Reverse engineered:
      • ESRI (SHAPE, ArcCoverage)
      • GML
      • IHO S-57
      • MapInfo (TAB and MIF/MID)
      • Microsoft (ODBC OGR)
      • Microstation (DGN)
      • MySQL (non-spatial data) OGR
      • OGDI Vectors (VMAP)
      • OGR Virtual Format
      • PostgreSQL/PostGIS
      • SDTS
      • UK Ordnance Survey (NTF)
      • U.S. Census (TIGER)
    23. GIS formats Why so many formats? No big problem! Application-specific requirements, which partially contradict each other:
      • high compression rate
      • small runtime storage requirements
      • coding without information loss
      • fast decoding
      • easy access to pixels
      • simple algorithm
      • hardware/CPU independence
      “Good software” can handle numerous formats.
      • Software patents and rights of third parties: future traps?!
    24. GIS formats and Software Patents How software patents affect GIS users
      LZW (Lempel-Ziv-Welch) compression
      • Used in many raster formats (e.g. GIF)
      • Integrated into GRASS before it became patent-encumbered, later replaced by Zlib Deflate
      • Unisys started to charge for usage after waiting some years
      MrSID (Multi-resolution Seamless Image Database)
      • wavelet based image file format
      • three patents covering both the image compression and on the fly image decompression technology
      • GDAL supports MrSID but requires a MrSID SDK license
      ECW (ERMAPPER Compressed Wavelets)
      • Patent pending
      • GPL released source code available (of patented code?)
      JPEG 2000
      • Situation not very clear
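The slide notes that GRASS replaced LZW with Zlib Deflate. The sketch below shows Deflate's lossless round trip through Python's standard `zlib` module on synthetic raster-like bytes. The sample data is made up for illustration and says nothing about how GRASS itself calls the library.

```python
import zlib

# Synthetic single-band raster data: long runs of equal values compress well.
raster = bytes([0] * 500 + [255] * 500 + list(range(24)))

compressed = zlib.compress(raster, 9)   # Deflate at the highest compression level
restored = zlib.decompress(compressed)  # exact reconstruction of the input

print(len(raster), "bytes ->", len(compressed), "bytes")
print("lossless:", restored == raster)
```

The round trip returns exactly the original bytes, which matches the "coding without information loss" requirement listed among the format trade-offs.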
    25. Summary
      • The personal choice of application software/operating system should not affect the data exchange
      • Longitudinal and transversal interoperability must be guaranteed
      • Only documented formats may be used
      • There is no excuse: start to use interoperable formats today
      • GIS interoperability is in a better state than Office document interoperability
      • Interoperability awareness needs to be promoted, today and in the future
    26. License of this document Document home: http://mpa.itc.it/gfoss04/neteler_gfoss04_interoperability2005.pdf This work is licensed under a Creative Commons License. http://creativecommons.org/licenses/by-sa/2.0/deed.en “Free GIS and Interoperability”, © 2004-2005 Markus Neteler [OpenOffice SXI file available upon request: neteler at itc it, neteler at osgeo org] License details: Attribution-ShareAlike 2.0 You are free:
        • to copy, distribute, display, and perform the work
        • to make derivative works
        • to make commercial use of the work
      Under the following conditions:
        • Attribution. You must give the original author credit.
        • Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under a license identical to this one.
      For any reuse or distribution, you must make clear to others the license terms of this work. Any of these conditions can be waived if you get permission from the copyright holder. Your fair use and other rights are in no way affected by the above.
