New Technology Presentation for the School of Information
Universal File Format
Converter AKA the NCSA
By: Catherine Bell, Stacy Hays and
What is Polyglot?
Polyglot is an attempt to create a universal file format
converter, developed at the National Center for Supercomputing
Applications (NCSA) at the University of Illinois at Urbana-Champaign
The National Archives and Records Administration (NARA)
is sponsoring the development of the NCSA Polyglot
Definition of Polyglot - One who speaks many languages
Why make the NCSA Polyglot?
● There are hundreds of thousands of file formats in the world, most of which are
not transferable between software
● There isn’t any way to convert most file extensions
● Most discrepancies in file formats are the result of proprietary software
companies competing against each other to increase their user base
● Not only does the lack of compatibility between file formats make it hard to
share information, but it also makes the task of preserving born-digital
materials increasingly difficult as proprietary software constantly develops and
● File incompatibility is most evident in 3D file formats
● Polyglot has focused on finding ways to convert between 3D file formats, as
they provide the most complications
Why are 3D files so Complicated?
● There are over 140 types of 3D file
● Most 3D viewers are manufactured by
the proprietary software companies that
create the file format
● Different types of file formats support
different kinds of 3D content
● Extreme amounts of data loss can occur
when converting 3D files between
● 3D objects point to a need for a
universal file format converter that
produces the least amount of data loss
to ensure preservation quality
Towards a Universal File Format Converter
● Polyglot analyzes and automates the import/export
features of third party software
● Creates a weighted I/O graph from information about
the software available on multiple servers
● Uses a quantifiable measure of the data loss that
occurs during conversions to calculate the conversion
path with the best possible quality
● Submits scripts using Java to the servers to do the conversion
● Uses third party applications in the conversion process
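The path calculation described in the bullets above can be sketched as a shortest-path search over a weighted graph. This is only an illustration, not Polyglot's actual code: the application names (AppA, AppB, AppC), the loss values, and the assumption that losses simply add up along a path are all made up for the example.

```python
import heapq

# Hypothetical I/O graph edges: (source format, target format, application, loss).
# Applications and loss values are invented for illustration only.
EDGES = [
    ("obj", "stl", "AppA", 0.30),
    ("obj", "ply", "AppB", 0.05),
    ("ply", "stl", "AppC", 0.10),
    ("stl", "x3d", "AppA", 0.20),
]


def build_graph(edges):
    """Map each source format to its outgoing (target, application, loss) edges."""
    graph = {}
    for src, dst, app, loss in edges:
        graph.setdefault(src, []).append((dst, app, loss))
    return graph


def best_path(graph, source, target):
    """Dijkstra's algorithm over the conversion graph.

    Returns (total_loss, steps), where steps is a list of
    (application, resulting format) pairs, or None if no path exists.
    Assumes losses are non-negative and additive along a path.
    """
    queue = [(0.0, source, [])]
    settled = set()
    while queue:
        loss, fmt, steps = heapq.heappop(queue)
        if fmt == target:
            return loss, steps
        if fmt in settled:
            continue
        settled.add(fmt)
        for nxt, app, edge_loss in graph.get(fmt, []):
            if nxt not in settled:
                heapq.heappush(queue, (loss + edge_loss, nxt, steps + [(app, nxt)]))
    return None


if __name__ == "__main__":
    graph = build_graph(EDGES)
    print(best_path(graph, "obj", "stl"))
```

With these made-up weights, the two-step route through "ply" is preferred over the direct "obj" to "stl" conversion because its summed loss is lower. A real system would need a loss measure appropriate to the content being converted.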
So, How does Polyglot Work?
Here is a video demonstrating how Polyglot works:
1. Download Polyglot onto institutional servers
2. Polyglot can then take advantage of all of the software
contained on the servers to make conversions
3. Can be used through either the desktop or the web-based version
4. Once Polyglot is set up, you can drag and drop files to
NOTE: Anyone can test
Polyglot through the
Problems, Problems Indeed.
Functional File Converters
There are several other file converters out there, but they are usually file-type
specific. Many of these converters are proprietary. There is nothing close
to a universal converter.
Examples of file format converters:
Converts Anything? Not quite.
● The same people designed the web-based Conversion Software Registry (CSR)
for collecting information about software that is capable of file format
● Motivated by a community need for finding file format conversions
● Create a login and add software to the registry
● Currently has over 2000 software packages registered
● Over 260,000 possible conversions
● Contains I/O graph to see best possible conversions
● Also searchable by conversion and software
Conversion Software Registry
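As a rough illustration of how a registry like this can be searched by software or by conversion, here is a minimal sketch. It is not the CSR's actual data model or API: the entries are hypothetical, and it assumes each application can export any format it imports to any format it exports.

```python
# Hypothetical registry: software name -> (input formats, output formats).
# Names and formats are invented; the real CSR schema is not shown in these slides.
REGISTRY = {
    "AppA": ({"obj", "stl"}, {"stl", "x3d"}),
    "AppB": ({"obj"}, {"ply"}),
    "AppC": ({"ply"}, {"stl", "obj"}),
}


def conversions_of(registry, software):
    """Search by software: list the (source, target) pairs it can perform."""
    inputs, outputs = registry[software]
    return sorted((i, o) for i in inputs for o in outputs if i != o)


def software_for(registry, source, target):
    """Search by conversion: list software that reads source and writes target."""
    return sorted(
        name
        for name, (inputs, outputs) in registry.items()
        if source in inputs and target in outputs
    )


def count_direct_conversions(registry):
    """Count distinct one-step (source, target) pairs across all software."""
    pairs = set()
    for name in registry:
        pairs.update(conversions_of(registry, name))
    return len(pairs)


if __name__ == "__main__":
    print(conversions_of(REGISTRY, "AppA"))
    print(software_for(REGISTRY, "ply", "stl"))
    print(count_direct_conversions(REGISTRY))
```

The count of possible conversions grows quickly because every registered application contributes a pair for each combination of its input and output formats.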