Open@Fao presentation at the EADI Open For Development Project, 2012

Open For Development
EADI IMWG Conference 2012

Open @ FAO

Stephen.Katz@ fao.org (Twitter: @SteveK1958)
Chief, Knowledge Management and Library Services
Food and Agriculture Organization of the United Nations

Agenda
Open @ FAO

1 Context and History of Open @ FAO

Ongoing Practical Initiatives
2 • FAO Open Archive
• Open Data (data.fao.org)
• Data Governance and Standards

3
Issues, Challenges and Lessons Learned

4 Group Discussion

Open @ FAO : Food For Thought?

Food for Thought

(FAO)

• FAO is a specialized
agency of the United
Nations with its own
independent governance

• 190+ Member Countries

• HQs in Rome, Offices in
over 80 countries with
over 5000 staff.

(FAO)

• Collects, analyses, interprets and
disseminates information on
nutrition, food and agriculture

• Policy Advice

• Furnishes Technical Assistance

• A Neutral Forum for International
Cooperation

FAO has been in the
“knowledge” business
since 1946!

Our mandate....
Ensure that the world’s
knowledge of food and
agriculture is available to
those who need it when
they need it and in a form
which they can access
and use.

Open @ FAO : A Bit of History

 1995 – Central Publishing Unit Abolished
 1996 – SGML Repository Proposal; FAOSTAT on-line
 1997 – Document Repository (XML Compatible)
 2003 – Document Repository (PDF)
 2007 – Open Archive Proposal (Fedora Commons)
 2010 – Open Data Repository Proposal (data.fao.org)
 2012 – OpenArchive.Fao.Org; Data.Fao.Org

FAO Open Archive
Goals/Objectives

 To make FAO’s Global Public Goods openly
accessible from a single access point
 To be able to exchange data in an open and
standardized way
 To have a smooth/efficient workflow to
manage FAO’s Institutional memory
 To integrate e-publishing and library workflows

FAO Open Archive
Architecture
 Based on Open Source tools (Fedora
Commons and Java)
 Based on modern standards for data
management (MODS and FRBR)
 Allowing for easier management and sharing
of multilingual content

And this is what it looks like:

Open Archive Resources
Available at Start-up Time
Resource Type Number of Records
Full Text Documents 40,100

Photos and Videos 17,100

Audio Files 1,200

Open Data (data.fao.org)
Goals/Objectives
 To address fragmentation and duplication of
information systems and data presently distributed
across many organizational units
 http://data.fao.org: one-stop shop that aggregates,
integrates, and catalogues data from multiple sources
across FAO. Topics are related to nutrition, food and
agriculture and include statistics, maps, pictures,
documents and more.

Open Data (data.fao.org)
Guiding Principles
 Uniting FAO data with one brand : http://data.fao.org
 Engaging a Community : #FAOdata
 Mobile First
 Serve the data in the most convenient format
 Integrate, don't reimplement

data.fao.org - The Big Picture

Specialised
Website Services and Widgets application(s)
consume/provide

Orchestration and Integration

Search Catalogue Statistics Maps Content Infrastructure
Statistical Data
Full text Identity Warehouse Geospatial Documents Logging
Structured Metadata Raster Pictures Caching
Linked Data Time Series Vector Video Security
... Indicators Point Multimedia Audit
Observations Pages ...

Data Flow Architecture
Data
Data Source
Source

Ingest
Harmonise
Data
Integrate
Source
Enrich

Publish
Data
Source Data
Source

Some Numbers
356,000,000 Statistical values
2 Terabytes

1,500,000 Statistical Maps

734,000 Geo Layers
30 Terabytes

435 Documents

90 Pictures

25 Information Systems

Simple and few technologies 
 Opscode –Chef, Red Hat- RHQ, Nagios- Enterprises Nagios,
Jenkins-CI – Opensource – Jenkins, Apache Maven, Apache –
SVN, Atlassian – JIRA, Apache – Jmeter, Sourceforge Opensource –
Junit, Oracle- Java, Liferay - Liferay Portal, PostgreSQL – PostgreSQL
PostGIS, Pgbouncer – pgbouncer, Pentaho - Pentaho BI, Analytical
Labs – Saiku, Talend - Talend Open Studio, RedHat/Jboss -
Application Server, Vmware - Server Virtualization,
 Exceliance – HAProxy, RedHat- Enterprise Linux Server, GeoNetwork
OpenSource – GeoNetwork, OpenGeo, GeoSolutions, Fedora
Commons, Refractions Research – GeoServer, Ontotext - OWLIM,
RedHat/Jboss - JBoss AS / JEE, Alfresco - Activiti BPM Platform, jasig -
CAS Client, Batchwork Software - Doc2Doc, Google Code Opensource -
ZXing ("Zebra Crossing"), Highsoft Solutions AS – highcharts, Twitter –
Bootstrap, Liferay - Alloy-UI, JQuery - JQuery-ui, Geert de Deckere -
Graph Up and the there are all the SaaS products ….

Further questions on data.fao.org to the Project Manager: Karl.Morteo@fao.org

CIARD – a global movement
• All organizations that create and
 To make
possess public agricultural
agricultural research information disseminate
research and share it more widely
information • CIARD partners create coherence
and by a) coordinating their efforts, b)
knowledge promoting common formats, c)
truly acessible adopting open systems and
to all standards
• Create a global network of public
collections of data and information

Distributed Data Sets
• stats
• gene banks
• gis data
• blogs,
• journals
• open archives
• raw data
• technologies
• learning objects
• ………..

How to make value added services?
How to infer new knowledge?
How to organize collaboration?
Maybe we really need this?...

…to
• stats
• gene banks
• gis data
• blogs,
• journals
• open archives
• raw data
• technologies
• learning objects
• ………..

OpenAgris

 Aggregates different data sources to expand
knowledge about a topic
 Is a “linked-data” environment mashing-up
interlinked datasets to create an integrated
knowledge base
 OpenAgris uses the Agrovoc thesaurus as
backbone to interlink to other existing
datasets (DBPedia, WorldBank, Geopolitical
Ontology…)

Open Archive : Issues, Challenges, Lessons

 Unclear Policy Framework
 Unclear collection selection policy
 Variable quality standards (content, legal, editorial,
accountability)
 Licensing policy/conditions for re-use
 Working with partners and scientific journals
 Freely available but need attribution
 Supply vs demand (personal interest vs impact)
 Tension with Sales and Marketing needs

May Lead To Negative Consequences such as:
 Low credibility/trust, reputational risk, legal exposure?

Open Data : Issues, Challenges, Lessons

Well the same stuff as before really 
 Unclear Policy Framework
 Unclear collection selection policy
 Variable quality standards
 Licensing policy/conditions for re-use
 Working with partners
 Freely available but need attribution
 Supply vs demand (personal interest vs impact)
 Tension with Sales and Marketing needs


Open Data : Issues, Challenges, Lessons

But also:
 Every data-type has it’s own standards (e.g. OGC for GIS,
SDMX for stats, MODS for documents, IPTC for Photos)
 Aggregate data quality set by lowest common denominator
 Poor data governance leads to:
 Conflicting/contradictory data values from different sources
 Lack of agreement of definitions and concepts, and
 Insufficient metadata
 Comparing apples, pears and oranges (different units, different
assumptions, different contexts)

Thank you!

Time for Discussion

and soon for Lunch!

Open@Fao presentation at the EADI Open For Development Project, 2012

Recommended

Recommended

More Related Content

What's hot

What's hot (7)

Similar to Open@Fao presentation at the EADI Open For Development Project, 2012

Similar to Open@Fao presentation at the EADI Open For Development Project, 2012 (20)

Recently uploaded

Recently uploaded (20)

Open@Fao presentation at the EADI Open For Development Project, 2012

Editor's Notes