India Seminar
  • As we all know, the earth is a sphere. So why is it that maps of the globe generally depict Australia at the bottom? Well, the cartographers that we generally know of all came from Europe, and we were the antipodeans. But the isolation caused by the geographic position of Australia has been of advantage to us, in making us think laterally to overcome the tyranny and challenges of distance. Fifteen years ago, the scientific sector considered it critical to have mirror sites of the major European and North American databases in Australia. This has been achieved, and today we use the services with little thought to where the information is based. Our profession of librarianship and information services does not include the technical aspect of ensuring mirror sites to resources; however, the issues of custodianship, longevity of access to the resources, and accuracy of information retrieval are paramount. This seminar today will concentrate on this last point – accuracy of information retrieval.
  • Australia sees itself as very much connected with Asia and the Subcontinent.
  • Australia is an independent Western democracy with a population of more than 21 million. It is one of the world’s most urbanised countries, with about 70 per cent of the population living in the 10 largest cities. Most of the population is concentrated along the eastern seaboard and the south-eastern corner of the continent. Australia’s lifestyle reflects its mainly Western origins ...
  • The six states of Australia are New South Wales (NSW), Queensland (Qld), South Australia (SA), Tasmania (Tas), Victoria (Vic) and Western Australia (WA). Each state has its own state Constitution, which divides the state's government into the same 3 divisions of legislature (parliamentary acts and regulations), executive (members of parliament) and judiciary. The Commonwealth Government works on the same basis. There are 10 Australian territories outside the borders of the states: two mainland territories, the Australian Capital Territory (ACT) and the Northern Territory (NT), and 7 other territories, all of which are islands except for the Antarctic Territory. In these territories, a range of governmental matters are now handled by a locally elected parliament. Point out Sydney, Canberra and Melbourne.
  • A country's coat of arms is traditionally a way of depicting the values of a nation. In the Australian coat of arms the kangaroo and emu are the native animals that hold the shield with pride. Some say the kangaroo and emu were chosen to symbolise a nation moving forward. This is based on the common belief that neither animal can move backwards easily. A gold Commonwealth Star sits above the shield. Six of the star’s points represent the Australian states. The seventh point represents the territories. A wreath of gold and blue sits under the Commonwealth Star. Gold and blue are the Commonwealth Coat of Arms’ livery or identifying colours. Australia’s floral emblem, the golden wattle, frames the shield and supporters. A scroll contains the word ‘Australia’.
  • Canberra is nearly 300 km from Sydney and some 650 km from Melbourne – similar to the distance between Pune and Mumbai. A planned city, it is laid out around an artificial lake. The Territory became self-governing in 1989. National government remains its main industry, but private sector employment has expanded and includes production of sophisticated scientific and communications equipment, and computer software. Canberra is a graceful and industrious city surrounded by a mostly untouched natural environment. ACT demographics: the population of the Australian Capital Territory at the last national Census was over 311,000 people – a tenth the size of Pune.
  • Canberra is often referred to as the Bush Capital. This view of the National Library, with the Parliament House flag in the background, shows the greenery that helps to give Canberra this name. Lake Burley Griffin divides central Canberra, with the city centre, called "Civic", on the north side and the parliamentary and embassy area on the south side. National institutions are spread across both the north and south sides of the lake. http://www.canberraeguide.com/orientation.php
  • An image of Parliament House, Canberra.
  • This is the Indian High Commission in Canberra. It is situated very close to our Parliament House. On 15 October 2009, The Hon. Stephen Smith MP, Australian Minister for Foreign Affairs, spoke to the Asia Society in Mumbai. His topic was "Australia and India: Convergence of interests". I will now quote to you from that speech. India is once again assuming the mantle of global influence reflecting its economic size and its economic strength, its strategic weight, and its rich history. Australia recognises that global economic, strategic and political influence is shifting to the Asia-Pacific. It is clear that India is a major part of this transformation and that it will help to shape the global structures which will determine the future of our world. India’s rise is not only based on its population or its economic strength. Prime Minister Singh encapsulated India’s potential elegantly, when he said: “I am convinced that the 21st century will be an Indian Century. The world will once again look at us with regard and respect, not just for the economic progress we make but for the democratic values we cherish and uphold and the principles of pluralism and inclusiveness we have come to represent which is India’s heritage”. Australia shares these values and these virtues. We also share India’s wish to play a constructive role in world affairs. These values and these aspirations lie at the heart of the new relationship between our two nations.
  • Read slide. And so this brings us to our basic topic for today, metadata for resource discovery and
  • In 2006 the Guardian newspaper in the UK established a blog named Free Our Data. It is the Guardian Technology campaign for free public access to data about the UK and its citizens. Recently Sir Tim Berners-Lee and Mr Shadbolt presented an update to Cabinet on their work advising the Government on how to make data more accessible to the public.
  • On 25 January 2010, the government released its new beta website (http://data.gov.uk/). There is now a considerable difference between the amount of public sector information available in the UK and USA and that available in other nations! Australia is following the UK in this model.
  • Metadata is structured data about other data, describing the content, quality, type, creation and spatial information of a resource. Its purpose is to enable resources to be accessed and re-used where and when required; the metadata itself may not be visible in the presentation of that information to the user. It may be described as an electronic equivalent of a library catalogue card. Metadata can also describe offline electronic resources such as data stored on CD or tape media, or physical resources such as paper or maps.
  • Resource discovery metadata is information in a structured format describing a resource or a collection of resources. Metadata not only helps find resources and data, but also indicates how to interpret and use the data. Publishing metadata facilitates data sharing between organizations and stimulates cooperation and a coordinated, integrated approach to policy issues. GIS metadata has a spatial component, such as the extent of the earth's surface the data covers. Metadata can describe GIS data, a GIS Web service, or an online metadata catalogue. Consistently described online resources maximise opportunities for users to find the most relevant and comprehensive set of resources for their purposes, whether the resource be text, a still image, a moving image or a sound file, or a combination of these. Metadata can also be used to organise, store and retrieve items for information management purposes. Metadata is the key to the semantic web.
  • There is only time today to give a brief overview of these projects, from the perspective of the work with which I was involved. My role within these projects varied. The Business Definitions Registry involved a team of 12. Initially I was contracted as one of the team of Definitions Researchers; this role was later expanded as my contracts were extended. However, at the Attorney-General's Department and Land and Water Australia, I managed the project, reporting to relevant management personnel, who in turn reported to governing committee members.
  • This is the Centrelink homepage
  • The Commonwealth Services Delivery Agency (CSDA), known as Centrelink, was created by the Federal Government on 1 July 1997 to deliver a range of services to the Australian public on behalf of client government departments. It is a statutory agency which until recently was accountable to the Minister for Family and Community Services and the Federal Parliament through its Board. In October 2004 Centrelink became part of the new Department of Human Services along with five other Commonwealth service delivery agencies. Centrelink is a large organisation, being in the top one hundred of Australian companies in terms of size and turnover. It has Australia's largest single purpose call centre network and the world's ninth largest information technology (IT) network, employs more than 27,000 staff over 450 sites, distributes approximately $55 billion in social security and other payments on behalf of client agencies, and has a recurrent budget of $1.6 billion. [DON'T READ] On an annual basis, Centrelink: has 6.3 million customers, approximately one-third of the Australian population; administers more than 140 different products and services for 25 government agencies; processes 9.6 million individual entitlements; grants 3 million new claims; sends more than 94 million letters to customers; receives more than 25.5 million telephone calls; and receives 27.7 million web site views each year. [END DON'T READ] Additionally, Centrelink: has more than 1000 service delivery points ranging from large Customer Service Centres to small visiting services; supports 14 million electronic customer records and 12 million electronic customer transactions on an average day; and provides personalised services in over 50 Indigenous and 100 international languages. Centrelink’s business is derived from partnerships with 25 government agencies. The services are delivered on a purchaser/provider basis, so close relationships with the agencies are crucial, as is the requirement to have a common understanding of terminology used in business. Major agencies in partnership with Centrelink include the Departments of Employment and Workplace Relations; Family and Community Services; Education, Science and Training; Veterans' Affairs; and Health and Ageing.
  • Estimates of government expenditure are referred to Senate committees as part of the annual budget cycle. This opportunity to examine the operations of government plays a key role in the parliamentary scrutiny of the executive.
  • We are now coming to the next hard part.
  • The adoption of ISO 11179 by Centrelink was to enable the organisation to not only internally manage definitions in a controlled consistent manner, but also to provide Centrelink with a framework facilitating the sharing of information across the three sectors of government, community and business organisations within Australia and internationally. ISO 11179 has already been adopted by the Australian Bureau of Statistics, the United States government, as well as several European nations. The use of ISO 11179 enables the exchange of information with these external agencies on an international basis as it adopts the recognised international standard for communicating data definitions. It was seen as having the potential to make Centrelink an international leader with regard to definitions in the area of community and social welfare services
  • Before we go further, we need to define what we are speaking about. All here will be familiar with the term IM, Information Management. However, you may not have heard of MI, or Management Information. Basically, MI regards information from the business management perspective. It includes those regular reports, often drawn from the financial or human resources databases.
  • We have a saying in Australia: when something is beyond our immediate understanding, your reaction is that of a stunned mullet, usually with its mouth left gaping open! This is how most people react to this particular standard!
  • A business definition is a formalised conceptual description of those things of interest to the organisation, where formal control of their definitions will be of value to the organisation for business information and performance reporting. Business definitions relate to information concepts, business services, business processes or other aspects of the organisation that need to be formally defined. It is essential for any organisation to have an enterprise-wide common understanding of the language used in its business operations. This is just as important in the sphere where data is used to report on business performance as it is in the sphere of high level strategic documentation. At all levels of business it is essential that the meaning of business language is understood in the same way by all those who need to use it. Many organisations, especially those as large and diverse as Centrelink, are beginning to realise that enormous benefits can be gained by managing their business language in a more structured and definitive way. A key aspect of doing this involves the organisation coming to agreement and committing to using the same terms in the same way and to associating agreed meanings to those terms. Centrelink’s language is certainly diverse and dynamic, and it is a vital element in the organisation’s business processes. A shared or common language can reduce the amount of time spent on duplicating and reinventing processes by making an organisation’s existing intellectual capital resources more visible and accessible. It can also be an effective support for reliable and useful management information. For instance, if the organisation needs to report on its business performance in delivering services, it needs to have a clear understanding of the services it is reporting on, the business rules surrounding the data about those services, and the context in which the services are provided and the data collected.
  • Objectives. The objectives of this project are to: 1. Establish a metadata registry and associated process: establish the capability and a process to control the definition and administration of business definitions and to facilitate data exchange by operationalising a recognised international standard for metadata registries. This will support the organisation to work collaboratively and allow understanding, definition and management of MI needs and greater capacity to re-engineer Centrelink business; and 2. Establish data lineage: improve the quality of information and provide data transparency by enabling feedback loops to data custodians so that action can be undertaken when information quality issues, including problems with ineffective information categorisations, source data inadequacies and misalignment with user needs, are identified.
  • In contrast, the thesaurus stores and defines terms needed to locate information about Centrelink's business; uses words to describe relationships between words and hierarchy; and captures the language of the business to enable resource discovery. The Business Definitions Registry stores and defines concepts of interest to Centrelink's business. [DON'T READ] The Registry covers the concepts used specifically in the business of Centrelink, especially terms that relate to business information concepts, business services, business processes or other aspects of the business that need to be formally defined, for example to allow accurate and consistent management and performance reporting. The Registry has the capability to put concepts in context, provide purpose statements, usage statements, business rules and source authority, and identify business custodians.
  • The Registry has the capability to put concepts in context, provide purpose statements, usage statements, business rules, source authority and identify business custodians
  • This example defines the term Age Pension. Definition text: Age Pension: An income support payment for people who have reached age pension age who are not able to fully provide for themselves in their retirement. Source Document (gives the Act and the section within the Act): Guide to Social Security Law 3.4.1.10 Qualification for Age Pension; Social Security Act s43(1) Qualification for age pension. Source Authority (states where the term has been used): Department of Family and Community Services; Social Security Act 1991. Source Use (how it is used in Centrelink's business): Qualification for Age Pension includes having 10 years' qualifying residence.
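To make the shape of such an entry concrete, here is a minimal Python sketch of the Age Pension example held as a record. The class and field names are illustrative assumptions based on the headings on the slide; they are not taken from ISO 11179 or from Centrelink's actual Registry.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RegistryEntry:
    """One business definition. Field names are hypothetical, mirroring the slide headings."""
    term: str
    definition_text: str
    source_documents: List[str] = field(default_factory=list)
    source_authorities: List[str] = field(default_factory=list)
    source_use: str = ""

    def missing_fields(self) -> List[str]:
        """Return the names of fields that are still empty."""
        return [name for name, value in vars(self).items() if not value]


# Values copied from the Age Pension example above.
age_pension = RegistryEntry(
    term="Age Pension",
    definition_text=(
        "An income support payment for people who have reached age pension age "
        "who are not able to fully provide for themselves in their retirement."
    ),
    source_documents=[
        "Guide to Social Security Law 3.4.1.10 Qualification for Age Pension",
        "Social Security Act s43(1) Qualification for age pension",
    ],
    source_authorities=[
        "Department of Family and Community Services",
        "Social Security Act 1991",
    ],
    source_use="Qualification for Age Pension includes having 10 years' qualifying residence",
)

print(age_pension.missing_fields())
```

A completeness check of this kind is one simple way a researcher could see which parts of a definition still need to be confirmed with the business area.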
  • As noted earlier, within the project my role was as a Definitions Researcher. Our work entailed liaising with the various business areas that used the terms to clarify what they meant. Each business area would nominate its liaison person, i.e. its Subject Matter Expert. The definitions were checked against the terms used in Commonwealth Acts and regulations. Any inconsistencies were then discussed with the various business areas so that an accepted definition was reached. As you can see, this is quite a complex procedure, but fascinating work. Often researching one term would take at least an hour, and sometimes several hours. After some months, my role changed and I went on to advise on and assist in the development of workflows and procedures for the Research Definitions team; wrote technical and policy documents for the project; and then developed template documents and technical procedure statements based on ISO 11179 to populate the database. Finally, as the project prepared to go into production, I wrote and delivered training modules for the newly appointed permanent staff.
  • Centrelink’s Purpose is: Serving Australia by assisting people to become self-sufficient and supporting those in need. Centrelink provides services on behalf of more than 25 organisations. Centrelink delivers information, payments and services detailed in Business Partnership Agreements or similar arrangements. Centrelink has Business Partnership Agreements in place with the following Policy Departments: • Australian Government Department of Families, Housing, Community Services and Indigenous Affairs (FaHCSIA); • Australian Government Department of Education, Employment and Workplace Relations (DEEWR); • Australian Government Department of Agriculture, Fisheries and Forestry (DAFF); and • Australian Government Department of Health and Ageing (DoHA). Centrelink acts in partnership with other levels of government and the broader Australian community and distributes payments to Australian families, communities and individuals. These payments include income support and family assistance payments and payments under a range of rural assistance measures. The outcomes relate to Government welfare priorities.
  • In contrast, this is the typical thesaurus structure. A typical thesaurus structure will include: SN (Scope Note), UF (Use For), BT (Broader Term), NT (Narrower Term) and RT (Related Term).
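As a rough illustration of that structure, the following Python sketch holds one thesaurus term with its SN, UF, BT, NT and RT relationships and prints it in the conventional tagged layout. The class name and the sample relationships are invented for illustration and do not come from the Centrelink thesaurus.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ThesaurusTerm:
    preferred: str
    scope_note: str = ""                                # SN
    use_for: List[str] = field(default_factory=list)    # UF: non-preferred terms
    broader: List[str] = field(default_factory=list)    # BT
    narrower: List[str] = field(default_factory=list)   # NT
    related: List[str] = field(default_factory=list)    # RT


def format_term(term: ThesaurusTerm) -> str:
    """Render a term as a tagged block, one relationship per line."""
    lines = [term.preferred]
    if term.scope_note:
        lines.append(f"  SN {term.scope_note}")
    for tag, values in (("UF", term.use_for), ("BT", term.broader),
                        ("NT", term.narrower), ("RT", term.related)):
        lines.extend(f"  {tag} {value}" for value in values)
    return "\n".join(lines)


# Hypothetical entry; the relationships are made up for this sketch.
example = ThesaurusTerm(
    preferred="Age Pension",
    scope_note="Income support payment for people of age pension age",
    use_for=["Old age pension"],
    broader=["Income support payments"],
    related=["Retirement"],
)
print(format_term(example))
```

Where the Registry record carries definitions, sources and custodians, a thesaurus entry carries only relationships between words.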
  • How many here are familiar with metadata standards such as Dublin Core?
  • [DON'T READ] The National Spatial and Information Management (NSIM) working group is the peak body for AusDIN. NSIM was formed through the merger of the National Spatial Information for National Security (NSINS) and the National Information Management Advisory Group (NIMAG) in 2006. The AusDIN Portal 2008 – 2011 Strategic Plan is governed by the NSIM Information Management Strategic Plan. [END DON'T READ] This second project to be discussed is related to the AusDIN Portal. AusDIN was established to facilitate a whole-of-government approach to Emergency Management across the three tiers of Australian government, that is, local level, state level and the Australian Government. The AusDIN steering committee is made up of an agreed nucleus of representatives from all tiers of government and experts from chosen fields, such as the Country Fire Authority, to address specific issues as required. This committee is able to initiate the development of papers to elevate issues in Emergency Management to the key government decision makers.
  • A number of national emergency management data collections are currently produced by Australian organisations including Emergency Management Australia (EMA), the Australasian Fire and Emergency Service Authorities Council (AFAC), the Council of Ambulance Authorities (CAA), the Steering Committee for the Review of Commonwealth/State Service Provision (SCRCSSP) and Macquarie University’s Natural Hazards Research Centre (NHRC). Additional resources are being continually added to these existing collections, and new projects and data repositories are being discovered. There is also a large number of Australian state and territory emergency management data collections.
  • The AusDIN Portal was developed using the Australian proprietary database software Accesspoint. Accesspoint uses AGLS, an Australian standard based on Dublin Core (DC), for the tags in its database. AGLS was used as the standard for the development of the subject-specific profile, the EM Metadata Application Profile.
  • The purpose of the Profile is to support data harmonisation, thereby enhancing access, sharing, re-use and repurposing of data and information resources and services within the Emergency Management Sector of Australia.
  • The lack of standardisation in the area of Emergency Management information resources poses an increasing number of problems for our ability to search for, find and retrieve relevant or specific resources. For example, relevant information can be missed because of the different types and formats in which it is held, such as images, databases and PDFs. Metadata can identify all these resources, thus ensuring they are retrieved by search engines.
  • This diagram illustrates the interaction between the major standards, profiles and guidelines that influence the Profile. Impetus for the Profile was also provided by the Emergency Management Information Development Plan (EMIDP), Australia 2006, which was published as Australian Bureau of Statistics (ABS) Catalogue No. 1385.0. The main driver of the plan was for the emergency management sector to have access to consistent and comparable information for research and evidence-based decision making, and to assist in coordinating the activities of emergency management agencies by promoting a more unified national body of information within an appropriate conceptual framework.
  • This illustrates the two key standards that were used to develop the Profile. I hope everyone here is familiar with Dublin Core. Yes? Dublin Core has 15 properties. AGLS has extended this number and limited the use of some for the Australian context. ISO 19115 on Geographic Information Metadata is very much more complex and refers to elements rather than properties. ANZLIC is then an Australian Profile of ISO 19115.
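For anyone less familiar with Dublin Core, the sketch below lists the 15 properties of the simple Dublin Core element set and renders a record as HTML meta tags. The "DC." name prefix is a common convention for embedding Dublin Core in HTML and is used here as an assumption; the function name is hypothetical, and the sketch does not reproduce the AGLS syntax or the AGLS extensions.

```python
from html import escape
from typing import Dict, List

# The 15 elements of the simple Dublin Core element set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def dc_meta_tags(record: Dict[str, str]) -> List[str]:
    """Render a record as HTML meta tags, in element-set order."""
    unknown = set(record) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    return [
        f'<meta name="DC.{name}" content="{escape(record[name], quote=True)}">'
        for name in DC_ELEMENTS
        if name in record
    ]


for tag in dc_meta_tags({"title": "India Seminar", "language": "en"}):
    print(tag)
```

Rejecting unknown property names is a deliberate choice in this sketch: it is where a profile such as AGLS would add its own extra properties.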
  • The Emergency Management Metadata Application Profile is designed to facilitate the use of properties for the implementation of database applications by individual organisations and subject specialists within the Emergency Management Sector. Further interpretation will be required by each agency, in consideration of the software system used and the types of resources managed. Each property has both HTML and XHTML syntax and examples; these generally follow the format used by the AGLS Metadata Standard. Section 4 of the Profile provides examples using XML and has been developed to facilitate interoperability with other metadata profiles. The Profile follows the AGLS Metadata Standard (AS 5044-2009). These properties are extended with a sub-set of elements from the ANZLIC Metadata Profile of ISO 19115:2005, to enable interoperability between the two standards. It should be noted that this is not an application profile of ISO 19115. Properties that relate to geospatial information are based on descriptions from ANZLIC. We acknowledge that the Profile cannot be a profile of ANZLIC. However, the Profile
  • Minimal details about these properties are included in The Profile. The ANZLIC Profile should be consulted for any further clarification.
  • The Profile differentiates between resources that have geospatial context or information and those that do not. The four obligation categories follow the AGLS metadata properties guidelines. Mandatory: these properties must be present in all metadata records. Conditional: these properties must be present under certain circumstances. Recommended: there may be valid reasons in particular circumstances not to include these properties, but the full implications must be understood and carefully weighed. Optional: these properties are truly optional.
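The four obligation categories lend themselves to a simple automated check. The Python sketch below is a minimal illustration under stated assumptions: the assignment of properties to categories is invented, and the rule that the conditional property applies only to resources with geospatial context is an assumption for the example, not a statement of what the Profile requires.

```python
from enum import Enum
from typing import Dict, List, Tuple


class Obligation(Enum):
    MANDATORY = "mandatory"
    CONDITIONAL = "conditional"
    RECOMMENDED = "recommended"
    OPTIONAL = "optional"


# Hypothetical assignment of properties to categories, for illustration only.
PROFILE: Dict[str, Obligation] = {
    "title": Obligation.MANDATORY,
    "creator": Obligation.MANDATORY,
    "coverage": Obligation.CONDITIONAL,   # assumed: required when geospatial
    "description": Obligation.RECOMMENDED,
    "rights": Obligation.OPTIONAL,
}


def check_record(record: Dict[str, str],
                 has_geospatial_context: bool) -> Tuple[List[str], List[str]]:
    """Return (errors, warnings): missing required and missing recommended properties."""
    errors: List[str] = []
    warnings: List[str] = []
    for name, obligation in PROFILE.items():
        if record.get(name):
            continue  # property is present and non-empty
        if obligation is Obligation.MANDATORY:
            errors.append(name)
        elif obligation is Obligation.CONDITIONAL and has_geospatial_context:
            errors.append(name)
        elif obligation is Obligation.RECOMMENDED:
            warnings.append(name)
    return errors, warnings


record = {"title": "Flood extent map", "creator": "Example Agency"}
print(check_record(record, has_geospatial_context=True))
```

Missing recommended properties are reported as warnings rather than errors, reflecting the wording that there may be valid reasons to leave them out.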
  • This is an example of a property description within the Profile.
  • READ SCREEN FIRST. While I was working on this project, a new version of ANZLIC was released. A panel of staff from National Archives Australia, Geoscience Australia and OSDM (where I was also contracted at the time, and so was a member of this team) was responsible for reviewing the mapping. This document has now been released.
  • To show the links between the two standards, each property required an introductory note, such as this example for Publisher.
  • There were a number of issues when I began the process of negotiation with ISO 19115 experts. At the time I commenced the project, DC did not include Administrative properties. These are now optional. Also, the DC geospatial metadata properties were very rudimentary and so did not map to ISO 19115. This Emergency Management Metadata Application Profile provides a method of linking between the two standards that was not previously available.
  • As you no doubt are aware, Government departments and agencies can be very fluid – here today and gone tomorrow, depending on the priorities of the elected Government. The Australian Government agency, Land and Water Australia played an important role in providing communication channels between farming communities and researchers. Although the agency was disbanded from the end of last year, many of the activities that were undertaken by this agency have been adopted by other departments and agencies.
  • The project that I was involved with entailed the development of a thesaurus for the Australian Agriculture and Natural Resources On-line database, known as AANRO. AANRO is described as 'an integrated knowledge discovery tool for agriculture and natural resources.' This database is now under the agency Rural Industries Research and Development Corporation. The database is managed by a Committee with approximately 45 members representing both the private and government sectors from all states and territories in Australia. I was employed on an initial part-time 3 month contract. This was later extended for another two weeks for me to complete the initial stages of the thesaurus.
  • As with any project, once I was in the job, further requirements were specified. At a meeting with AANRO representatives a further requirement was requested: to ensure that the AANRO thesaurus is compatible with the digital repository, which is currently being developed by DAFF and AANRO. This requirement imposes three new criteria: interoperability with AGROVOC, the thesaurus to be used in the new repository; interoperability with SKOS; and the thesaurus to be XML format compliant.
  • The project was divided into the following sections. Section 1: Survey of agricultural and related agencies and libraries throughout Australia. Section 2: Research bibliography. Section 3: Review existing thesauri. Section 4: Data analysis of the four thesauri identified as relevant to the set criteria. Section 5: Identification and Business Case for the use and purchase of a thesaurus software application. So my first task was to survey the thesauri used by agricultural and related agencies and libraries throughout Australia. The list of contacts that I compiled consisted of AANRO member organisations and state and Australian Parliamentary libraries. Initial contact was made by phone to ensure that I emailed the survey to the correct person.
  • While I waited for the responses, I began the research that was required, firstly to ensure that I understood the implications of terms such as SKOS. SKOS stands for Simple Knowledge Organization System. As you can see from the URL, it is managed by the W3C. It is an area of work developing specifications and standards to support the use of knowledge organization systems (KOS) such as thesauri, classification schemes, subject heading lists and taxonomies within the framework of the Semantic Web. SKOS provides a standard way ... Read from slide.
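To show roughly what expressing a thesaurus term in SKOS involves, here is a small Python sketch that writes one term as a skos:Concept in RDF/XML using only the standard library. The SKOS and RDF namespace URIs are the published ones; the base URI, the helper names and the sample term are hypothetical. The mapping used (UF to altLabel, SN to scopeNote, BT to broader, NT to narrower, RT to related) is the commonly used correspondence and is offered as an assumption, not as the mapping adopted in the AANRO project.

```python
import xml.etree.ElementTree as ET

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ET.register_namespace("skos", SKOS)
ET.register_namespace("rdf", RDF)

BASE = "http://example.org/thesaurus/"  # hypothetical base URI for concepts


def concept_uri(label: str) -> str:
    """Build an illustrative URI from a term label."""
    return BASE + label.lower().replace(" ", "-")


def to_skos(term: dict) -> ET.Element:
    """Convert one thesaurus term (keys: preferred, SN, UF, BT, NT, RT) to RDF/XML."""
    root = ET.Element(f"{{{RDF}}}RDF")
    concept = ET.SubElement(
        root, f"{{{SKOS}}}Concept",
        {f"{{{RDF}}}about": concept_uri(term["preferred"])},
    )
    ET.SubElement(concept, f"{{{SKOS}}}prefLabel").text = term["preferred"]
    for label in term.get("UF", []):
        ET.SubElement(concept, f"{{{SKOS}}}altLabel").text = label
    if term.get("SN"):
        ET.SubElement(concept, f"{{{SKOS}}}scopeNote").text = term["SN"]
    for tag, prop in (("BT", "broader"), ("NT", "narrower"), ("RT", "related")):
        for label in term.get(tag, []):
            ET.SubElement(
                concept, f"{{{SKOS}}}{prop}",
                {f"{{{RDF}}}resource": concept_uri(label)},
            )
    return root


# Invented sample term, for illustration only.
sample = {"preferred": "Soil erosion", "UF": ["Erosion of soil"], "BT": ["Land degradation"]}
print(ET.tostring(to_skos(sample), encoding="unicode"))
```

Because every relationship points to a URI rather than to a word, terms from different thesauri can be linked to one another, which is what makes SKOS useful for interoperability.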
  • The way that I manage my projects is to document background research by developing a bibliography. This is a mind map representing the bibliography that was developed. A copy of the bibliography will be available from Mrs Meera Gaikaiwari if anyone is interested. I will now break this down a little.
  • The topics covered in this document include: techniques for thesaurus development; standards used with thesaurus development and maintenance, e.g. use of SKOS and XML requirements for thesaurus management; thesaurus applications and their compatibility for use within DAFF and AANRO database environments; and general articles relating to interoperability with thesauri and associated metadata.
  • Section 3. Of the 40 agencies contacted, a total of 29 agency libraries were identified as having relevant taxonomies/thesauri. These were then sent the email survey. This is a mind map of the responses that I received. There were: 4 Australian general-subject thesauri; 8 Australian subject-specific thesauri; 4 thesauri developed overseas and used in Australia, but recognised as being low on Australian terminology; 7 organisations using in-house taxonomies; and 2 other databases examined and out of scope (i.e. they do not use a taxonomy or thesaurus). Two databases were specifically mentioned by management for evaluation. Besides the thesauri that have been identified as useful, there were 7 dictionaries and taxonomies that are too specific to be in scope of the project. However, they are now noted as very useful authority lists. I will now break this down a little for you to see more clearly.
  • These are the main groupings that I identified.
  • These are the Australian library-based thesauri that were examined.
  • These are the international Thesauri that were examined
  • And finally the GIS based projects that were reviewed.
  • Of these, the following four thesauri were identified as being within scope for use with the AANRO Thesaurus. All have been developed using internationally recognised thesaurus standards. 1. Universal Agricultural Language: a thesaurus developed and maintained by Department of Agriculture and Fisheries, Western Australia. This thesaurus includes terms from CABI. 2. Thesaurus, Department of Environment and Resource Management (DERM), Queensland: subject areas were developed using subject thesauri such as GEMET, Aqualine, CABI, GeoRef and other glossaries of terms. 3. Tropical Savannas CRC Terms: this is a subset of the AIATSIS Thesaurus. The terms and their hierarchical structure have been taken from the AIATSIS Thesaurus. 4. Community Access to Natural Resources Index (CANRI): CANRI was developed specifically for use with the NSW Natural Resources Atlas. The developers liaised with international colleagues in EPA and US Geological Survey, who advised the use of GEMET as a basis for CANRI which is based on ISO 191139.
  • The format of review followed that used successfully by the developer of the Victoria Online Thesuarus and used the following headings Thesaurus Name Scope and application Standards followed Structure Language (e.g. this could be natural language, academic, official...) Evaluation
  • Thesaurus/data interoperability issues: The thesaurus currently used in the AANRO database was received as a straight list of words. It is not available in hierarchical order. This had a major impact, as the new thesaurus was to be based on this file. The AANRO thesaurus was originally based on the UAL. A copy of the UAL was finally received on 15 April. This delay affected the full analysis of the data and its inclusion in the final list of thesauri meeting the criteria. It was not available as a text file; it was sent in XML format, which then required data cleansing to convert to txt format. The non-preferred terms for the UAL are only available in PDF. They have therefore not been included in this project. CANRI was initially evaluated as out of scope for the following reasons: it is no longer maintained and, as this project is to be completed within a very short time frame, geospatial terminology was considered out of scope. However, at a meeting on 29 April (days before the end of my contract) it was emphasised that terms from CANRI should be included. Some initial work was done. A later check showed that 70% of CANRI terms are already included within the final thesaurus. The Parliamentary Library, Australian Parliament House, also has two major subject sections relevant to this project. However, again there was great difficulty in obtaining a copy of the thesaurus. Eventually it was obtained in a format that could not readily be transferred to a hierarchical structure, and so the document was considered out of scope for this reason.
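A coverage check like the one mentioned for CANRI can be sketched in a few lines. This is only an illustration and not the method used in the project: the sample term lists are hypothetical, and the normalisation rule (ignore case and extra spaces) is an assumption.

```python
def normalise(term: str) -> str:
    """Lower-case and collapse whitespace so trivial variants compare equal."""
    return " ".join(term.lower().split())


def coverage(candidate_terms, master_terms) -> float:
    """Return the share of candidate terms already present in the master list."""
    candidate = {normalise(t) for t in candidate_terms if t.strip()}
    master = {normalise(t) for t in master_terms if t.strip()}
    if not candidate:
        return 0.0
    return len(candidate & master) / len(candidate)


# Hypothetical sample terms, for illustration only.
canri_sample = ["Soil erosion", "Salinity", "Wetlands", "Land  tenure"]
final_sample = ["soil erosion", "salinity", "land tenure", "Agroforestry"]

share = coverage(canri_sample, final_sample)
print(f"{share:.0%} of the sample terms are already covered")
```

Exact matching after light normalisation will undercount near-synonyms and spelling variants, so a figure produced this way is best read as a lower bound.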
  • Section 5: Identification of an appropriate thesaurus software application; develop a Business Case for the use and purchase of a thesaurus software application. The purpose of acquiring a thesaurus application was to facilitate the development and future management of the AANRO Thesaurus. The project required an application that would allow the import and export of data (the thesaurus terms) in a hierarchical structure, in both txt and xml formats. The thesaurus terms are developed in, or imported into, the application and then exported regularly as updates for inclusion in the AANRO database.
  • Two open source thesaurus applications were recommended by LWA management for me to investigate. However, Tematres was no longer available and Aiksaurus was out of scope as it did not allow xml format. Other librarians whom I had contacted during the project recommended MultiTes. When followed up, this one proved the most successful and was used for the project. It is often used with content management systems as well as library systems.

India Seminar Presentation Transcript

  • Ontologies, Thesauri and metadata standards as resource discovery tools Presented by Jeanette Regan Regan Research Services Canberra, Australia An SLA Asian Chapter Seminar, Held at Persistent Systems Ltd, Pune India 12 February 2010
  • The world from Down Under http://upload.wikimedia.org/wikipedia/commons/c/cf/Worldmap_LandAndPolitical.jpg J Regan 12/2/2010 SLA Seminar Pune, India
  • Format of this seminar
    • Introduction
      • An Australian perspective
      • Australian Government Sector Information Sharing Strategies
    • Metadata
  • Format of this seminar (cont.)
    • III. Discussion of projects in the Australian Government Sector:
      • Development of a Business Definitions Registry, Centrelink
      • Development of the AusDIN Emergency Management Metadata Profile, Attorney General's Department
      • Development of the AANRO Thesaurus, Land and Water Australia
  • Our Region: South East Asia and Australasia
  • Australia In Brief
    • Population: 21,874.9 thousand (about 21.9 million)
    • Is the only nation to govern an entire continent
    • Lies between 10° and 39° South latitude
    • Apart from Antarctica, Australia is the driest continent on earth
  • Australian States and Territories Canberra and Sydney
  • Australian Coat of Arms
  • About Canberra
  • The Bush Capital
  • The Australian Government Parliament House Canberra Author: Noodle snacks (http://www.noodlesnacks.com/)
  • Indian High Commission Canberra Australia and India: A Convergence of Interests http://www.foreignminister.gov.au/speeches/2009/091015_asia_society.html Image: http://www.hcindia-u.org/high_commission_of_india.html
  • National Government Information Sharing Strategy The National Government Information Sharing Strategy offers the opportunity to unlock the engine room of government and improve the delivery of services to the Australian community. http://www.finance.gov.au/publications/national-government-information-sharing-strategy/index.html
  • Goals
    • Make it easy for the public to get access to government services
    • Improve government’s approaches to evidence-based policy and decision-making
    • Assist governments to deliver shared services to the community
    • Strengthen the agility and responsiveness of Australian governments to meet challenging needs
    • Manage government information as a strategic asset, providing more efficient and effective use of it.
  • http://www.freeourdata.org.uk/blog/
  • The Semantic Web And Resource Discovery
  • Resource discovery metadata
    • is information in a structured format describing a resource or a collection of resources.
    • Metadata not only helps find resources and data, but it then indicates how to interpret and use the data.
    • Publishing metadata facilitates data sharing between organizations and stimulates cooperation and a coordinated, integrated approach to policy issues.
    • GIS metadata has a spatial component such as the extent of the earth's surface the data covers.
    • Metadata can describe GIS data, a GIS Web service, or an online metadata catalogue.
  • Note: The images used in this presentation are freely available on the internet unless otherwise noted.
  • Format of this seminar (cont.)
    • III. Discussion of projects in the Australian Government Sector:
      • Development of a Business Definitions Registry, Centrelink
      • Development of the AusDIN Emergency Management Metadata Profile, Attorney General's Department
      • Development of the AANRO Thesaurus, Land and Water Australia
  • About Centrelink
    • Some statistics at the time of this project, 2005:
      • A Commonwealth Services Delivery Agency
      • Top 100 Australian company in terms of size and turnover
      • Australia's largest single purpose call centre
      • World's 9th largest IT network
      • Distributed $55 billion
      • Centrelink delivers information, payments and services detailed in Business Partnership Agreements or similar arrangements
  • About Centrelink (Cont)
    • Centrelink, as a Government agency, has a responsibility to report each year to the Parliament's Estimates Committee as part of the Parliamentary budget cycle.
    • These hearings provide the opportunity to examine the operations of Government and play a key role in Parliament's scrutiny of the Executive.
  • About Centrelink (Cont.)
    • Centrelink has the challenge of making its enormous volume of data consistent and accurate for reporting purposes.
  • How is this achieved by Centrelink? Through the use of good governance and international best practice, i.e. the use of standards
  • About ISO 11179: Business Definitions Registry (BDR)
    • The purpose of a Business Definitions Registry (BDR) is to enable greater accuracy in reporting Management Information.
    • From the technical side, it is a web-enabled database that supports:
      • a Business Definitions Policy and
      • a Governance Process
  • Management Information
    • IM = Information Management
      • A method to use technology for collecting, processing and condensing information with a goal of efficient management. http://www.bitpipe.com/tlist/Information-Management.html
    • MI = Management Information [system]
      • An automated system designed to provide progress and status information to management as an aid to decision making. http://stats.oecd.org/glossary/detail.asp?ID=4412
    • or
    • MIS combines tech with business to get people the information they need to do their jobs better/faster/smarter
  • ISO 11179 Metadata Standard The stunned mullet!
  • What is a Business Definitions Registry
    • BDR is a tool that provides the means for recording the information components that have been defined by business teams.
    • The process of populating definitions into BDR includes approval and mandating at the Centrelink Executive level.
  • Objectives of the Centrelink BDR Project
    • Establish a metadata registry and associated process
    • Establish data lineage
      • i.e. Improve the quality of information and provide data transparency
  • What is the difference between a thesaurus and a BDR? Firstly, we must understand the Information Supply Chain and how this affects the way MI is produced in Centrelink.
  • Business Definition – Intranet View
  • The Project Team
    • Project Manager
    • Definitions Researchers
    • Centrelink Representative
    • Subject Matter Experts
    • Technical Analysts
    • Data Modeller
    • Business Analysts
  • Acknowledgements
    • Mack, M. and Christensen, J. Centrelink's Business Definitions Registry: A tool to aid g2g data interoperability, DAMA Conference, 2005
    • Chakravarti, P. Do you know the meaning of your business? Centrelink’s Business Definitions Registry
    • https://wiki.nla.gov.au/download/attachments/2876/Chakravati.ppt
  • Any Questions?
  • Centrelink's Purpose
    • Serving Australia by assisting people to become self-sufficient and supporting those in need.
    • Centrelink provides services on behalf of more than 25 organisations.
    • Centrelink delivers information, payments and services detailed in Business Partnership Agreements or similar arrangements
  • Business Definition – Admin View
  • Level of Metadata/Content
    • Definitions: e.g. "Title" is 256 characters
    • Values: e.g. Title is "Unemployment Benefit"
    • Data/Content: e.g. Web page describing who can receive the benefit
  • Thesaurus structure
    • Term: Adolescents
    • SN Persons between puberty and maturity
    • UF Juveniles
      • Teenagers
    • BT Age groups
    • NT Adolescent females
      • Adolescent males
      • Adolescent parents
    • RT Adolescent development
      • Adolescent health
      • Adolescent pregnancy
      • Juvenile delinquents
      • Youth
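The entry above can be held in a small data structure, and the UF (non-preferred) terms can then steer a search towards the preferred term. The sketch below is a minimal illustration in Python; the class name `Term` and the functions `build_entry_index` and `expand_query` are hypothetical names chosen for this example, not part of any thesaurus product.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One thesaurus entry, using the relationship tags shown on the slide."""
    label: str
    scope_note: str = ""                           # SN
    used_for: list = field(default_factory=list)   # UF: non-preferred synonyms
    broader: list = field(default_factory=list)    # BT
    narrower: list = field(default_factory=list)   # NT
    related: list = field(default_factory=list)    # RT


adolescents = Term(
    label="Adolescents",
    scope_note="Persons between puberty and maturity",
    used_for=["Juveniles", "Teenagers"],
    broader=["Age groups"],
    narrower=["Adolescent females", "Adolescent males", "Adolescent parents"],
    related=["Adolescent development", "Adolescent health",
             "Adolescent pregnancy", "Juvenile delinquents", "Youth"],
)


def build_entry_index(terms):
    """Map every preferred and non-preferred label to its preferred term."""
    index = {}
    for term in terms:
        index[term.label.lower()] = term
        for synonym in term.used_for:
            index[synonym.lower()] = term
    return index


def expand_query(word, index):
    """Return the preferred term plus its narrower terms for a search word."""
    term = index.get(word.lower())
    if term is None:
        return [word]  # unknown word: search for it unchanged
    return [term.label] + term.narrower


index = build_entry_index([adolescents])
print(expand_query("teenagers", index))
```

A user who types "teenagers" is thereby led to records indexed under Adolescents and its narrower terms, which is the resource discovery benefit a thesaurus is meant to deliver.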
  • The Centrelink Thesaurus
    • Based on the standard ISO 2788
    • Stores and defines terms needed to locate information about Centrelink's business;
    • Uses words to describe relationships between words and hierarchy;
    • Captures the language of the business to enable resource discovery.
  • Centrelink view of the language continuum
  • Challenges
    • Professional skills and differences
    • Standards - ISO 2788 (for thesaurus development) and ISO 11179 (for data elements and metadata registries)
    • To gain senior executive support, we need to demonstrate the value of activities
    • Funding
  • AusDIN Portal
  • About AusDIN
    • The Australian Disaster Information Network (AusDIN)
      • established to facilitate a whole-of-government approach to Emergency Management, across the three tiers of Australian government.
  • The AusDIN Portal
    • Provides a mechanism to index and link to disparate emergency management websites, documents and other information resources
  • Behind the web interface: the Emergency Management Metadata Application Profile, i.e. a subject-specific profile of a metadata standard for an application (database)
  • Purpose of the Profile:
    • To support Data Harmonisation which enhances
      • access, sharing, reuse and repurposing of data
      • information resources and services
  • Current issues: The lack of standardisation throughout Australia, in the area of Emergency Management information resources, poses an increasing number of problems for our ability to search for, find and retrieve relevant and/or specific resources.
  • Relationship between the standards used
  • The Emergency Management Metadata Application Profile
    • Designed to facilitate the use of properties
      • for the implementation of database applications
      • by individual organisations and subject specialists
      • within the Emergency Management Sector
  • Why not use the ANZLIC Profile?
    • ANZLIC is especially relevant for use with datasets
    • Many mandatory elements
    • Most users do not acknowledge that this amount of detail is required especially for general resource discovery purposes
  • The Emergency Management Metadata Application Profile
    • The Profile includes properties in the following three groups:
      • 15 properties about the resource or service being described
      • 8 properties that describe the metadata record (see Administrative metadata below)
      • 11 properties about spatial datasets and data
  • Non-Geographic Datasets
    • The core set of AGLS metadata about a resource includes:
      • Title
      • Date
      • Creator
      • Availability
      • Identifier
      • Description
  • Administrative metadata (metadata about metadata)
    • These details assist in understanding how the record may need to be interpreted when being reused, such as when data is transferred to another database. They include:
      • Metadata file identifier
      • Metadata language
      • Metadata character set
      • Metadata point of contact
      • Metadata date stamp
      • Metadata standard name
      • Metadata standard version
  • Geographic Datasets By using the core metadata recommended in the ANZLIC Profile, interoperability will be enhanced, allowing users to understand without ambiguity the geographic data and the related metadata provided by either the producer or the distributor.
  • Geographic Datasets
    • Contains metadata answering the following questions:
      • ‘Does a dataset on a specific topic exist (“what”)?’
      • ‘For a specific place (“where”)?’
      • ‘For a specific date or period (“when”)?’
      • ‘A point of contact to learn more about or order the dataset (“who”)?’
  • Geographic Datasets
    • The properties in this category are:
      • Dataset title
      • Dataset reference date
      • Abstract describing the data
      • Dataset responsible party
      • Spatial representation type
      • Spatial resolution of the dataset
      • Dataset topic category
      • Geographic location of the dataset (by four coordinates or by description)
  • Geographic Datasets
    • And
      • Dataset language
      • Dataset character set
      • Metadata file parent identifier
  • Obligation levels for properties
    • Mandatory: these properties must be present in all metadata records;
    • Conditional: these properties must be present under certain circumstances;
    • Recommended: there may be valid reasons in particular circumstances not to include these properties, but the full implications must be understood and carefully weighed; and
    • Optional: these properties are truly optional.
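The four obligation levels translate naturally into a record check. The following Python sketch shows one way this might work. The property names, the obligation assigned to each, and the condition attached to `availability` are all hypothetical and chosen only for illustration; the Profile itself defines the real values.

```python
from enum import Enum


class Obligation(Enum):
    MANDATORY = "mandatory"
    CONDITIONAL = "conditional"
    RECOMMENDED = "recommended"
    OPTIONAL = "optional"


# Hypothetical obligation assignments, for illustration only; the Profile
# itself defines the real level for each property.
PROFILE = {
    "title": Obligation.MANDATORY,
    "creator": Obligation.MANDATORY,
    "availability": Obligation.CONDITIONAL,
    "description": Obligation.RECOMMENDED,
    "language": Obligation.OPTIONAL,
}

# Hypothetical condition: availability is required only for offline resources.
CONDITIONS = {
    "availability": lambda record: record.get("online") is False,
}


def check_record(record):
    """Return (errors, warnings) for one metadata record."""
    errors, warnings = [], []
    for name, level in PROFILE.items():
        if record.get(name):
            continue  # property is present, nothing to report
        if level is Obligation.MANDATORY:
            errors.append(f"missing mandatory property: {name}")
        elif level is Obligation.CONDITIONAL and CONDITIONS[name](record):
            errors.append(f"missing conditional property: {name}")
        elif level is Obligation.RECOMMENDED:
            warnings.append(f"missing recommended property: {name}")
    return errors, warnings


record = {"title": "Flood evacuation plan", "online": False}
errors, warnings = check_record(record)
print(errors)
print(warnings)
```

Treating a missing recommended property as a warning rather than an error reflects the wording above: it may be left out, but only as a deliberate decision.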
  • Metadata Property description
  • Metadata Property description (cont)
  • Crosswalks between standards
    • A crosswalk, or mapping, identifies which 'tag', 'element' or 'property' in one standard relates to one in the second standard
    • AGLS and ANZLIC identified this interoperability issue some years ago.
    • Version 2 of the mapping is now available http://www.osdm.gov.au/Metadata/ANZLIC+metadata+resources/default.aspx
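In code, a crosswalk is essentially a lookup table from property names in one standard to property names in the other. The sketch below is a minimal Python illustration. The pairings reuse property names from the slides in this seminar and are examples only; they are not taken from the published AGLS and ANZLIC mapping, which remains the authoritative source.

```python
# Illustrative pairings only, using property names from the slides in this
# seminar; the published AGLS-to-ANZLIC mapping is the authoritative source.
CROSSWALK = {
    "Title": "Dataset title",
    "Date": "Dataset reference date",
    "Creator": "Dataset responsible party",
    "Description": "Abstract describing the data",
}


def apply_crosswalk(record, crosswalk):
    """Rename the properties of a record; report those with no counterpart."""
    converted, unmapped = {}, []
    for name, value in record.items():
        if name in crosswalk:
            converted[crosswalk[name]] = value
        else:
            unmapped.append(name)
    return converted, unmapped


# Hypothetical source record.
source = {"Title": "Bushfire risk map", "Creator": "Example Agency",
          "Availability": "On request"}
converted, unmapped = apply_crosswalk(source, CROSSWALK)
print(converted)
print(unmapped)
```

Reporting the unmapped properties, rather than silently dropping them, makes it visible where the two standards do not line up.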
  • Metadata Property notes
  • Issues with developing The Profile
    • This is a profile based on Dublin Core.
    • ISO 19115 advocates are often purists and think that their complex standard should be used for all information resources
    • Keeping Current
      • Standards are constantly evolving
  • Conclusion
    • Keep standards simple for users to use!
  • Any Questions?
  • Land and Water Australia (LWA)
  • Australian Agriculture and Natural Resources On-line database, known as AANRO. AANRO is '.. an integrated knowledge discovery tool for agriculture and natural resources.'
  • AANRO Thesaurus Development Project, LWA
    • Aims:
      • Develop an Australian specific agriculture and natural resource management thesaurus
      • Use an open source platform for thesaurus development
      • Make full use of existing thesauri such as AIATSIS Thesaurus, CABI, SLASH and others yet to be identified.
      • Provide a standard set of terminology for use with all Australian agriculture and natural resource management databases.
  • Further project requirements:
    • To ensure that the AANRO thesaurus is compatible with the digital repository, Fedora v3, which is currently being developed by Dept. Agriculture, Fisheries and Forestry and AANRO.
    • This requirement imposed three new criteria:
      • Interoperability with AGROVOC Thesaurus, developed by FAO, as this thesaurus is to be used in the new Departmental repository;
      • Interoperability with the Simple Knowledge Organization System (SKOS) http://www.w3.org/2004/02/skos/
      • Thesaurus to be XML format compliant.
  • Project process
    • Section 1: Survey of agricultural and related agencies and libraries throughout Australia.
    • Section 2: Research bibliography
    • Section 3: Review existing thesauri
    • Section 4: Data analysis of the four thesauri identified as relevant to the set criteria
    • Section 5: Identification and Business Case for the use and purchase of a thesaurus software application.
  • SKOS http://www.w3.org/2004/02/skos/
    • SKOS (Simple Knowledge Organization System) provides a standard way to represent knowledge organization systems using the Resource Description Framework (RDF). Encoding this information in RDF allows it to be passed between computer applications in an interoperable way.
    • Using RDF also allows knowledge organization systems to be used in distributed, decentralised metadata applications. Decentralised metadata is becoming a typical scenario, where service providers want to add value to metadata harvested from multiple sources.
  • Section 2: Research bibliography
  • Main subjects in the bibliography
  • Water licence definitions, from Water Dictionary An example of why we need thesauri to enhance database searching
  • An example of why we need thesauri to enhance database searching
  • Word relationships in a thesaurus
    • Agroforestry
      • Use For: Agriforestry
      • Broader Term: Forestry
      • Related Term: Arboriculture
      • Scope Note: land use system in which woody perennials are deliberately grown with agricultural crops, with or without animals.
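The Agroforestry entry can be expressed in SKOS, the standard described earlier in this seminar. The Python sketch below writes the entry out in Turtle, one of the RDF text syntaxes, using only string formatting. It assumes the usual correspondence of Use For to `skos:altLabel`, Broader Term to `skos:broader`, Related Term to `skos:related` and Scope Note to `skos:scopeNote`. The `http://example.org/thesaurus/` namespace and the helper names are hypothetical, and the `slug` helper only copes with simple labels made of letters and spaces.

```python
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"
BASE = "http://example.org/thesaurus/"  # hypothetical namespace for the terms


def slug(label):
    """Turn a label into a simple local name: 'Age groups' -> 'age-groups'."""
    return "-".join(label.lower().split())


def literal(text):
    """Quote a string as an English-language Turtle literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@en'


def to_turtle(label, used_for=(), broader=(), related=(), scope_note=""):
    """Serialise one thesaurus term as a skos:Concept in Turtle syntax."""
    statements = [f"ex:{slug(label)} a skos:Concept",
                  f"    skos:prefLabel {literal(label)}"]
    statements += [f"    skos:altLabel {literal(uf)}" for uf in used_for]
    statements += [f"    skos:broader ex:{slug(bt)}" for bt in broader]
    statements += [f"    skos:related ex:{slug(rt)}" for rt in related]
    if scope_note:
        statements.append(f"    skos:scopeNote {literal(scope_note)}")
    prefixes = f"@prefix skos: <{SKOS_NS}> .\n@prefix ex: <{BASE}> .\n\n"
    return prefixes + " ;\n".join(statements) + " .\n"


turtle = to_turtle(
    "Agroforestry",
    used_for=["Agriforestry"],
    broader=["Forestry"],
    related=["Arboriculture"],
    scope_note="land use system in which woody perennials are deliberately "
               "grown with agricultural crops, with or without animals.",
)
print(turtle)
```

For real work an RDF library would be a safer choice than hand-built strings, since it takes care of escaping and validation.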
  • Section 3: Review existing thesauri
  • Geographic Information System (GIS) based projects
  • Thesauri identified as in scope
    • Universal Agricultural Language (Western Australia)
    • Thesaurus, Department of Environment and Resource Management (DERM), Queensland.
    • Tropical Savannas CRC Terms
    • Community Access to Natural Resources Index (CANRI)
  • Evaluation headings:
    • Thesaurus Name
    • Scope and application
    • Standards followed
    • Structure
    • Language (e.g. this could be natural language, academic, official...)
    • Evaluation
  • Section 4: Data analysis of the four thesauri identified:
    • Many of the thesauri were in Content Management Systems (CMS) and these caused major interoperability issues:
      • Thesauri not available in hierarchic structure
      • Thesauri output not available in csv or xml format
      • Thesauri available in printed output only
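Where a thesaurus can be obtained as XML, turning it into an indented text hierarchy is a small job. The Python sketch below uses the standard library's `xml.etree.ElementTree`. The XML layout (nested `term` elements with a `name` attribute) and the sample terms are hypothetical; real exports differ from system to system and usually need cleansing first.

```python
import xml.etree.ElementTree as ET

# Hypothetical export layout: nested <term> elements with a "name" attribute.
SAMPLE_XML = """
<thesaurus>
  <term name="Forestry">
    <term name="Agroforestry"/>
  </term>
  <term name="Water">
    <term name="Groundwater"/>
  </term>
</thesaurus>
"""


def to_indented_text(element, depth=0):
    """Yield one line per term, indented with tabs to show the hierarchy."""
    for child in element.findall("term"):
        yield "\t" * depth + child.get("name", "").strip()
        yield from to_indented_text(child, depth + 1)


root = ET.fromstring(SAMPLE_XML)
hierarchy = "\n".join(to_indented_text(root))
print(hierarchy)
```

A tab-indented text file of this kind is a common exchange format for hierarchical term lists, but the import format a given thesaurus application expects should be checked before relying on it.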
  • Section 5:
    • Identification of an appropriate thesaurus software application;
    • Develop a Business Case for the use and purchase of a thesaurus software application.
  • Thesauri applications reviewed
    • Tematres
      • http://tematres.r020.com.ar/index.en.html
      • no longer available online.
    • Aiksaurus
      • http://www.cs.utexas.edu/users/jared/aiksaurus.cgi
      • Out of scope - does not include xml format.
    • MultiTes
      • www.multites.com
      • application fulfilled project requirements and is used by a number of key thesaurus developers in Australia (highly recommended)
  • Any Questions?
  • Note of thanks: I would like to thank the seminar organiser, Meera Gaikaiwari, SLA Asian Chapter Country representative; Dr Anand Deshpande, Managing Director, Persistent Systems Inc, for providing the venue and refreshments; and all those involved with ensuring the success of this seminar.