IEEE Agropedia, published version


Architecture of Agropedia, a paper published in IEEE Internet Computing, vol. 14, no. 5

Due to a lack of precise and exact terms of reference, the Internet features very little content related to agriculture. Agropedia, one of the world's first agricultural knowledge repositories built from semantic, collaborative, and social networking metaphors, bridges this gap via agricultural knowledge models. Creating vibrant and diverse communities requires careful planning as well as an open, yet flexible, protocol. Agropedia, through its peer-reviewed scientific content contributed by agricultural research institutions and its community-generated interactive folk knowledge, brings together the expert community and the user community.

Published in: Technology, Education

Web extra: Collaboration

Architecture of the Agropedia Platform: Creating Content and Community through Collaboration

Nagaraju Pappu, Canopus Consulting
Runa Sarkar, Indian Institute of Management, Kolkata
T.V. Prabhakar, Indian Institute of Technology, Kanpur

Due to a lack of precise and exact terms of reference, there is very little content related to agriculture on the Internet. Agropedia, one of the world's first agricultural knowledge repositories built from semantic, collaborative, and social networking metaphors, bridges this gap via agricultural knowledge models. Creating vibrant and diverse communities requires careful planning as well as an open, yet flexible, protocol. Agropedia, through its reviewed, scientific content contributed by agricultural research institutions and its community-generated interactive folk knowledge, brings together the expert community and the user community. An excerpt of this work appears in IEEE Internet Computing's September/October 2010 issue.

Unless otherwise stated, bylined articles, as well as product and service descriptions, reflect the author's or firm's opinion. Inclusion does not necessarily constitute endorsement by the IEEE or the IEEE Computer Society. September/October 2010, © 2010 IEEE, IEEE Internet Computing Web extra.

Most of India's workforce is tied to the agricultural sector, which faces unprecedented challenges. On one hand, economic growth, the rise of the middle class, and population increases have raised demand for food production by several orders of magnitude, and on the other, changing climatic patterns, depletion of natural resources, overexploitation of farmland, and general deforestation have caused an unprecedented shortfall in agricultural production. Unfortunately, there's very little public awareness about these problems.

Recognizing the need to create a comprehensive knowledge dissemination platform, a consortium of ICT institutions, state universities, and research organizations have come together to support and promote knowledge exchanges between different stakeholders in the agriculture domain. Agropedia is the result of this ongoing effort. While building Agropedia, we recognized that there isn't much agricultural content on the Web. Even on Wikipedia, only about 3,000 of the more than 2 million articles available relate to agriculture. Agropedia is an attempt to fill this gap. It's unique in many respects: first, it's the result of a consortium of institutions and universities that provide edited and reviewed expert content and articles; second, it connects expert content creators with the ultimate target users (farmers) by allowing the many different organizations in between to collaborate, participate, and interact. To accomplish such collaboration at a large scale, Agropedia uses knowledge models (a simplified ontology) to serve as a precise vocabulary of communication, which is very similar to the efforts in fields such as medicine but a first of its kind in the agriculture domain.

Agropedia differs most from the Wikipedia model in its use of inherent semantic indexing capabilities and authorized content. Wikipedia also lacks the mechanisms for the end user community (people interested only in using and reading the content) to participate in the feedback and content enrichment process. Likewise, some of the more common ontology-driven semantic content systems don't provide for collaboration or community participation.

This article describes the experience we gained while creating this platform. An excerpt appears in print, in the September/October 2010 issue of IEEE Internet Computing.

Organization and Structure of Agropedia

Agropedia is "all things agriculture," meant for anyone connected with or interested in the agricultural domain, as shown in Figure 1. It aspires to be a one-stop shop for any information related to Indian agriculture: an audio-visual encyclopedia designed to transform the process of digital content creation and organization by making it an enchanting educational experience.

Figure 1. Agropedia is a network of communities. These communities range from agricultural universities supplying edited, reviewed content to village-level extension workers supplying field-level information.

The two most important dimensions of Agropedia are its content and its community, which are intertwined in many ways. Agropedia's expert community consists of individuals or organizations engaged in agricultural research and knowledge dissemination, such as state and central agricultural universities, research institutions, extension workers, research stations, and so forth. Its user community primarily consists of people at nodal- or village-level government agricultural centers: call center representatives, students, farmers, traders, and so on.

Naturally, Agropedia's content is also of two types: expert-sourced content and community-contributed interactive content. The Gyandhara, or expert, content is edited, reviewed, created, and maintained by the participating institutions. Agropedia also supports the creation and distribution of interactive content called Janadhara (folk-stream) from anyone in the community through wikis, blogs, forums, questions and answers, reviews, comments on articles, and content tagging.

Agropedia allows content contribution by any member of the national agricultural research system institutions in India or anywhere in the world. It includes a wide range of content, including text, images, and multimedia elements such as audio, video, or animations.

Agropedia also provides a flexible authoring environment and tools for content acquisition, processing, and publishing. It consists of social and collaborative networking features such as blogs, forums, wikis, questions and answers, content tagging by users, commenting, and rating authors and content.

Content and Community Creation through Collaboration

Building a platform like Agropedia goes beyond technology and software: the primary challenge is to enable an environment that allows a community to grow, organize itself, create its own content, and interact and collaborate using the underlying content repository as the primary vehicle of collaboration.

As Clay Shirky noted, a community is very different from an audience (…munity.html). Audiences can be built, but communities create themselves and grow. However, to develop they need an equivalent of a constitution: a way to govern themselves, facilities to create their own languages of communication and interaction, and methods to recognize and reward contributions by members. At the same time, when the community becomes too large and too diversified, it loses its focus (http://…). The best way to deal with this is to create a platform that serves not only as a community network but would also allow formation of networks of communities [1].

Another challenge to consider is the awareness and capability of the agricultural community in India to absorb, assimilate, and work with a technology-assisted medium. The technology can't become a barrier to participation; instead, it should enable and encourage it, irrespective of the user's technical proficiency.

Function and Role of Knowledge Models

The heart of Agropedia is its knowledge models (http://agro-…), which agricultural scientists and experts from participating institutions created as a set of concept maps (…map.html). These knowledge models are Agropedia's lingua franca. In the early stages of development, we quickly realized that a large-scale content creation effort for use by a diverse community would require its own language of communication. For example, a paper or article written by a scientist wouldn't be directly relevant to a farmer; however, it might be useful to a person working in the agricultural research station or an extension worker who needs to provide essential guidance or information to that farmer. Similarly, a user such as an extension worker might want to summarize many articles or papers written by experts, connect such content together, and synthesize it for common, general usage.

Effective collaboration assumes absence of hierarchical authority structures and centralized power centers. One reason why several attempts to build such socially significant systems have failed is the naïve assumption that anyone in the chain could produce something for the direct consumption of the final, target end user. The semantics of collaboration means that we must produce for our nearest neighbor. If everyone in the chain works for their nearest neighbor, we end up with a vibrant, participatory community that grows.

The lack of consistent terms of reference for the agriculture domain is perhaps the reason for the Internet's dearth of agricultural content. Precise and exact terms of reference are a fundamental requirement for goal-oriented communication and interaction. Without such terms, it's extremely difficult for a member of the community to take something from an expert, enhance it, add value to it, and pass it to someone the next level down in the chain. In other words, humanization of knowledge isn't possible without a shared set of terms of reference.

Large-scale, content-rich systems in specialized domains such as medicine have used semantic networks and specialized ontologies successfully (…documentation.html). For example, the UMLS's role in the creation, use, and propagation of medical articles on the Internet makes them accessible not only by experts but also by an ordinary user. Agropedia's knowledge models take a big step in this direction by contributing to a precise and exact agricultural vocabulary.

IITK and FAO first developed a generic knowledge model for crops, which was then used as a baseline model for crop knowledge models. Figure 2 shows a generic knowledge model and a specific crop ontology.

Figure 2. Each crop has its own specific knowledge model. All the crop knowledge models use the generic crop model. Labels legible in the figure include Crop, Production_technology, Protection_technology, Nutrient_management, Essential_plant_nutrient, Primary_nutrient, and Weed_management, with relations such as hasProductionPractices, makeUseOf, usesProcess, and isManagedBy.

Knowledge models are used for a wide variety of applications, from content tagging to automatic cross-referencing of content to relationship discovery. V. Balaji and colleagues [2] described in detail the process of how the agricultural knowledge models were created, their importance, and the various applications built via those knowledge models.

The Function and Role of Folk

We expect that the interaction space provided through the Janadhara stream will ultimately make it possible for the community to humanize the "expert" knowledge and make it more accessible for general, common use. In particular, it can help add practical, field-level examples, case studies, and enhancements to the main expert content. It's also possible for users to link content via their own user-defined tag system, discover content in the repository, use the expert content as a primary reference material to illustrate specific issues for farmers, and keep the content alive and the community vibrant.

Lasting communities make up and transmit their knowledge, culture, and values using folklore, which is the basic idea behind the Janadhara. As Ananda Coomaraswamy pointed out [3], "folk tradition is 'folk' only in respect to its transmission, not its origin. Folklore and Philosophia Perennis spring from a common source."

Inclusion and User-Centric Design

The Nobel Prize-winning economist Amartya Sen described the economic, social, and cultural value of SwIkriti [4] as a culture of openness, tolerance, and inclusion. SwIkriti basically means that everyone, irrespective of his or her capacity, has a place, function, and role. This is one of the central community building principles behind Agropedia. Here, we describe how we used this simple principle to make crucial design choices.

The Knowledge Model Creation Process

The platform and toolset must be designed such that they don't demand a steep learning curve from the user community (be it composed of content creators or users). While designing and creating the knowledge models, we introduced simple graphical tools like concept maps (CMAPs), which are easy and fun to use. In fact, the agricultural scientists weren't even aware that they were making ontologies. The underlying platform was designed in such a way that it converts the CMAPs to internal semantic indexing schemes. We conducted simple workshops and designed a few guidelines [5] on what makes good concept maps, which is all that was required to generate the knowledge models.

The Content Acquisition Process

We designed Agropedia's content acquisition process to be intuitive and nonintrusive. Many agricultural scientists, research workers, and government officers don't understand the complexities of modern software. Moreover, many of them don't have a high-speed, always-on connection to the Internet. Therefore, we designed content submission to be extremely easy and nonintrusive. Any authorized member can submit content via email or through the website in any format. A small back-office IT staff accepts this content and runs the content transformation tools. The transformed content is then presented to the original author, who can make any necessary modifications, change links, cross-references, and ontological entities, add or modify the tags, and so on. The original content and all subsequent transformations are preserved so that the entire process can be repeated to generate the final content view at any time.

The Content Transformation Process

The most successful software in the world hides its complexity; for example, just imagine the complexity that Google hides behind one simple text box. Successful software designers intuitively understand that the application and its "features" shouldn't be conspicuous. The software should present what the user needs and wants (content, community, and a collaborative environment), not menus, application buttons, help screens, complex search, and navigational layouts. Thus, the Agropedia interface presents only content to users; its functionality is presented in the form of embedded links and navigation in content elements, much like present-day Web 2.0 applications such as LinkedIn, Orkut, and Facebook.

Agropedia's underlying architecture decouples content authoring from content publishing. As mentioned earlier, the content must be transformed using automated or semiautomated tools so that it becomes useful for the final end user. Figure 3 shows the content transformation model. Submitted content is checked into the main content repository in the user's original format. Automated tools then convert it into the Agropedia canonical format (an XML representation of the content). Then, the content is parsed and the relevant ontological entities and relationships are automatically inserted as navigational links. In the final step, the content is cross-linked in the repository.

Figure 3. The content as acquired goes through several automatic and semiautomatic transformations. The user sees only the enriched and cross-linked content.

Content Access, Community Building, and Web 2.0

The content's state of readiness, its author's information, whether it's reviewed or under review, and its status are shown as part of the content (similar to Wikipedia). Along with this information, the user community can also rate the content; how many people have accessed the content and its author's standing are also shown as part of the content's attributes.

This model accomplishes many community building objectives: specifically, it lets the community be aware of the most useful content, and it encourages positive reinforcement and feedback to the authors.

Agropedia Technology

Agropedia is built on open source platforms and is completely open source itself. We realized that its main assets aren't the technology and software but its content and community. Therefore, our primary architectural goal was to design Agropedia to scale to large amounts of content and a very large and diverse user base. Most importantly, it was crucial that we keep the content's lifetime high. In general, content outlasts people and even the tools used to create it. This is why we chose an open source platform so that content isn't locked up in "technology."

Many applications that use vendor-specific storage formats end up paying recurring license costs just to access content. Locking content formats to any particular representation or choosing binary or proprietary formats isn't useful in the long run. We adopted an open content representation approach that let us use a metarepresentation scheme like TeX, SGML, and DocBook to preserve semantic structures and typesetting information. This is equivalent to the content being stored as a program plus data, which enables us to produce documents conforming to various formats and standards at runtime.

To make Agropedia scale to serve a very large end-user base, we decoupled the back end (the content authoring, editing, and transformation environment) from the content access applications. This allows deployment of the content databases on multiple machines and locations using content delivery network (CDN) strategies.

Figure 4. Agropedia's layered architecture. The content acquisition, transformation, and indexing are separated from the content presentation and user model. The figure shows two environments: a content delivery environment (web servers with front-end caching and CDN functionality, template engine and UI generation, application components, access control, tracking and statistics, content caching, search, and their data stores) and a content authoring/transformation environment (workflow user interface, authentication, content format transformation, cross-linking and tagging engine, indexers, SVN-based content repository, knowledge models data store, and submission interfaces for file upload, WebDAV, email, and scanner/OCR).

The entities and relationships in the knowledge models link up the content and index it under all applicable concepts, which serve as basic expert-generated navigational aids. Users can also tag the content and design their own tags to group content together, meaning Agropedia can support both taxonomy-based navigation and folksonomy-based navigation. Figure 4 shows how the content transformation tools take the original content and insert ontological entities and relationships, cross-links to other content, and so on.

The knowledge models, ontological entities, and relationships can be queried by using a simple API. We developed presentation plugins and themes as presentation services on top of the Drupal environment, which is an open source content management system.
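The article does not document the query API or the internal format of the knowledge models, so the following is only a minimal sketch of the ideas described above: querying a knowledge model, inserting ontological entities as navigational links, and indexing content under both expert concepts and user tags. Everything named here is an assumption made for illustration: the triple representation, the functions `related`, `ancestors`, `link_entities`, and `build_index`, the `/concept/` link scheme, and the sample tags. The concept labels are borrowed from Figure 2, but the specific edges between them are guessed.

```python
import re
from collections import defaultdict

# Hypothetical knowledge-model triples (subject, relation, object).
# Labels come from Figure 2; the exact edges are assumed for illustration.
TRIPLES = [
    ("Nitrogen", "isA", "Primary_nutrient"),
    ("Primary_nutrient", "isA", "Essential_plant_nutrient"),
    ("Nutrient_management", "makeUseOf", "Fertilizer"),
    ("Weed", "isManagedBy", "Weed_management"),
]


def related(concept, relation=None):
    """Return (relation, object) pairs whose subject is `concept`."""
    return [(r, o) for s, r, o in TRIPLES
            if s == concept and (relation is None or r == relation)]


def ancestors(concept):
    """Follow isA edges upward and return every broader concept."""
    found, frontier = [], [concept]
    while frontier:
        for _, parent in related(frontier.pop(), "isA"):
            if parent not in found:
                found.append(parent)
                frontier.append(parent)
    return found


def link_entities(text, concepts):
    """Wrap each known concept mentioned in `text` in a navigational link."""
    terms = {c.replace("_", " ").lower(): c for c in concepts}
    if not terms:
        return text, []
    # Longest terms first so "weed management" wins over "weed".
    alternation = "|".join(
        re.escape(t) for t in sorted(terms, key=len, reverse=True))
    pattern = re.compile(r"\b(%s)\b" % alternation, re.IGNORECASE)
    mentioned = []

    def replace(match):
        concept = terms[match.group(1).lower()]
        if concept not in mentioned:
            mentioned.append(concept)
        return '<a href="/concept/%s">%s</a>' % (concept, match.group(1))

    return pattern.sub(replace, text), mentioned


def build_index(doc_id, mentioned, user_tags, index):
    """Index a document by expert concepts (taxonomy) and user tags (folksonomy)."""
    for concept in mentioned:
        for key in [concept] + ancestors(concept):
            index["concept:" + key].add(doc_id)
    for tag in user_tags:
        index["tag:" + tag.lower()].add(doc_id)


concepts = sorted({s for s, _, _ in TRIPLES} | {o for _, _, o in TRIPLES})
index = defaultdict(set)
html, found = link_entities(
    "Apply nitrogen early; weed management follows.", concepts)
build_index("doc-1", found, ["rice", "field-notes"], index)
print(html)
print(sorted(index))
```

In this sketch a document that mentions nitrogen is also reachable from the broader concepts above it, which is one plausible reading of indexing content "under all applicable concepts", while the user tags live in a separate namespace so that the two navigation styles do not collide.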
Agropedia is a unique effort in the sense that it's a collaborative effort between many different institutions and organizations. It's also unique from a technological and community-enabling point of view. It took us more than two years to create the agricultural knowledge models and the fundamental vocabulary. Currently, we're in the evangelizing stage. The content transformation and automatic cross-linking environment is still in beta. We plan to convert the knowledge models into a full-fledged ontology such that automatic discovery and inferencing applications can be developed. We plan to use Subversion as the back-end content repository, which allows multiple people to edit, modify, and check in the content even in an offline mode.

In Agropedia, the intelligent fusion of cutting-edge computing metaphors and collaborative culture is creating a simple yet … symphony. Its social and economic relevance to a country like India, which faces major food production and distribution challenges in the coming decades, simply can't be overstated.

Acknowledgments

The Agropedia is the result of an ongoing effort under the umbrella of the National Agriculture Innovation Project (NAIP) of the Indian Council of Agricultural Research (ICAR). IIT-Kanpur is building the overall technology, with ICRISAT being the primary coordinator and GB Pant Agricultural University and University of Agricultural Sciences, Dharwad, acting as domain partners. The Food and Agriculture Organization (FAO), Rome, participated in building the knowledge models. We're particularly grateful to V. Balaji, Johannes Keizer, Margherita Sini, Antonella Picarella, and N.T. Yaduraju of these organizations for the help and encouragement they rendered in building Agropedia. OPAALS, a network of excellence project under FP6 of the European Union, also influenced the roadmap for Agropedia through its focus on open knowledge systems.

References

1. R. Rajagopalan and R. Sarkar, "Digital Ecosystems: Community Networks or Networked Communities?" Proc. 1st Open Philosophies for Associative Autopoietic Digital Ecosystems (OPAALS) Conf., Tampere University of Technology, 2007; …OPAALS_IITK%201.pdf.
2. S. Patwar et al., "Towards a Novel Content Organisation in Agriculture Using Semantic Technologies: A Study with Topic Maps as a Tool," Int'l J. Metadata, Semantics, and Ontologies, vol. 4, no. 1–2, 2010, pp. 65–71.
3. A.K. Coomaraswamy, "Nirukta = Hermeneia," Perception of the Vedas, V.N. Misra, ed., Manohar Publishers, 2000, Chapter 12.
4. A. Sen, "Inequality, Instability and Voice," Argumentative Indian, Penguin Books, 2005, Chapter 2.
5. M. Sini et al., "Building Knowledge Models for Agropedia Indica v 1.0 Requirements, Guidelines, Suggestions," Indian Institute of Technology, 2008.

Nagaraju Pappu works as chief technologist at Canopus Consulting, an enterprise architecture consulting services organization. He's also a visiting faculty member at IIT-Kanpur and IIIT-Hyderabad. Pappu specializes in large-scale enterprise systems architecture and semantic and collaborative environments.

Runa Sarkar is currently a faculty member in the economics group at the Indian Institute of Management, Calcutta. Her research interests are digital ecosystems and the environmental sustainability and social economic impact of ICT on communities.

T.V. Prabhakar is a professor of computer science at the Indian Institute of Technology, Kanpur. His research interests are databases, software architecture, knowledge modeling, Semantic Web, and collaborative computing.

This Web extra accompanies the article, "Agropedia: Humanization of Agricultural Knowledge," IEEE Internet Computing, vol. 14, no. 5, 2010, pp. 57–59; http://doi.ieeecomputer…