GRID INFORMATION RETRIEVAL SYSTEM USING JAVA
INTRODUCTION
GRID IR is information retrieval on the grid. It is a new initiative to
bring together information retrieval techniques and grid computing.
Information retrieval (IR) is a field of research concerned with
searching unstructured (or quasi-structured) data, such as text
documents, and retrieving results pertinent to a user's query.
Modern web search engines are the most widely known
implementations of IR systems.
Grid computing is the accomplishment of computational tasks on a set
of computers connected by a network. It is similar to distributed
computing, but with finer-grained task assignment and coordination
among the grid elements.
Grid computing provides clustering of remotely distributed computing
resources. The principal focus of grid computing to date has been on
maximizing the use of available processor resources for
compute-intensive applications. Grid computing, along with storage
virtualization and server virtualization, enables utility computing.
Grid computing applies the resources of many computers in a network to
a single problem at the same time, usually a scientific or technical
problem that requires a great number of computer processing cycles or
access to large amounts of data. Grid computing software divides a
program into pieces and farms them out to as many as several thousand
computers.
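The divide-and-farm-out idea can be sketched in a few lines of Java. This is a minimal illustration, not the project's code: a local thread pool stands in for remote grid nodes, the job (summing a range of integers) is a made-up example, and the class and method names are hypothetical. It assumes JDK 8 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration: local pool threads stand in for remote grid nodes.
class FarmOutSketch {

    // Splits the range [1, n] into about `pieces` chunks, sums each chunk on a
    // pool thread, and combines the partial sums.
    static long parallelSum(long n, int pieces, int nodes) {
        if (pieces <= 0 || nodes <= 0) {
            throw new IllegalArgumentException("pieces and nodes must be positive");
        }
        ExecutorService pool = Executors.newFixedThreadPool(nodes);
        try {
            List<Future<Long>> partials = new ArrayList<>();
            long chunk = (n + pieces - 1) / pieces; // ceiling division
            for (long start = 1; start <= n; start += chunk) {
                final long lo = start;
                final long hi = Math.min(n, start + chunk - 1);
                Callable<Long> task = () -> {
                    long sum = 0;
                    for (long i = lo; i <= hi; i++) sum += i;
                    return sum;
                };
                partials.add(pool.submit(task)); // farm out one piece
            }
            long total = 0;
            for (Future<Long> partial : partials) total += partial.get();
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a piece", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a piece failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Sum of 1..1,000,000 computed as 8 pieces on 4 simulated nodes.
        System.out.println(parallelSum(1_000_000L, 8, 4));
    }
}
```

On a real grid the pieces would travel over the network to other machines; the pool only shows the split, dispatch, and combine steps.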
A number of corporations, professional groups and university consortia
have developed frameworks and software for managing grid computing
projects. Grid computing is a model for allowing companies to use a
large number of computing resources on demand, no matter where they
are located.
Grid IR applies the tools of grid computing to IR to provide a common
infrastructure for distributed IR. It also brings the capabilities of IR to grid
computing. GRID IR is a newly proposed initiative to implement a
specific architecture for realizing IR on the Open Grid Services
Architecture (OGSA) grid-computing platform. Traditional IR models are broken into
constituent pieces and described as OGSA grid services. A model for
interaction among these services describes the GRID IR system.
AIM/OBJECTIVE OF THE SYSTEM
The main aim of grid IR is to allow users' information needs to be
matched to documents by document collections, indexes and query
engines, all of which exist as grid services.
The project is implemented in Java. An MS-ACCESS database is used
for indexing the keywords of the documents.
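Keyword indexing of this kind is commonly done with an inverted index that maps each keyword to the documents containing it. The sketch below is an assumption about how such an index might look, not the project's schema: an in-memory map stands in for the MS-ACCESS keyword table, and the class name, method names, and tokenization rule are hypothetical. It assumes JDK 8 or later.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: an in-memory map stands in for the MS-ACCESS keyword table.
class KeywordIndexSketch {
    // keyword -> ids of the documents that contain it
    private final Map<String, Set<String>> postings = new TreeMap<>();

    // Records every keyword of `text` as pointing to document `docId`.
    void addDocument(String docId, String text) {
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.isEmpty()) continue; // split can yield a leading empty token
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
    }

    // Returns the ids of documents containing `keyword` (empty set if none).
    Set<String> lookup(String keyword) {
        Set<String> hits = postings.get(keyword.toLowerCase(Locale.ROOT));
        if (hits == null) return Collections.<String>emptySet();
        return Collections.unmodifiableSet(hits);
    }

    public static void main(String[] args) {
        KeywordIndexSketch index = new KeywordIndexSketch();
        index.addDocument("doc1", "Grid computing shares idle processors");
        index.addDocument("doc2", "Information retrieval on the grid");
        System.out.println(index.lookup("grid")); // prints [doc1, doc2]
    }
}
```

In a database-backed version, each map entry would correspond to rows of a keyword-to-document table.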


PROPOSED SYSTEM SOFTWARE REQUIREMENTS
Operating system    :   Windows XP/2000
Software            :   JDK 1.3 or higher
Database            :   MS-ACCESS


PROPOSED SYSTEM HARDWARE REQUIREMENTS
Processor    : Intel Pentium III or higher
RAM          : 128 MB or higher
HDD          : 80 GB
FDD          : 1.44 MB
Monitor / Keyboard / CD drive

PROPOSED SYSTEM DESCRIPTION
Grid computing is an advanced form of distributed computing. A grid is
a collection of computers, storage and other devices that are joined
together by some means of communication, such as the Internet, and
that can be used together to manage information and solve problems.
Grid computing makes use of the idle resources of other systems. The
workload of one system is distributed to other systems so that their
unused resources, such as memory and processor time, are put to work,
which balances the workload and is intended to reduce network traffic
and bandwidth use.
This concept is used in our project to render a large image in a very
short time by distributing the image to many systems and using their
resources. Because the workload is evenly distributed across the grid
network, even a large job can be completed in a short time.
The main goal is to use otherwise unused resources to complete the
work efficiently. This project helps to use resources efficiently and
cost-effectively.
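One way to distribute an image evenly is to cut it into horizontal bands of rows, one band per worker. The text does not say how the project splits the image, so the banding scheme, the class name, and the method name below are assumptions made only for illustration. The sketch assumes JDK 8 or later.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: divides an image of `height` rows into near-equal bands.
class ImageBandSketch {

    // Returns {firstRow, lastRowExclusive} pairs, at most one per worker.
    static List<int[]> splitRows(int height, int workers) {
        if (height < 0 || workers <= 0) {
            throw new IllegalArgumentException("height must be >= 0 and workers > 0");
        }
        List<int[]> bands = new ArrayList<>();
        int base = height / workers;  // rows that every band gets
        int extra = height % workers; // the first `extra` bands get one more row
        int start = 0;
        for (int w = 0; w < workers; w++) {
            int size = base + (w < extra ? 1 : 0);
            if (size == 0) break;     // more workers than rows
            bands.add(new int[] {start, start + size});
            start += size;
        }
        return bands;
    }

    public static void main(String[] args) {
        for (int[] band : splitRows(600, 4)) {
            System.out.println("rows " + band[0] + " to " + (band[1] - 1));
        }
    }
}
```

Band sizes differ by at most one row, which is what keeps the workload even when the image height is not a multiple of the worker count.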
Grid Computing is about making large amounts of computing power
available for applications and users. Collaborative development of Java
Grid Engine technology provides the proper development framework to
ensure that Grid Engine technology meets the requirements of the
largest number of users.
Grid computing is a form of networking. Unlike conventional networks
that focus on communication among devices, grid computing harnesses
unused processing cycles of all computers in a network for solving
problems too intensive for any stand-alone machine.
A well-known example of a grid computing project is SETI@home
(SETI: Search for Extraterrestrial Intelligence), in which PC users
worldwide donate unused processor cycles to help the search for
signs of extraterrestrial life by analyzing signals coming from outer
space.
The proposed project relies on individual users to volunteer to allow the
project to harness the unused processing power of the user's computer.
This method saves the project both money and resources.
This Java-based grid computing project does require special imaging
software that is specific to the computing task for which the grid is
being used.
The basic idea of grid IR is to define an IR system in terms of three
functional components, implemented as grid services: the collection
manager service (CM), the indexing/searching service (IS), and the
query processing service (QP).
These services are autonomous and, being grid services, they are
distributed. They can therefore be used to create new IR systems or to
link existing ones together in an interoperable network of IR services.
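The three-service decomposition can be pictured as three small Java interfaces. Everything below is a hypothetical sketch: the interface and method names are illustrative and are not taken from OGSA or the Grid IR proposal, and plain in-process objects stand in for distributed grid services. It assumes JDK 8 or later.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical interfaces; in-process objects stand in for grid services.
class GridIrSketch {

    interface CollectionManager {        // CM: owns the documents
        Map<String, String> documents(); // document id -> text
    }

    interface IndexSearchService {       // IS: indexes a collection and searches it
        void index(CollectionManager cm);
        List<String> search(String term);
    }

    interface QueryProcessor {           // QP: turns a user query into searches
        List<String> query(String userQuery);
    }

    // Naive in-memory IS: keeps the text and scans it on every search.
    static class ScanningIndex implements IndexSearchService {
        private final Map<String, String> docs = new LinkedHashMap<>();

        public void index(CollectionManager cm) {
            docs.putAll(cm.documents());
        }

        public List<String> search(String term) {
            List<String> hits = new ArrayList<>();
            String needle = term.toLowerCase(Locale.ROOT);
            for (Map.Entry<String, String> e : docs.entrySet()) {
                if (e.getValue().toLowerCase(Locale.ROOT).contains(needle)) {
                    hits.add(e.getKey());
                }
            }
            return hits;
        }
    }

    // Wires the three services together for a one-term query.
    static QueryProcessor wire(CollectionManager cm, IndexSearchService is) {
        is.index(cm);
        return userQuery -> is.search(userQuery.trim());
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("d1", "Grid computing basics");
        docs.put("d2", "Cooking for beginners");
        QueryProcessor qp = wire(() -> docs, new ScanningIndex());
        System.out.println(qp.query(" grid ")); // prints [d1]
    }
}
```

Because each service is reached only through its interface, one implementation can be swapped for another (or for a remote proxy) without changing the other two, which is the interoperability point made above.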
Information retrieval (IR) is the science and practice of identifying
documents or sub-documents that meet information needs. Usually, IR
deals with textual documents in semi-structured (e.g., HTML, XML) or
unstructured (plain text) format. In order to boost processing power,
institutions have aggregated computing resources across the entire
institution.
The same idea of sharing resources has paved the way for grid
computing but with a far wider scale and scope. Grid computing, in
effect, provides a global reach to distributed computing.
It promises lower total computing costs along with on-demand, reliable,
and inexpensive access to the vast, available computing resources that
would otherwise go unused.
GRID COMPUTING FEATURES
The requirements for grid-computing infrastructure can be described by
the following attributes:
•   Pooling of resources to increase utilization
•   Provisioning of work based on policies and dynamic requirements
•   Virtualization at every layer of the computing stack
•   Self-adaptive software that largely tunes and fixes itself
•   Unified management and provisioning.


                 PROPOSED SYSTEM MODULES
The Java Grid project is divided into three modules: server, client
and worker.

1. SERVER MODULE
       User Interface
       Job Scheduler
       Workload Management
       Resource Management
       Data Management


2. WORKER MODULE
       Job Requests Receiver
       Job Processing Manager
       Job Results Sender


3. CLIENT MODULE
       Job Fragmenter
       Job Requests Sender
       Job Results Receiver
       Job Results Aggregator
GRID - MODULE DESCRIPTION
1. SERVER MODULE
   The server module keeps track of the clients and workers connected
   to the grid engine, the amount of workload given to each worker,
   and the data available in the clients. It also adds and removes
   grid nodes.
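   The server's bookkeeping can be sketched as a small registry class.
   This is an illustration under assumptions: the text does not state a
   scheduling policy, so the least-loaded rule used here, along with the
   class and method names, is hypothetical. It assumes JDK 8 or later.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the server's bookkeeping; names are illustrative.
class GridRegistrySketch {
    // worker id -> number of fragments currently assigned to that worker
    private final Map<String, Integer> load = new LinkedHashMap<>();

    // Add grid node: a newly connected worker starts with no load.
    synchronized void addNode(String workerId) {
        load.putIfAbsent(workerId, 0);
    }

    // Remove grid node.
    synchronized void removeNode(String workerId) {
        load.remove(workerId);
    }

    // Assigns one fragment to the least-loaded worker (assumed policy) and
    // returns its id, or null when no worker is connected.
    synchronized String assign() {
        String best = null;
        for (Map.Entry<String, Integer> e : load.entrySet()) {
            if (best == null || e.getValue() < load.get(best)) best = e.getKey();
        }
        if (best != null) load.merge(best, 1, Integer::sum);
        return best;
    }

    // Called when a worker returns a result for one fragment.
    synchronized void complete(String workerId) {
        load.computeIfPresent(workerId, (k, v) -> Math.max(0, v - 1));
    }

    synchronized int loadOf(String workerId) {
        return load.getOrDefault(workerId, 0);
    }
}
```

   Ties go to the worker that registered first, because the map keeps
   insertion order.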

2. CLIENT MODULE
   The client divides the given job into job fragments and passes them
   to the grid server for processing. It then aggregates the resulting
   job fragments returned from the grid server. The purpose of the
   client is to divide the job and aggregate the results.
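   The divide and aggregate steps can be sketched as two static methods.
   The job representation (a list of integers), the fragment numbering,
   and all names are assumptions for illustration; the project's real
   jobs are image-rendering tasks. The sketch assumes JDK 8 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the client's fragmenter and aggregator.
class ClientSketch {

    // Job Fragmenter: cuts the job into fragments of at most `size` items.
    static List<List<Integer>> fragment(List<Integer> job, int size) {
        if (size <= 0) throw new IllegalArgumentException("size must be positive");
        List<List<Integer>> fragments = new ArrayList<>();
        for (int i = 0; i < job.size(); i += size) {
            int end = Math.min(job.size(), i + size);
            fragments.add(new ArrayList<>(job.subList(i, end)));
        }
        return fragments;
    }

    // Job Results Aggregator: results may arrive in any order, keyed by
    // fragment number; sorting by key restores the original order.
    static List<Integer> aggregate(Map<Integer, List<Integer>> resultsByFragment) {
        Map<Integer, List<Integer>> ordered = new TreeMap<>(resultsByFragment);
        List<Integer> merged = new ArrayList<>();
        for (List<Integer> part : ordered.values()) merged.addAll(part);
        return merged;
    }
}
```

   Numbering the fragments is what lets the aggregator reassemble the
   job even when workers finish at different times.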

3. WORKER MODULE
   The worker processes the job given by the grid server and then sends
   the result back to the grid server. When the worker module runs, the
   server automatically identifies the worker and connects it to the
   grid engine.
   In this project the worker processes jobs such as rendering images
   using the POV-Ray software.
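   The worker's receive, process, and send cycle can be sketched with
   two queues. This is a minimal illustration under assumptions: the
   queues stand in for network connections to the grid server, the
   processing function is a placeholder for the POV-Ray rendering step,
   and the class name and the STOP marker are hypothetical. It assumes
   JDK 8 or later.

```java
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;

// Hypothetical sketch: queues stand in for the connections to the grid server.
class WorkerSketch implements Runnable {
    static final String STOP = "__STOP__"; // hypothetical shutdown marker

    private final BlockingQueue<String> requests;   // Job Requests Receiver side
    private final BlockingQueue<String> results;    // Job Results Sender side
    private final Function<String, String> process; // Job Processing Manager

    WorkerSketch(BlockingQueue<String> requests,
                 BlockingQueue<String> results,
                 Function<String, String> process) {
        this.requests = requests;
        this.results = results;
        this.process = process;
    }

    // Takes fragments until the STOP marker arrives, processing each one
    // and queueing its result for the server.
    @Override
    public void run() {
        try {
            while (true) {
                String fragment = requests.take();
                if (STOP.equals(fragment)) return;
                results.put(process.apply(fragment));
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt and exit
        }
    }
}
```

   A worker would normally run on its own thread, for example
   `new Thread(new WorkerSketch(in, out, String::toUpperCase)).start()`,
   with the upper-casing function replaced by the real rendering call.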

   GRID - CLIENT               GRID SERVER            GRID - WORKER

   Job Fragmenter
   Job Requests Sender         (job requests)         Job Requests Receiver
                                                      Job Processing Manager
   Job Results Receiver                               Job Results Sender
   Job Results Aggregator
