Basic Concepts of Information Retrieval - Edi Faizal
1. INFORMATION RETRIEVAL (IR)
Edi Faizal
21/483830/SPA/00795
COURSE “INFORMATION RETRIEVAL”; LECTURER: AINA MUSDHOLIFAH, S.Kom., M.Kom., Ph.D.
DOCTORAL PROGRAM (S3) IN COMPUTER SCIENCE 2021
2. Course Topics
A. Basic Concepts of IR (Information Retrieval)
1. IR vs Recommender System vs Search Engine
2. Types of IR
B. Preparing IR
1. Crawling
2. Indexing
3. NLP in IR
4. Text representation in IR
C. Classification methods in IR
D. Clustering methods in IR
E. Evaluation in IR
3. Part 1
BASIC CONCEPTS OF INFORMATION RETRIEVAL
4. Outline
Basic Concepts of IR (Information Retrieval)
• IR vs RecSys vs Search Engine
• Types of IR
6. Information Retrieval (IR)
Information retrieval (IR) is finding material (usually documents) of an
unstructured nature (usually text) that satisfies an information need from
within large collections (usually stored on computers).
(Manning et al., 2009)
An information need is the topic about which the user desires to know
more.
A document is relevant if the user perceives that it contains information
of value with respect to their personal information need.
What is a document? Web pages, emails, books, news stories, scholarly
papers, text messages, PowerPoint files, PDFs, forum postings, patents, IM
sessions, Tweets, question-and-answer postings, images, audio, video, etc.
A query is what the user conveys to the computer in an attempt to
communicate the information need.
9. Recommender System (RecSys/RSs)
A recommender system, or a recommendation system (sometimes replacing
'system' with a synonym such as platform or engine), is a subclass
of information filtering system that seeks to predict the "rating" or
"preference" a user would give to an item.
(Ricci et al., 2011)
RSs are software tools and techniques that provide suggestions for items likely to be of use to a user.
“Item” is the general term used to denote what the system recommends to users.
The suggestions relate to various decision-making processes, such as what items to
buy, what music to listen to, or what online news to read.
Designing and developing RSs is a multi-disciplinary effort that has benefited from
results obtained in various computer science fields, especially machine learning and
data mining, information retrieval, and human-computer interaction.
10. RecSys (cont…)
Primary models of RecSys:
• Prediction version of the problem: predict the rating value
for a user-item combination, assuming training data is
available that indicates users' preferences for items.
• Ranking version of the problem: recommend the top-k
items for a particular user, or determine the top-k users
to target for a particular item.
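The two problem versions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy `ratings` data, the function names `predict_rating` and `top_k_items`, and the item-mean baseline used for prediction are all hypothetical and are not taken from the slides.

```python
from statistics import mean

# Hypothetical toy training data: user -> {item: rating}.
ratings = {
    "ana": {"book": 5, "film": 3},
    "budi": {"book": 4, "song": 2},
    "citra": {"film": 4, "song": 5},
}

def predict_rating(user, item):
    """Prediction version: estimate the rating for one user-item pair.

    Assumed baseline: the item's mean observed rating, falling back to
    the global mean when nobody has rated the item.
    """
    if item in ratings.get(user, {}):
        return float(ratings[user][item])
    observed = [r[item] for r in ratings.values() if item in r]
    if observed:
        return mean(observed)
    return mean(v for r in ratings.values() for v in r.values())

def top_k_items(user, k):
    """Ranking version: the k unseen items with the highest predicted rating."""
    items = {i for r in ratings.values() for i in r}
    unseen = items - set(ratings.get(user, {}))
    # Sort by descending predicted rating, breaking ties by item name.
    return sorted(unseen, key=lambda i: (-predict_rating(user, i), i))[:k]
```

Calling `top_k_items("ana", 1)` ranks the items that "ana" has not rated yet. A real system would replace the baseline with a learned model, but the split between predicting a value and producing a ranked list stays the same.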
Operational and technical goals of RecSys:
• Relevance: recommend items that are relevant to the user.
• Novelty: recommendations help most when the recommended
item is something the user has not seen in the past.
• Serendipity: the recommended items are unexpected
(a lucky discovery) and not previously known to the user.
• Increasing recommendation diversity: provide variety among
the recommendations, which are typically suggested to the
user as a list of top-k items.
[Figure: Goal of RecSys. Problem versions: prediction, ranking. Goals: relevance, novelty, serendipity, increasing recommendation diversity.]
14. Search Engine
“A program that searches for and identifies items in a database that
correspond to keywords or characters specified by the user, used especially
for finding particular sites on the World Wide Web.”
One common application of IR is the search engine found on the
internet.
Users can find the web pages they need
through a search engine.
Another example of IR is a library information system.
18. Types of IR
Information retrieval models roughly fall into the following paradigms:
• Set-theoretic models: Boolean model, Extended Boolean model
• Algebraic models: Vector space model
• Latent models: Latent semantic indexing (LSI), Random indexing, Topic modelling for IR
• Probabilistic retrieval: classic probabilistic retrieval (Binary independence model, BM11, BM25)
• Language models for IR, Semantic ad-hoc retrieval, Embedding models
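As an illustration of the probabilistic family, here is a minimal sketch of BM25 scoring. It assumes whitespace tokenization, the commonly used defaults k1 = 1.5 and b = 0.75, and the non-negative IDF variant ln((N - n + 0.5) / (n + 0.5) + 1). The function name `bm25_scores` is hypothetical, and none of these choices are specified in the slides.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document for a whitespace-tokenized query."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Documents that share no term with the query score 0, and shorter documents that match more query terms score higher, which is the behaviour the length normalization is meant to give.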
20. Types of IR (cont…)
An information retrieval model comprises the following four key elements:
• D − Document Representation.
• Q − Query Representation.
• F − A framework to match and establish a relationship between D and Q.
• R (q, di) − A ranking function that determines the similarity between the
query and the document to display relevant information.
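The four elements can be made concrete with a small vector space sketch: D and Q are sparse TF-IDF vectors, F is the shared term-weighting scheme, and R(q, di) is cosine similarity. The weighting formula (raw term count times ln(N/df) + 1), the whitespace tokenization, and all function names are assumptions chosen for illustration, not something the slides prescribe.

```python
import math
from collections import Counter

def weigh(tokens, idf):
    """F: shared weighting for D and Q; terms unseen in the collection are dropped."""
    return {t: c * idf[t] for t, c in Counter(tokens).items() if t in idf}

def build_index(docs):
    """D: represent each document as a sparse TF-IDF vector (term -> weight)."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log(n_docs / count) + 1.0 for term, count in df.items()}
    return idf, [weigh(tokens, idf) for tokens in tokenized]

def cosine(u, v):
    """R(q, di): cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank(query, docs):
    """Q: weigh the query like a document, then sort documents by R(q, di)."""
    idf, doc_vectors = build_index(docs)
    q = weigh(query.lower().split(), idf)
    scored = [(cosine(q, d), i) for i, d in enumerate(doc_vectors)]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))
```

`rank` returns `(score, document index)` pairs, best match first. Swapping `cosine` for another function changes R without touching D or Q, which is the point of separating the four elements.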
There are three types of Information Retrieval (IR) models:
1. Classical IR Model
2. Non-Classical IR Model
3. Alternative IR Model
21. Types of IR (cont…)
Classical IR Model
It is designed upon basic mathematical concepts and is the most widely
used type of IR model. Classical information retrieval models can be
implemented with ease.
Examples include the Vector-space, Boolean, and Probabilistic IR models.
In the simplest case (the Boolean model), retrieval depends on whether
documents contain the terms defined in the query, and there is no ranking
or grading of any kind.
The different classical IR models take Document Representation, Query
representation, and Retrieval/Matching function into account in their
modelling.
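A minimal sketch of the Boolean model, the simplest of the classical models: documents are represented by the set of terms they contain, queries combine terms with AND, OR, and NOT, and the result is an unranked set of document ids. The inverted-index layout and the function names are assumptions made for this example.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def boolean_and(index, *terms):
    """Documents containing every term."""
    postings = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*postings) if postings else set()

def boolean_or(index, *terms):
    """Documents containing at least one term."""
    return set().union(*(index.get(t.lower(), set()) for t in terms))

def boolean_not(index, term, n_docs):
    """Documents that do not contain the term."""
    return set(range(n_docs)) - index.get(term.lower(), set())
```

Because the results are plain sets, every matching document is treated as equally relevant. Ranked models such as the vector space model exist to address exactly that limitation.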
22. Types of IR (cont…)
Non-Classical IR Model
They differ from classic models in that they are built upon propositional
logic. Examples of non-classical IR models include:
Information Logic,
Situation Theory, and
Interaction models.
23. Types of IR (cont…)
Alternative IR Model
These take the principles of the classical IR models and build on them to
create more functional models, such as:
Cluster model,
Alternative Set-Theoretic Models
Fuzzy Set model,
Latent Semantic Indexing (LSI) model,
Alternative Algebraic Models
Generalized Vector Space Model, etc.