Ads applications of ads

APPLICATIONS OF ADVANCED
DATA STRUCTURES

1 9/3/2012

Index-Sequential File Organization

 Index-sequential files are files (holding data
records) that are ordered sequentially on a
search key.

 The main disadvantage is that the performance of
lookups and sequential scans degrades as the file grows.

 The degradation can be fixed by reorganizing the
file. Reorganization requires a lot of overhead space, so
frequent reorganization is undesirable.


 An index speeds up certain queries or searches
because it stores information about where data is
stored on the disk. The index points directly to the
location of a record on the disk and can be used to
avoid searching a large file.
 The DBMS represents data as records in a table.
However, a disk stores data in blocks, or pages.
Many records may be placed in one block, or one
record may span many blocks.
 The computer can only transfer one block at a time
between main memory and the disk.
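
The points above can be sketched in Python. This is a minimal illustration, not how any particular DBMS works: it assumes records sorted by key and packed into fixed-size blocks, with one index entry per block. The names `BLOCK_SIZE`, `make_blocks`, `build_index` and `lookup` are hypothetical, and the row values are made-up placeholders.

```python
from bisect import bisect_right

BLOCK_SIZE = 3  # assumed number of records per block (illustrative only)

def make_blocks(records):
    """Pack records, sorted by key, into fixed-size blocks."""
    ordered = sorted(records)
    return [ordered[i:i + BLOCK_SIZE] for i in range(0, len(ordered), BLOCK_SIZE)]

def build_index(blocks):
    """One index entry per block: the smallest key stored in that block."""
    return [block[0][0] for block in blocks]

def lookup(index, blocks, key):
    """Use the index to choose one block, then search only that block."""
    pos = bisect_right(index, key) - 1
    if pos < 0:
        return None  # key is smaller than every indexed key
    for k, value in blocks[pos]:  # corresponds to a single block transfer
        if k == key:
            return value
    return None

# Placeholder records: (key, row) pairs with made-up row contents.
records = [(k, f"row-{k}") for k in (7409, 2014, 6144, 3718, 8193, 4161, 7917)]
blocks = make_blocks(records)
index = build_index(blocks)
print(lookup(index, blocks, 7409))
```

Only the block selected through the index is read, which is the saving the index is meant to provide.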


 The problem for the DBMS is to decide in which
block each record should be placed and what
information should be stored in addition to the record
to allow the record to be retrieved easily.


 But when the number of indexed values is large, the
index will not fit in one block. Therefore, the contents
of the index must be placed in two or more blocks.
 The solution to this problem is to create an index of
an index. That is, the single index is split into a
number of blocks and a new index is created that
indexes each block.
 The B+-tree structure is such an index of an index,
called a multi-level index.
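
As a rough sketch of the "index of an index" idea, the Python below builds index levels bottom-up until the top level fits in one block. It is a static illustration under stated assumptions, not a B+-tree: `FANOUT`, `build_levels` and `find_block` are hypothetical names, and every block except possibly the last in each level is assumed to be full.

```python
from bisect import bisect_right

FANOUT = 4  # assumed number of index entries that fit in one block

def build_levels(keys):
    """Build index levels bottom-up until the top level fits in one block.

    levels[0] is the sorted key list split into blocks; each higher level
    holds the first key of every block in the level below it.
    """
    level = [keys[i:i + FANOUT] for i in range(0, len(keys), FANOUT)]
    levels = [level]
    while len(level) > 1:
        firsts = [block[0] for block in level]
        level = [firsts[i:i + FANOUT] for i in range(0, len(firsts), FANOUT)]
        levels.append(level)
    return levels

def find_block(levels, key):
    """Descend from the top level; return the position of the bottom-level block."""
    pos = 0
    for level in reversed(levels[1:]):
        block = level[pos]
        slot = max(bisect_right(block, key) - 1, 0)
        pos = pos * FANOUT + slot
    return pos

levels = build_levels(list(range(0, 200, 5)))  # 40 sample keys
print(len(levels), find_block(levels, 45))
```

With 40 keys and four entries per block, three levels are needed, and a lookup reads one block per level.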


Dynamic Multilevel Indexes Using B-Trees
and B+-Trees
 Because of the insertion and deletion problem, most
multi-level indexes use B-tree or B+-tree data
structures, which leave space in each tree node (disk
block) to allow for new index entries.

 These data structures are variations of search trees
that allow efficient insertion and deletion of search
values.

 In B-tree and B+-tree data structures, each node
corresponds to a disk block.

 Each node is kept between half full and completely full.


Dynamic Multilevel Indexes Using B-Trees
and B+-Trees
 An insertion into a node that is not full is quite
efficient; if a node is full, the insertion causes a split
into two nodes.
 Splitting may propagate to other tree levels.
 A deletion is quite efficient if a node does not become
less than half full.
 If a deletion causes a node to become less than half
full, it must be merged with neighboring nodes.
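
The insertion case can be sketched for a single leaf node as follows. This is a simplified illustration that covers only the leaf level: `CAPACITY` and `insert_into_leaf` are hypothetical names, and propagating the separator into the parent (which may split in turn) is left out.

```python
from bisect import insort

CAPACITY = 4  # assumed maximum number of keys per leaf (one disk block)

def insert_into_leaf(leaf, key):
    """Insert key into a sorted leaf.

    Returns (left, None, None) when the leaf still fits in one block, or
    (left, right, separator) when the leaf overflowed and was split.
    The separator is the smallest key of the right leaf; a full B+-tree
    would insert it into the parent node.
    """
    keys = list(leaf)
    insort(keys, key)
    if len(keys) <= CAPACITY:
        return keys, None, None
    mid = (len(keys) + 1) // 2  # left leaf keeps the larger half
    left, right = keys[:mid], keys[mid:]
    return left, right, right[0]

print(insert_into_leaf([2014, 2019, 3147], 3718))
print(insert_into_leaf([2014, 2019, 3147, 3718], 3904))
```

After a split, both leaves hold at least half of `CAPACITY`, which matches the rule that nodes stay between half full and completely full.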


Figure: The nodes of a B+-tree. (a) Internal node of a B+-tree with
q - 1 search values. (b) Leaf node of a B+-tree with q - 1 search
values and q - 1 data pointers.


Figure: index set above six data blocks.
Index set, root: EMBRY
Index set, second level: BOLEN CAMP FABER FOLKS
Data blocks 1-6: ADAMS-BERNE, BOLEN-CAGE, CAMP-DUTTON, EMBRY-EVANS, FABER-FOLK, FOLKS-GADDIS


B+ Tree Result
First level (root level): 6144
Second level: 3718 4161 7409 7422 7917
Third level (leaf level): 2014 2019 3147 3718 3904 4161 4162 6144 7329 7409 7418 7422 7602 7917 8003 8193

Example table row (empno 7409):
empno  lastname  job    …
7409   vicky     CLERK  …


Advantages of using B+-trees in databases
 high fanout / low depth

 simple and consistent block storage

 high key density


ABOUT GOOGLE SEARCH:
 Normally in the Google search:
 Every word matters (except ‘stop words’). All the words that
you type in the search box are used by Google.

 Word order also matters: the first word entered dictates
which results are shown first.

 The search is case-insensitive, i.e. Google does not
distinguish between CAPITAL and capital.

 Generally, punctuation and special characters such as ~, !, @, #, $,
(, ), {, }, [, ] are ignored.
 Google ignores some words (stop words) such as I, a,
about, an, are, and the.
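
The rules listed above can be illustrated with a small Python sketch. It only mirrors the slide's rules and is not Google's implementation; `STOP_WORDS` holds just the examples named on the slide, and `normalize_query` is a hypothetical helper.

```python
import string

# Only the stop-word examples named on the slide; a real list would be longer.
STOP_WORDS = {"i", "a", "about", "an", "are", "the"}

def normalize_query(query):
    """Lowercase the query, replace punctuation with spaces, drop stop words."""
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    words = query.lower().translate(table).split()
    return [w for w in words if w not in STOP_WORDS]

print(normalize_query("The CAPITAL of (France)!"))
```

Under these rules, "CAPITAL" and "capital" produce the same normalized query.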

EARLIER GOOGLE SEARCH:
 We had to:
 Use the words that we think are most likely to appear on
the page.

 Use descriptive words. The accuracy of results depends
on the uniqueness of the description.

 Use as few words as possible, since a combination of
many words may limit the search results.


“Google took a much more active role in
leading searchers to not just the answer,
but also the question itself.”

ABOUT GOOGLE INSTANT:
 When the user begins typing their query into the
Google search box, Google will display a short list of
predicted queries that are related to the letters the
user has started to type in.
 As the user types, these predictions may change
depending on the characters being entered.
 Not only the suggestions but also the search results
keep changing as the user enters queries, without
pressing the Enter key.
 15 new technologies contribute to Google Instant
functionality.


DATA STRUCTURE IN GOOGLE INSTANT (GI):

[Figure: row of slots labelled a b c d e f g … z]


This is a trie for the keys “A”, “to”, “tea”, “ted”, “ten”, “i”,
“in”, and “inn”.


 When we need to autocomplete the starting
characters “te”, the output should be tea, ted and
ten.
 Instead of checking a regular-expression match against
every word in the database, the trie makes use of
transitions.
 The first character is t. From the root, we follow the
transition for 't' to reach the node 't'; then, at node 't',
we follow the transition for the next character, 'e'.
 At that point, we follow all paths from node 'e' to
the leaf nodes, which gives the paths t->e->a,
t->e->d and t->e->n.
This is the basic algorithm behind implementing an
efficient autocomplete.
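
The steps above can be sketched in Python using the keys from the trie slide. This is a minimal illustration of the technique, not Google's implementation; `TrieNode`, `insert` and `autocomplete` are hypothetical names.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # character -> TrieNode
        self.is_key = False  # True if a stored key ends at this node

def insert(root, word):
    """Add one key to the trie, creating nodes as needed."""
    node = root
    for ch in word:
        node = node.children.setdefault(ch, TrieNode())
    node.is_key = True

def autocomplete(root, prefix):
    """Return all stored keys that start with prefix, in sorted order."""
    node = root
    for ch in prefix:  # follow one transition per character
        node = node.children.get(ch)
        if node is None:
            return []  # no stored key has this prefix
    results = []
    stack = [(node, prefix)]
    while stack:  # walk every path below the prefix node
        current, text = stack.pop()
        if current.is_key:
            results.append(text)
        for ch, child in current.children.items():
            stack.append((child, text + ch))
    return sorted(results)

root = TrieNode()
for key in ["A", "to", "tea", "ted", "ten", "i", "in", "inn"]:
    insert(root, key)

print(autocomplete(root, "te"))  # ['tea', 'ted', 'ten']
```

The lookup cost depends on the prefix length and the number of matching keys, not on the total number of words stored. The `is_key` flag is needed because a key such as "in" can end at an internal node rather than a leaf.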

FASTER SEARCHES:
 Before Google Instant, the typical searcher took
more than 9 seconds to enter a search term.

 We can see many examples of searches that take
30-90 seconds to type.

 Using Google Instant can save 2-5 seconds per
search.

 If everyone uses Google Instant globally, Google
estimates that this will save more than 3.5 billion
seconds a day (that's 11 hours saved every second).
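
The "11 hours saved every second" figure follows from the 3.5 billion seconds quoted above. A quick check in Python (the variable names are illustrative):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day
saved_per_day = 3.5e9           # seconds saved per day, as quoted above

saved_per_second = saved_per_day / SECONDS_PER_DAY  # seconds saved each second
hours_per_second = saved_per_second / 3600          # converted to hours

print(round(hours_per_second, 2))  # 11.25
```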

SMARTER PREDICTIONS:

 Even when we don't know exactly what we are
looking for, predictions guide our search.

 The top prediction is shown in grey text in the
search box, so that we can stop typing as soon
as we see what we need.


INSTANT RESULTS:

 As we start typing the query, the results appear
at once.

 Before GI, we had to type a full search term,
hit Enter, and hope for the right result.

 Now the results appear instantly, helping us reach
what we are searching for faster and more
easily.

 It's really amazing that GI goes through more than
6000 words per second.

CONTRIBUTING FACTORS:

 Query volume.

 Geography of searchers.

 Keywords or phrases mentioned.

