This document discusses various conventional indexing techniques used to improve the speed of data retrieval from databases and data warehouses. It describes dense indexing, sparse indexing, and multi-level or B-tree indexing. It explains that indexing provides pointers to the location of data, avoiding the need to sequentially scan entire data files. The document also covers hashing-based indexes and compares B-tree indexes, which support range queries, to hashing indexes, which are best for exact match queries.
1. DWH-Ahsan Abdullah
Data Warehousing
Lecture-26
Need for Speed: Conventional Indexing Techniques
Virtual University of Pakistan
Ahsan Abdullah
Assoc. Prof. & Head
Center for Agro-Informatics Research
www.nu.edu.pk/cairindex.asp
National University of Computers & Emerging Sciences, Islamabad
Email: ahsan1010@yahoo.com
2. DWH-Ahsan Abdullah
Need For Indexing: Speed
Consider searching your hard disk using the Windows SEARCH command.
Search goes into directory hierarchies.
Takes about a minute, and there are only a few thousand files.
Assume a fast processor and (even more importantly) a fast hard disk.
Assume file size to be 5 KB.
Assume hard disk scan rate of a million files per second.
Resulting in scan rate of 5 GB per second.
Largest search engine indexes more than 8 billion pages.
At the above scan rate, 8,000 seconds are required to scan ALL pages (8 billion pages at a million pages per second).
This is just for one user!
No one is going to wait for 133 minutes, not even 133 seconds.
Hence, a sequential scan is simply not feasible.
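The back-of-envelope estimate above can be reproduced in a few lines. This is only a sketch using the slide's assumed figures (5 KB per file, one million files per second, 8 billion pages); the function and constant names are illustrative.

```python
def sequential_scan_seconds(num_files: int, files_per_second: float) -> float:
    """Time to read every file exactly once at a fixed scan rate."""
    return num_files / files_per_second

FILE_SIZE_KB = 5                 # slide assumption
FILES_PER_SECOND = 1_000_000     # slide assumption
PAGES = 8_000_000_000            # slide assumption

# 5 KB per file * 1,000,000 files/s = 5,000,000 KB/s = 5 GB/s (decimal units).
throughput_gb_per_s = FILE_SIZE_KB * FILES_PER_SECOND / 1_000_000
seconds = sequential_scan_seconds(PAGES, FILES_PER_SECOND)

print(f"{throughput_gb_per_s} GB/s, {seconds:.0f} s, {seconds / 60:.0f} min")
```

The result is sensitive to the assumed page size (1 KB pages would give 1,600 seconds at 5 GB/s), but in every case it is far too slow for interactive use.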
3. DWH-Ahsan Abdullah
Need For Indexing: Query Complexity
How many customers do I have in Karachi?
How many customers in Karachi made calls during April?
How many customers in Karachi made calls to Multan during April?
How many customers in Karachi made calls to Multan during April using a particular calling package?
4. DWH-Ahsan Abdullah
Need For Indexing: I/O Bottleneck
Throwing hardware at the problem speeds up only the CPU-intensive tasks.
The problem is I/O, which does not scale up easily.
Putting the entire table in RAM is very expensive.
Therefore, index!
5. DWH-Ahsan Abdullah
Indexing Concept
Purely physical concept, nothing to do with the logical model.
Invisible to the end user (programmer); the optimizer chooses it; it affects only the speed, not the answer.
In the library analogy, what is the time complexity (the average time taken) to find a book?
Use a card catalog that is sorted and organized in several ways, e.g. by author, topic, or title.
It takes a little extra time to check the catalog first, but the catalog "gives" a pointer to the shelf and the row where the book is located.
The catalog has no data about the book, just an efficient way of searching.
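The library analogy can be sketched in code. The shelf layout, titles, and function names below are hypothetical; the point is only that the catalog stores pointers (shelf, row), not book contents, and that it replaces a walk along every shelf with a single lookup.

```python
# Hypothetical shelf layout: (shelf, row) -> book title.
shelves = {
    (1, 1): "Data Warehousing",
    (1, 2): "Operating Systems",
    (2, 1): "Compilers",
    (2, 2): "Computer Networks",
}

def find_by_walking(title):
    """Sequential scan: inspect every shelf position until the title turns up."""
    steps = 0
    for location, book in shelves.items():
        steps += 1
        if book == title:
            return location, steps
    return None, steps

# The catalog maps title -> location only; it carries none of the book's content.
catalog = {book: location for location, book in shelves.items()}

def find_by_catalog(title):
    """One catalog lookup yields a pointer to the shelf and row."""
    return catalog.get(title)

print(find_by_walking("Computer Networks"))
print(find_by_catalog("Computer Networks"))
```

Both routes return the same location, which mirrors the point that an index changes only the speed, not the answer.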
9. DWH-Ahsan Abdullah
Dense Index: Advantages & Disadvantages
Advantage:
A dense index, if it fits in memory, is very efficient at locating a record given a key.
Disadvantage:
A dense index that is too big to fit in memory is expensive to use when finding a record given its key.
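As a minimal sketch of the idea, assuming the usual definition of a dense index (one index entry per record) and a hypothetical in-memory "data file", a lookup is a binary search over the index followed by one record access:

```python
from bisect import bisect_left

# Hypothetical data file: records addressed by their position.
data_file = [(10, "rec-10"), (20, "rec-20"), (30, "rec-30"), (40, "rec-40")]

# Dense index: one (key, position) entry for EVERY record, kept sorted by key.
dense_index = sorted((key, pos) for pos, (key, _) in enumerate(data_file))
index_keys = [key for key, _ in dense_index]

def dense_lookup(key):
    """Binary-search the index; follow the pointer only if the key is present."""
    i = bisect_left(index_keys, key)
    if i < len(index_keys) and index_keys[i] == key:
        return data_file[dense_index[i][1]]
    return None  # absence is decided from the index alone, without touching data

print(dense_lookup(30))
print(dense_lookup(25))
```

Because the index has as many entries as the data file has records, its size grows with the table, which is why fitting it in memory becomes the deciding factor.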
10. DWH-Ahsan Abdullah
Sparse Index: Concept
Normally keeps only one key per data block.
Some keys in the data file will not have an entry in the index file.
[Figure: a sparse index holding keys 10, 30, 50, ..., 230, pointing into a data file with records 10, 20, 30, ..., 100.]
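A sparse-index lookup can be sketched as follows. The block layout (two keys per block) is an assumption chosen to match the keys visible in the figure; the names are illustrative.

```python
from bisect import bisect_right

# Hypothetical data file: sorted keys grouped into blocks of two.
blocks = [[10, 20], [30, 40], [50, 60], [70, 80], [90, 100]]

# Sparse index: only the first key of each block, in block order.
sparse_index = [block[0] for block in blocks]   # [10, 30, 50, 70, 90]

def sparse_lookup(key):
    """Return (block_number, found) for the given key."""
    # Find the last index entry whose key is <= the search key.
    i = bisect_right(sparse_index, key) - 1
    if i < 0:
        return None, False          # smaller than every indexed key
    return i, key in blocks[i]      # the block itself must still be scanned

print(sparse_lookup(40))   # key without its own index entry
print(sparse_lookup(45))   # key that is absent from the data file
```

Key 40 has no index entry of its own, yet it is found by scanning the block that the entry for 30 points to; an absent key can only be ruled out after reading that block.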
11. DWH-Ahsan Abdullah
Sparse Index: Advantages & Disadvantages
Advantage:
A sparse index uses less space at the expense of somewhat more time to find a record given its key.
Supports multi-level indexing structures.
Disadvantage:
Locating a record given a key has different performance for different key values.
13. DWH-Ahsan Abdullah
B-tree Indexing: Concept
Can be seen as a general form of multi-level indexes.
Generalizes the usual (binary) search trees (BST).
Allows efficient and fast exploration at the expense of using slightly more space.
Popular variant: B+-tree
Supports queries such as the following more efficiently:
SELECT * FROM R WHERE a = 11
SELECT * FROM R WHERE 0 <= b AND b < 42
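One way to observe this is with SQLite, whose indexes are B-trees. The sketch below uses Python's standard `sqlite3` module; the table contents and the index names `idx_r_a` and `idx_r_b` are assumptions made for illustration, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R (a INTEGER, b INTEGER)")
conn.executemany("INSERT INTO R VALUES (?, ?)", [(i, i * 2) for i in range(100)])
conn.execute("CREATE INDEX idx_r_a ON R(a)")   # B-tree index on a
conn.execute("CREATE INDEX idx_r_b ON R(b)")   # B-tree index on b

def plan(sql):
    """Return the planner's description; the detail text is the last column."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

point_plan = plan("SELECT * FROM R WHERE a = 11")
range_plan = plan("SELECT * FROM R WHERE 0 <= b AND b < 42")
range_rows = conn.execute(
    "SELECT a FROM R WHERE 0 <= b AND b < 42 ORDER BY a"
).fetchall()

print(point_plan)
print(range_plan)
print(len(range_rows))
```

The plan for the point query typically reports a search using `idx_r_a` rather than a scan of the whole table, and the range query is typically served the same way from `idx_r_b`.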
15. DWH-Ahsan Abdullah
B-tree Indexing: Limitations
Not well suited when a table is large and has few unique values.
Capitalization is not programmatically enforced (meaning case-sensitivity does matter and "FLASHMAN" is different from "Flashman").
Outcome varies with inter-character spaces.
A noun spelled differently will give different results.
Insertion can be very expensive.
16. DWH-Ahsan Abdullah
B-tree Indexing: Limitations Example
Given that MOHAMMED is the most common first name in Pakistan, a 5-million row Customers table would produce many screens of matching rows for MOHAMMED AHMAD, yet would skip potential matching values such as the following:
VALUE MISSED         REASON MISSED
Mohammed Ahmad       Case sensitive
MOHAMMED AHMED       AHMED versus AHMAD
MOHAMMED  AHMAD      Extra space between names
MOHAMMED AHMAD DR    DR after AHMAD
MOHAMMAD AHMAD       Alternative spelling of MOHAMMAD
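The misses listed above can be reproduced with a small exact-match index. The sketch below also tries one common mitigation, normalizing case and whitespace before indexing; that step is an illustration and is not proposed in the slides. All names are hypothetical.

```python
import re

customers = [
    "MOHAMMED AHMAD",
    "Mohammed Ahmad",       # different case
    "MOHAMMED AHMED",       # AHMED versus AHMAD
    "MOHAMMED  AHMAD",      # extra space between names
    "MOHAMMED AHMAD DR",    # DR after AHMAD
    "MOHAMMAD AHMAD",       # alternative spelling
]

def build_index(names, key_of=lambda s: s):
    """Exact-match index: key -> list of row positions."""
    index = {}
    for rowid, name in enumerate(names):
        index.setdefault(key_of(name), []).append(rowid)
    return index

def normalize(name):
    """Collapse runs of whitespace and fold to upper case."""
    return re.sub(r"\s+", " ", name.strip()).upper()

raw_index = build_index(customers)
norm_index = build_index(customers, key_of=normalize)

raw_hits = raw_index.get("MOHAMMED AHMAD", [])
norm_hits = norm_index.get(normalize("MOHAMMED AHMAD"), [])

print(raw_hits)
print(norm_hits)
```

Normalization recovers the case and spacing variants, but the spelling variants and the trailing DR are still missed, since an index on the stored value can only match the key it is given.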
17. DWH-Ahsan Abdullah
Hash Based Indexing
You may recall that in internal memory, hashing can be used to quickly locate a specific key.
The same technique can be used on external memory.
However, the advantage over search trees is smaller in external search than internal. WHY?
Because part of the search tree can be brought into main memory.
18. DWH-Ahsan Abdullah
Hash Based Indexing: Concept
In contrast to B-tree indexing, hash-based indexes do not (typically) keep index values in sorted order.
An index entry is found by hashing the index value, which requires an exact match.
SELECT * FROM Customers WHERE AccttNo = 110240
Index entries are kept in hash-organized tables rather than B-tree structures.
An index entry contains the ROWID values for each row corresponding to the index value.
Remember that few numbers in real life are useful for hashing.
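A minimal sketch of such an index, assuming a fixed bucket count and Python's built-in `hash` as the hash function (both are illustrative choices, as are the sample rows):

```python
from collections import defaultdict

# Hypothetical Customers rows: ROWID -> (AccttNo, name).
rows = {
    0: (110240, "Ahmad"),
    1: (110241, "Akram"),
    2: (110240, "Numan"),   # same index value as ROWID 0
}

NUM_BUCKETS = 8  # assumed bucket count

def h(value):
    """Map an index value to a bucket number."""
    return hash(value) % NUM_BUCKETS

# Each bucket maps index value -> list of ROWIDs holding that value.
buckets = [defaultdict(list) for _ in range(NUM_BUCKETS)]
for rowid, (acct_no, _) in rows.items():
    buckets[h(acct_no)][acct_no].append(rowid)

def lookup(acct_no):
    """Exact match only: hash the value, then read a single bucket."""
    return buckets[h(acct_no)].get(acct_no, [])

print(lookup(110240))
print(lookup(999999))
```

Nothing in the bucket layout preserves key order, which is why the lookup needs the complete, exact index value.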
19. DWH-Ahsan Abdullah
Hashing as Primary Index
[Figure: key → h(key) → disk block holding the records.]
Note on terminology: The word "indexing" is often used synonymously with "B-tree indexing".
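When hashing serves as the primary index, the hash value decides where the record itself is stored. The sketch below assumes five disk blocks and a simple modulo hash; the sample keys and names are borrowed from the Students relation used later in the lecture, and everything else is illustrative.

```python
NUM_BLOCKS = 5  # assumed number of disk blocks

def h(key):
    """Hypothetical hash function: key modulo the number of blocks."""
    return key % NUM_BLOCKS

# Each "disk block" holds the records themselves, not pointers to them.
disk = [[] for _ in range(NUM_BLOCKS)]

def insert(key, record):
    disk[h(key)].append((key, record))

def fetch(key):
    """Read exactly one block, then scan inside it for the key."""
    return [record for k, record in disk[h(key)] if k == key]

for key, name in [(123, "AHMAD"), (567, "Akram"), (999, "Numan")]:
    insert(key, name)

print(fetch(567))
print(disk)
```

No separate index structure exists here: computing h(key) is the whole lookup, so only one block is read per fetch.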
20. DWH-Ahsan Abdullah
Hashing as Secondary Index
[Figure: key → h(key) → index entry → record; "Indexing the Index".]
Can always be transformed to a secondary index using indirection, as above.
21. DWH-Ahsan Abdullah
B-tree vs. Hash Indexes
Indexing (using B-trees) is good for range searches, e.g.:
SELECT * FROM R WHERE A > 5
Hashing is good for match-based searches, e.g.:
SELECT * FROM R WHERE A = 5
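The contrast can be sketched with two small in-memory structures. A sorted list stands in for the ordered leaf level of a B-tree (it is not a real B-tree), and a dictionary stands in for the hash index; the column values are hypothetical.

```python
from bisect import bisect_right

values = [3, 9, 5, 1, 7, 5]   # hypothetical column A, position = ROWID

# Ordered index: (value, rowid) pairs sorted by value.
sorted_index = sorted((v, rowid) for rowid, v in enumerate(values))
sorted_keys = [v for v, _ in sorted_index]

# Hash index: value -> list of ROWIDs, with no ordering among keys.
hash_index = {}
for rowid, v in enumerate(values):
    hash_index.setdefault(v, []).append(rowid)

def range_gt(bound):
    """A > bound: one binary search, then a contiguous slice of the index."""
    start = bisect_right(sorted_keys, bound)
    return [rowid for _, rowid in sorted_index[start:]]

def exact(value):
    """A = value: a single hash probe."""
    return hash_index.get(value, [])

def range_gt_via_hash(bound):
    """Without order, every distinct key must be examined."""
    return sorted(r for v, rs in hash_index.items() if v > bound for r in rs)

print(range_gt(5))
print(exact(5))
print(range_gt_via_hash(5))
```

The hash index can still answer the range query, but only by visiting every key, which is the behaviour an ordered index avoids.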
22. DWH-Ahsan Abdullah
Primary Key vs. Primary Index
Relation Students
Name   ID   dept
AHMAD  123  CS
Akram  567  EE
Numan  999  CS
Primary Key & Primary Index:
PK is ALWAYS unique.
PI can be unique, but does not have to be.
In a DSS environment, very few queries are PK based.
23. DWH-Ahsan Abdullah
Primary Indexing: Criterion
Primary index selection criteria:
Common join and retrieval key.
Can be unique (UPI) or non-unique (NUPI).
Limits on NUPI.
Only one primary index per table (for hash-based file system).
24. DWH-Ahsan Abdullah
Primary Indexing Criteria: Example
What should be the primary index of the call table for a large telecom company?
Call Table
call_id        decimal (15,0)  NOT NULL
caller_no      decimal (10,0)  NOT NULL
call_duration  decimal (15,2)  NOT NULL
call_dt        date            NOT NULL
called_no      decimal (15,0)  NOT NULL
No simple answer!!
25. DWH-Ahsan Abdullah
Primary Indexing
Almost all joins and retrievals will occur through the caller_no foreign key.
Use caller_no as a NUPI.
In case of non-uniform distribution on caller_no, or if some phone numbers have a very large number of outgoing calls (e.g., an institutional number could easily have several thousand calls):
Use call_id as UPI for good data distribution.
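The distribution trade-off can be made concrete with a small simulation. Everything here is an assumption for illustration: four storage units, a modulo hash, and 100 call rows of which 90 come from a single institutional number.

```python
from collections import Counter

NUM_UNITS = 4  # assumed number of storage units the hash spreads rows over

# Hypothetical call rows as (call_id, caller_no); caller 1000 is the
# institutional number responsible for 90 of the 100 calls.
calls = [
    (call_id, 1000 if call_id < 90 else 2000 + call_id)
    for call_id in range(100)
]

def distribution(key_of):
    """Count how many rows land on each unit when hashing the chosen key."""
    counts = Counter(key_of(row) % NUM_UNITS for row in calls)
    return [counts.get(unit, 0) for unit in range(NUM_UNITS)]

by_caller_no = distribution(lambda row: row[1])   # NUPI on caller_no
by_call_id = distribution(lambda row: row[0])     # UPI on call_id

print(by_caller_no)
print(by_call_id)
```

With caller_no as the key, nearly every row piles onto one unit; with the unique call_id the rows spread evenly, at the cost of no longer co-locating all calls of one caller.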
26. DWH-Ahsan Abdullah
Primary Indexing
For a hash-based file system, the primary index is free!
No storage cost.
No index build required.
OLTP databases use a page-based file system and therefore do not deliver this performance advantage.
Editor's Notes
As queries grow more complex, involving table joins and aggregates, the processing time increases correspondingly. So the problem is one of knowledge and time.
The point of using an index is to increase the speed and efficiency of searches of the database. Without some sort of index, a user’s query must sequentially scan the database, finding the records matching the parameters in the WHERE clause.