Improving RDF Query Performance using
In-Memory Virtual Columns in Oracle Database
Eugene Inseok Chong Matthew Perry Souripriya Das
New England Development Center
Oracle
Nashua, NH, USA
firstname.lastname@oracle.com
Abstract— Many RDF Knowledge Graph Stores use IDs to
represent triples to save storage and for ease of maintenance.
Oracle is no exception. While this design is good for a small
footprint on disk, it incurs overhead in query processing, as it
requires joins with the value table to return results or process
aggregate, filter, or order-by queries. The overhead becomes especially
problematic as the result size or the number of projected variables grows.
Depending on the query, the value table join can take up most of the query
processing time. In this paper, we propose using in-memory virtual columns
to avoid value table joins. This approach does not increase the footprint
on disk, yet it still eliminates the value table joins. The idea is to materialize the
values only in memory and utilize the compression and vector
processing that come with Oracle Database In-Memory
technology. Typically, the value table in RDF is small compared to
the triples table. Therefore, its footprint is manageable in memory,
especially with compression in columnar format. The same
mechanism can be applied to any application where there exists a
one-to-one mapping between an ID and its value, such as data
warehousing or data marts. The mechanism has been
implemented in Oracle 18c. Experimental results using the
LUBM1000 benchmark show up to two orders of magnitude query
performance improvement.
Keywords—Virtual Columns.
I. INTRODUCTION
Resource Description Framework (RDF) [1] and its query
language, SPARQL [2], have drawn a lot of attention due to the
capabilities of representing and querying knowledge graphs,
which have many applications such as linked data, intelligent
search, and logical reasoning. While it is straightforward to
formulate a query using SPARQL, the query processing is
challenging, as it frequently requires a large number of self-
joins. Many researchers have investigated the problem of how
to efficiently process RDF queries [4,5,6,7]. Because URIs can be long,
self-joins are often executed over compact IDs to reduce the size of
intermediate join results and improve query
performance. Therefore, the RDF triples table is often
normalized so that the triples table contains IDs and the value
table (also known as dictionary table or symbol table) is kept
separately. Many RDF stores [8,13,15,19] adopt this approach.
In Oracle, the underlying RDF tables are normalized into a
triples table where subject, predicate, object and named graph
IDs are stored, and a value table where all relevant information
for those IDs is stored, such as value type, literal type, and string
values [3].
Typically, RDF queries are processed using only IDs to get
the self-join results, and then the IDs are joined with the value
table to get the final results. However, when the query needs to
process aggregate, filter, or order-by expressions, or when the
query returns a large number of results to users, joining the
triples table with its value table could incur a significant
performance hit because the join is performed for every variable
that is returned or used in an expression. As more variables are
returned, more joins are performed. For some queries, the value
table join could consume more than 90% of the query processing
time. This behavior was observed when we experimented with
customer data by measuring the difference in execution time
before the value table join and after the join. We could remove
these joins by completely de-normalizing values, but that would incur
large persistent storage requirements as well as the anomalies
associated with data redundancy, such as integrity and consistency
problems. It would also inflate intermediate join results
because URI and literal values would need to be carried through
all the join operations.
To eliminate the join between the triples table and the value
table, we propose in-memory materialization of values by
utilizing in-memory virtual columns. The in-memory virtual
columns can speed up query performance without increasing
disk storage requirements. The values corresponding to IDs in
the triples table are materialized in memory. All these values
come from the value table, and they contain many duplicates
because the same IDs appear in many places in the
triples table. These values are materialized in columnar format,
and the same values are compressed away. Given a triples/quad
table (GID, SID, PID, OID), where GID is a graph ID, SID
subject ID, PID predicate ID, and OID object ID, we materialize
in memory the virtual columns using functions, GetVal(GID),
GetVal(SID), GetVal(PID), and GetVal(OID), where GetVal
(ID) does a lookup in the values table and returns the
corresponding RDF resource value.
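The normalized layout and the GetVal lookup can be sketched with a toy dictionary in Python (the table contents mirror the running example used later in the paper; the names and structures are illustrative, not Oracle internals):

```python
# Toy model of the normalized RDF layout: a quads table of IDs plus a
# value table mapping each ID to its lexical RDF value (illustrative only).
value_table = {
    101: "<ns:g1>", 201: "<ns:s1>", 302: "<ns:s2>",
    402: "<ns:p1>", 403: "<ns:p2>",
    611: '"100"^^xsd:decimal', 723: '"2000-01-02T01:00:01"^^xsd:dateTime',
}
quads = [(101, 201, 402, 611), (101, 302, 403, 723)]

def get_val(i_id):
    """Analogue of GetVal(ID): look up the RDF resource value for an ID."""
    return value_table[i_id]

# Materializing the virtual columns amounts to applying get_val to every
# ID column of every quad.
materialized = [tuple(get_val(c) for c in row) for row in quads]
print(materialized[0])  # ('<ns:g1>', '<ns:s1>', '<ns:p1>', '"100"^^xsd:decimal')
```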
Our early prototype [4] showed promising results, hence we
have implemented the approach in Oracle Database 18c. Our
experiments on LUBM [9] benchmarks show up to two orders
of magnitude query performance improvement.

2019 IEEE 35th International Conference on Data Engineering (ICDE). 2375-026X/19/$31.00 ©2019 IEEE. DOI 10.1109/ICDE.2019.00197

The RDF in-memory virtual column approach can be applied to other
application areas such as data mart/data warehousing
star/snowflake schema queries where frequent joins with
dimension tables are common so that the join between the fact
table and dimension table can be eliminated. In fact, the
approach can be applied to any applications as long as there is a
one-to-one mapping between the ID and its value. For example,
the following snowflake schema query [11] from Wikipedia can
be reduced to one without joins by defining in-memory virtual
columns Year, Country, Brand, Product_Category in the
Fact_Sales table:
SELECT B.Brand, G.Country, SUM(F.Units_Sold)
FROM Fact_Sales F
INNER JOIN Dim_Date D ON F.Date_Id = D.Id
INNER JOIN Dim_Store S ON F.Store_Id = S.Id
INNER JOIN Dim_Geography G ON S.Geography_Id = G.Id
INNER JOIN Dim_Product P ON F.Product_Id = P.Id
INNER JOIN Dim_Brand B ON P.Brand_Id = B.Id
INNER JOIN Dim_Product_Category C ON P.Product_Category_Id = C.Id
WHERE D.Year = 1997 AND C.Product_Category = 'tv'
GROUP BY B.Brand, G.Country;
This query is translated as follows using the virtual columns:
SELECT F.Brand, F.Country, SUM(F.Units_Sold)
FROM Fact_Sales F
WHERE F.Year = 1997 AND F.Product_Category = 'tv'
GROUP BY F.Brand, F.Country;
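As a sanity check of the rewrite, a small sketch using SQLite and hypothetical sample data can confirm that the join query and the join-free form agree; a view with correlated subqueries stands in here for the in-memory virtual columns:

```python
import sqlite3

# Illustrative schema and data (not from the paper) for the snowflake example.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
CREATE TABLE Dim_Date(Id INTEGER PRIMARY KEY, Year INTEGER);
CREATE TABLE Dim_Geography(Id INTEGER PRIMARY KEY, Country TEXT);
CREATE TABLE Dim_Store(Id INTEGER PRIMARY KEY, Geography_Id INTEGER);
CREATE TABLE Dim_Brand(Id INTEGER PRIMARY KEY, Brand TEXT);
CREATE TABLE Dim_Product_Category(Id INTEGER PRIMARY KEY, Product_Category TEXT);
CREATE TABLE Dim_Product(Id INTEGER PRIMARY KEY, Brand_Id INTEGER,
                         Product_Category_Id INTEGER);
CREATE TABLE Fact_Sales(Date_Id INTEGER, Store_Id INTEGER,
                        Product_Id INTEGER, Units_Sold INTEGER);
INSERT INTO Dim_Date VALUES (1, 1997), (2, 1998);
INSERT INTO Dim_Geography VALUES (1, 'US');
INSERT INTO Dim_Store VALUES (1, 1);
INSERT INTO Dim_Brand VALUES (1, 'Acme');
INSERT INTO Dim_Product_Category VALUES (1, 'tv'), (2, 'radio');
INSERT INTO Dim_Product VALUES (1, 1, 1), (2, 1, 2);
INSERT INTO Fact_Sales VALUES (1, 1, 1, 10), (1, 1, 1, 5),
                              (2, 1, 1, 7), (1, 1, 2, 3);
""")
join_q = """
SELECT B.Brand, G.Country, SUM(F.Units_Sold)
FROM Fact_Sales F
JOIN Dim_Date D ON F.Date_Id = D.Id
JOIN Dim_Store S ON F.Store_Id = S.Id
JOIN Dim_Geography G ON S.Geography_Id = G.Id
JOIN Dim_Product P ON F.Product_Id = P.Id
JOIN Dim_Brand B ON P.Brand_Id = B.Id
JOIN Dim_Product_Category C ON P.Product_Category_Id = C.Id
WHERE D.Year = 1997 AND C.Product_Category = 'tv'
GROUP BY B.Brand, G.Country"""
# Emulate the materialized virtual columns with a denormalized view.
cur.execute("""
CREATE VIEW Fact_Sales_V AS
SELECT F.Units_Sold,
       (SELECT Year FROM Dim_Date WHERE Id = F.Date_Id) AS Year,
       (SELECT Country FROM Dim_Geography WHERE Id =
          (SELECT Geography_Id FROM Dim_Store WHERE Id = F.Store_Id)) AS Country,
       (SELECT Brand FROM Dim_Brand WHERE Id =
          (SELECT Brand_Id FROM Dim_Product WHERE Id = F.Product_Id)) AS Brand,
       (SELECT Product_Category FROM Dim_Product_Category WHERE Id =
          (SELECT Product_Category_Id FROM Dim_Product
           WHERE Id = F.Product_Id)) AS Product_Category
FROM Fact_Sales F""")
flat_q = """
SELECT Brand, Country, SUM(Units_Sold) FROM Fact_Sales_V
WHERE Year = 1997 AND Product_Category = 'tv'
GROUP BY Brand, Country"""
join_rows = cur.execute(join_q).fetchall()
flat_rows = cur.execute(flat_q).fetchall()
assert join_rows == flat_rows  # both forms return the same aggregate
```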
Our approach is not specific to Oracle. It could be applied to
any application where similar configurations are used. In
Section 2, we discuss related work, and in Section 3, we describe
our RDF in-memory processing. The in-memory virtual column
processing is described in Section 4. Section 5 describes
SPARQL to SQL translation. Section 6 discusses memory
footprint, and Section 7 describes our experimental study. We
conclude in Section 8.
II. RELATED WORK
RDF in-memory processing utilizes two different in-
memory structures, one using pointers (memory addresses)
embedded in the data structure so that the traversal is done
without any joins, such as in [14], and the other using IDs in a
relational table structure so that the traversal is done via joins,
such as in [8]. Some systems [13] do not use memory addresses
to enable reloading the memory structure from disk. Both
approaches have pros and cons. While the first method
mimicking the graph structure works well for processing path
queries, it is not very efficient for set-oriented operations, such
as aggregates. The second approach is very cumbersome in
handling path queries, as it requires joins, and sometimes the
intermediate join results can be large, slowing down the query
performance. HDT [19] uses adjacency lists to alleviate
some of these problems, but it is read-only.
Much research [3,5,6,7,16] has been published on efficiently
processing self-joins utilizing indexes, column stores and some
other auxiliary structures such as materialized views. Typical
RDF data has a small number of distinct predicates compared to
subjects or objects, and many RDF queries have constants on
predicates. Hence, the data is sometimes partitioned on the
predicate so that only relevant data is accessed [5].
Whatever underlying data structure is adopted, it usually
maintains a separate dictionary for strings to represent URIs and
literals. Therefore, a join is required to get the values to present
to users or process aggregates, filters, or order-by queries. Our
paper focuses on removing this join to accelerate the query
processing. Systems using sequence numbers or plain numbers
as IDs have a smaller memory footprint and faster load times,
but integrating new data from other sources is difficult because
the dictionary table must be consulted to generate or look up an
ID for a resource. Oracle uses hash IDs, so unique IDs can be
obtained by applying a function to
the resource value. This approach makes data integration more
efficient because unique IDs for resources can quickly be
generated without consulting the dictionary table. However, the
8-byte ID entails a bigger footprint and more processing during
load. It will also burden join processing as the bigger IDs
produce bigger intermediate results. The elimination of joins to
get resource values will help overall query processing.
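The benefit of value-derived IDs can be sketched as follows (the hash function is illustrative; Oracle's actual hash function is not specified here):

```python
import hashlib

def hash_id(resource: str) -> int:
    """Illustrative 8-byte ID derived from the value alone, so independent
    loaders agree on an ID without consulting a shared dictionary table."""
    return int.from_bytes(hashlib.sha256(resource.encode()).digest()[:8], "big")

# Sequence IDs depend on load order; hash IDs depend only on the value.
assert hash_id("<ns:s1>") == hash_id("<ns:s1>")   # stable across loaders
assert hash_id("<ns:s1>") != hash_id("<ns:s2>")
```

The trade-off named in the text follows directly: the 8-byte ID is larger than a small sequence number, so intermediate join results grow, but no dictionary lookup is needed at load time.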
III. RDF IN-MEMORY PROCESSING
In-memory processing is increasingly attractive as memory
costs drop and as consistent performance across different
workloads is desired without much tuning. The RDF in-memory
processing utilizes the Oracle Database In-Memory Column
Store (IMC) [10, 18]. Frequently accessed columns from the
triples table and the value table are loaded into memory. RDF
queries often perform hash joins, which require full scans of the
triples and value tables; the in-memory column store accelerates
these scans. In addition, the column store employs compression
and stores 4-byte dictionary codes instead of full values. It also
performs smart scans using an in-memory
storage index where min and max values of each column in the
in-memory segment unit called IMCU (in-memory compression
unit) are stored. Finally, it uses Bloom filter [12] joins and
SIMD (Single Instruction, Multiple Data) vector processing for
queries with filters; a SIMD instruction evaluates a filter against
many rows at once.
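The min/max storage-index idea can be sketched as follows (a simplified model; real IMCUs hold compressed column data, not Python lists):

```python
# Sketch of an in-memory storage index (illustrative): each IMCU keeps the
# min/max of a column so whole units can be skipped for a filter predicate.
imcus = [
    {"min": 100, "max": 250, "rows": [101, 201, 250]},
    {"min": 300, "max": 499, "rows": [302, 402, 403]},
    {"min": 600, "max": 799, "rows": [611, 612, 723]},
]

def scan_eq(imcus, target):
    """Scan only the IMCUs whose [min, max] range can contain the target."""
    hits, scanned = [], 0
    for u in imcus:
        if u["min"] <= target <= u["max"]:   # smart-scan pruning check
            scanned += 1
            hits += [r for r in u["rows"] if r == target]
    return hits, scanned

hits, scanned = scan_eq(imcus, 402)
assert hits == [402] and scanned == 1  # two of the three IMCUs were pruned
```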
If insufficient memory is available to load all the requested
data into memory, Oracle IMC will partially load the data. While
it would be ideal if all data fits in memory, partial in-memory
population also delivers some performance improvement [4].
In Oracle Database 18c, enabling and disabling the RDF in-
memory population are controlled by the following PL/SQL
APIs:
EXEC SEM_APIS.ENABLE_INMEMORY(TRUE);
EXEC SEM_APIS.DISABLE_INMEMORY;
The argument 'TRUE' means that the call waits until the data is
populated in memory. When on-disk data is changed due to
insert, delete, or update, background processes automatically
modify the in-memory data by creating a new IMCU.
IV. ELIMINATION OF VALUE JOIN USING RDF IN-MEMORY VIRTUAL COLUMN
RDF query execution spends significant time joining with
the value table to get column values. Materializing values can
avoid these joins. However, materializing values violates the
normalization principle, and string value materialization, in
particular, becomes prohibitively expensive due to space
requirements. Therefore, instead of materializing on disk we do
it in memory. By populating the column values in memory as
virtual columns [17], we can retrieve values without joining
with the value table. We add virtual columns to the triples table,
and the values for these virtual columns are materialized in
memory. We need values for subject ID (SID), predicate ID
(PID), object ID (OID) and graph ID (GID). For example, the
value for the subject, SVAL, is obtained by the function
GetVal(SID). These values are organized in columnar format
and compressed. Table 3 shows the structure at a conceptual
level for the triples table (Table 1) and the value table (Table
2). A 4-byte dictionary code is actually stored in memory and a
separate symbol table is maintained in memory to map the
dictionary code to its value. The virtual columns are stored in
the in-memory segment called IMEU (in-memory expression
unit).
There are many duplicates in SVAL, PVAL, OVAL, and
GVAL, and these duplicates are compressed away. All queries
will work on the triples table only. Note that this kind of
materialization is possible only if there is a one-to-one mapping
between the ID and its value.
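The effect of columnar duplicate elimination can be sketched with a simple dictionary encoding (illustrative; Oracle's actual in-memory encoding differs):

```python
# Sketch of columnar dictionary encoding (illustrative): repeated values in a
# materialized virtual column are stored once, with a small code per row.
column = ["<ns:g1>", "<ns:g1>", "<ns:g1>"]  # GVAL of Table 3 repeats the graph

dictionary, codes = [], []
for v in column:
    if v not in dictionary:       # store each distinct value only once
        dictionary.append(v)
    codes.append(dictionary.index(v))

# Three repeated strings collapse to one dictionary entry plus tiny codes.
assert dictionary == ["<ns:g1>"] and codes == [0, 0, 0]
```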
Here is one of the virtual column functions. It extracts values
from the value table given an ID:
FUNCTION GetVal (i_id NUMBER)
RETURN VARCHAR2 DETERMINISTIC IS
  r_val VARCHAR2(4000);
BEGIN
  EXECUTE IMMEDIATE
    'SELECT /*+ index(m C_PK_VID) */ VAL
     FROM VALUE$ m
     WHERE ID = :1' INTO r_val USING i_id;
  RETURN r_val;
END;
Here is how the virtual column SVAL is defined on the triples
table (shown generically as TRIPLES here) using the function
GetVal():

EXECUTE IMMEDIATE
  'ALTER TABLE TRIPLES ADD
   SVAL GENERATED ALWAYS AS
   (GetVal(SID))
   VIRTUAL INMEMORY';
Once the virtual columns are defined, the virtual column
name and its function call can be used interchangeably in a
query to retrieve the value from memory: if a query contains
GetVal(SID), the subject value is fetched directly from memory
instead of executing the virtual column function, so either
SVAL or GetVal(SID) can be used to get the value.
In general, any application that utilizes in-memory virtual
columns can identify columns that are essential for fast query
performance and materialize only those columns in memory.
The columns to be materialized in memory can be determined
by the query workload.
Table 1: Quads/Triples Table

GID  SID  PID  OID
101  201  402  611
101  302  403  723
101  302  402  612

Table 2: Value Table (VALUE$)

ID   VAL
101  <ns:g1>
201  <ns:s1>
302  <ns:s2>
402  <ns:p1>
403  <ns:p2>
611  "100"^^xsd:decimal
612  "200"^^xsd:decimal
723  "2000-01-02T01:00:01"^^xsd:dateTime

Table 3: Quads/Triples Table in Memory

GID  SID  PID  OID  GVAL     SVAL     PVAL     OVAL
101  201  402  611  <ns:g1>  <ns:s1>  <ns:p1>  "100"^^xsd:decimal
101  302  403  723  <ns:g1>  <ns:s2>  <ns:p2>  "2000-01-02T01:00:01"^^xsd:dateTime
101  302  402  612  <ns:g1>  <ns:s2>  <ns:p1>  "200"^^xsd:decimal
V. SPARQL TO SQL TRANSLATION
As the underlying triples table and the value table are stored
in the relational database, all SPARQL queries are translated
into equivalent SQL queries against the triples table and value
table. Typically, an RDF query is processed first via self-joins
using IDs followed by joins with the value table. The in-
memory virtual column employs late materialization, hence the
4-byte dictionary code is used for interim processing until the
full value is needed. All value table joins are replaced with
fetching virtual columns from the triples table. The SPARQL-
to-SQL query translation routines maintain a few HashMaps to
map the SPARQL query variables to virtual columns and to
triple patterns in the SPARQL query. Because the same variable
can appear in more than one triple pattern, we need to keep in
the HashMap the variable along with its position in the triple
pattern so that the correct value is fetched. For example, consider
the following triple patterns:
{ ?s <p1> ?o. ?t <p2> ?s }
The value of the variable ?s in the first triple is fetched from
SVAL while in the second triple it is fetched from OVAL.
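The bookkeeping described above can be sketched as follows (a simplified model of the translation HashMap; the names are illustrative, not Oracle's internal identifiers):

```python
# Sketch: map each SPARQL variable, per triple pattern, to the virtual
# column (SVAL/PVAL/OVAL) that holds its value.
POSITION_TO_VCOL = {"subject": "SVAL", "predicate": "PVAL", "object": "OVAL"}

patterns = [  # { ?s <p1> ?o . ?t <p2> ?s }
    {"subject": "?s", "predicate": "<p1>", "object": "?o"},
    {"subject": "?t", "predicate": "<p2>", "object": "?s"},
]

var_to_vcol = {}  # key: (variable, pattern index) -> virtual column name
for i, p in enumerate(patterns):
    for pos, term in p.items():
        if term.startswith("?"):          # only variables need a mapping
            var_to_vcol[(term, i)] = POSITION_TO_VCOL[pos]

assert var_to_vcol[("?s", 0)] == "SVAL"   # ?s is the subject of pattern 0
assert var_to_vcol[("?s", 1)] == "OVAL"   # ...but the object of pattern 1
```

Keying the map on (variable, pattern index) rather than on the variable alone is exactly what lets the translator pick SVAL for one occurrence of ?s and OVAL for the other.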
VI. DEALING WITH MEMORY REQUIREMENT
Typical user RDF datasets that we have observed contain a
few hundred million triples; data of this size fits in memory
easily. The 242-million-row triples table
(242,297,052 triples) for LUBM data we use in our experiment
requires 8.99 GB (8,991,866,880 bytes) of memory including
the in-memory virtual columns. Its size on disk is 5.55 GB
(5,557,166,080 bytes). The actual memory requirement depends
on the data characteristics such as the extent of value repetition
in the triples. Because the in-memory columnar representation
gives better compression than the on-disk row format, the data
size in memory can be smaller than the on-disk size in some
cases. With increasing memory size available these days, it will
not be a problem fitting billions of triples in memory, especially
on server machines.
Data populated in memory is fetched from memory, while
the rest is fetched from disk; for such data, the virtual column
values are assembled automatically from the data on disk. If a
large amount of data resides on disk, this may
deteriorate the query performance. However, in Oracle
Database, the RDF data is partitioned into separate datasets
based on user-defined groupings [3], and the in-memory
population is controlled at the partition/subpartition level so that
only relevant datasets are populated in memory. If a query
suffers from significant performance degradation due to on-disk
virtual column fetches, the query can resort to in-memory real
columns only using the option DISABLE_IM_VIRTUAL_COL
so that the query is processed without using the virtual columns.
VII. EXPERIMENTS
A. Hardware Setup
The RDF in-memory virtual column experiments are
conducted on a virtual machine with 32 CPUs, 256 GB of
memory, and 2 TB of disk space, running the Oracle Linux
Server 6 operating system. The database is Oracle Database 18c
Enterprise Edition (64-bit). Reported timings are the average of
three runs per query, with a timing resolution of 10 ms (the
Linux default).
B. RDF In-Memory Virtual Columns Performance
This experiment compares the performance of the RDF
in-memory virtual columns (IMVC) against the RDF
non-in-memory (non-IM) configuration. The LUBM1K
benchmark is used; its data set contains 242,297,052 rows in
total, including entailment, and the LUBM benchmark queries
are used for evaluation. Because the server is shared, the
maximum SGA available is 140 GB, and INMEMORY_SIZE is
set to 60 GB. The numbers in parentheses in Figures 1-4
represent the number of rows in the result set. Figures 1 and 2
show the execution time for the sequential run in logarithmic
scale for both configurations. In the warm run, some non-IM
queries with small result sets run faster because IMVC still
performs a full in-memory scan. However, some non-IM queries
require tuning.
The timing values for Q3 and Q10 are 0.00 for both
configurations, and Q1 shows 0.00 for IMVC, because timings
were measured only to hundredths of a second. Figures 3 and 4
show the execution time in logarithmic scale for the parallel run
with degree 32, where the IMVC timing values for Q1, Q3, and
Q10 are 0.00 in the warm run.
Relative to non-IM, the in-memory virtual columns show a
43x gain (cold) and a 50x gain (warm) for Q8 in the sequential
run, and a 20x gain (Q8) and a 144x gain (Q12) in the parallel
run.
The parallel run with degree 32 requires a lot of memory as
more inter-process communication is needed. Because
in-memory virtual columns require more memory than the
non-IM configuration, some queries ran out of memory and
therefore some data was written onto disk, causing performance
degradation.
Figure 1: Sequential execution time (in sec, log scale) for
LUBM benchmark queries (cold run)
Figure 2: Sequential execution time (in sec, log scale) for
LUBM benchmark queries (warm run)
Figure 3: Parallel execution time (in sec, log scale) for
LUBM benchmark queries (cold run)
Figure 4: Parallel execution time (in sec, log scale) for
LUBM benchmark queries (warm run)
As more values are fetched and the number of variables
increases, a bigger performance gain is achieved. However,
IMVC does not control self-joins of the triples table; if a non-IM
query produces a better self-join execution plan using indexes,
it can outperform IMVC, as seen for Q2, Q9, and Q13 above. In
general, in-memory query processing provides consistently
good performance without tuning and does not show erratic
behavior across workloads.
Figure 5: Execution time (in sec, log scale) for fetching all
values
We also fetched all values from the triples table to measure
the impact as more values are fetched. Figure 5 shows the
execution time for fetching all values: a 41x improvement for
the sequential run, and a 436x gain (986.05 vs. 2.26 sec) for the
parallel run, against non-IM.
VIII. CONCLUSION AND FUTURE WORK
Efficient materialization of RDF data in memory
significantly improves query performance. In-memory
materialization using virtual columns does not increase
persistent storage requirements, and its columnar format is also
good for compression. We have shown that this approach can
deliver a significant performance improvement. Though we have
applied the scheme to RDF data, it can be applied to any area
where a one-to-one mapping is maintained between an ID and
its value. In sum, by materializing one-to-one join
operations in memory, we have achieved up to two orders of
magnitude performance improvement.
While this paper provides a viable solution to value table
joins in RDF query processing and reduces the possibility of
generating poor execution plans by reducing the overall number
of joins in the query, it does not propose a solution to speed up
or reduce the number of self-joins on the triples table. It could
be interesting to develop a new scheme to handle self-joins
along the same lines by eliminating actual joins.
REFERENCES
[1] RDF 1.1 Concepts and Abstract Syntax.
https://www.w3.org/TR/2014/REC-rdf11-concepts-20140225/, Feb. 2014.
[2] SPARQL 1.1 Query Language. https://www.w3.org/TR/sparql11-query/,
Mar. 2013.
[3] E. I. Chong, S. Das, G. Eadon, and J. Srinivasan. An Efficient SQL-based
RDF Querying Scheme. In Proc. of VLDB Conference, 1216–1227, 2005.
[4] E. I. Chong. Balancing Act to improve RDF Query Performance in Oracle
Database. Invited Talk, LDBC 8th TUC meeting, Jun. 2016.
[5] D. J. Abadi, A. Marcus, S. Madden, and K. J. Hollenbach. Scalable
Semantic Web Data Management using Vertical Partitioning. In Proc. of
VLDB Conference, 411-422, 2007.
[6] C. Weiss, P. Karras, and A. Bernstein. Hexastore: Sextuple Indexing for
Semantic Web Data Management. In Proc. of VLDB Conference, 1008-1019,
2008.
[7] T. Neumann and G. Weikum. RDF-3X: a RISC-style Engine for RDF. In
Proc. of VLDB Conference, 647-659, 2008.
[8] Orri Erling, Virtuoso, a Hybrid RDBMS/Graph Column Store, Bulletin of
the IEEE Computer Society Technical Committee on Data Engineering,
35(1), 3-8, 2012.
[9] Lehigh University Benchmark. http://swat.cse.lehigh.edu/projects/lubm/,
Jul. 2005.
[10] Oracle Database In-Memory Guide.
http://docs.oracle.com/database/122/INMEM/title.htm, Jan. 2017.
[11] Snowflake schema. https://en.wikipedia.org/wiki/Snowflake_schema
[12] B.H. Bloom. Space/Time Trade-Offs in Hash Coding with Allowable
Errors. CACM, 13(7), 422-426. 1970.
[13] M. Janik and K. Kochut, BRAHMS: A WorkBench RDF Store And High
Performance Memory System for Semantic Association Discovery, In Proc.
of ISWC Conference, 2005.
[14] R. Binna, W. Gassler, E. Zangerle, D. Pacher, G. Specht, SpiderStore:
Exploiting Main Memory for Efficient RDF Graph Representation and Fast
Querying, Workshop on Semantic Data Management, 2010.
[15] B. Motik, Y. Nenov, R. Piro, I. Horrocks and D. Olteanu, Parallel
Materialisation of Datalog Programs in Centralised, Main-Memory RDF
Systems, In Proceedings of the Twenty-Eighth AAAI Conference on Artificial
Intelligence, 129-137, 2014.
[16] T. Neumann and G. Weikum. Scalable Join Processing on Very Large
RDF Graphs. In Proceedings of the 35th SIGMOD International Conference
on Management of Data, 627-640, New York, NY, USA, 2009.
[17] A. Mishra, et al., Accelerating Analytics with Dynamic In-Memory
Expressions, In Proc. of VLDB Conference, 1437–1448, 2016.
[18] Lahiri, T. et al. Oracle Database In-Memory: A Dual Format In-Memory
Database, In Proc. of ICDE Conference, 1253-1258, 2015.
[19] Fernández J.D., Martínez-Prieto M.A., Gutierrez C. Compact
Representation of Large RDF Data Sets for Publishing and Exchange. In Proc.
of ISWC Conference, 193-208, 2010.
CitiusTech
 
Enhancing keyword search over relational databases using ontologies
Enhancing keyword search over relational databases using ontologiesEnhancing keyword search over relational databases using ontologies
Enhancing keyword search over relational databases using ontologies
csandit
 
ENHANCING KEYWORD SEARCH OVER RELATIONAL DATABASES USING ONTOLOGIES
ENHANCING KEYWORD SEARCH OVER RELATIONAL DATABASES USING ONTOLOGIES ENHANCING KEYWORD SEARCH OVER RELATIONAL DATABASES USING ONTOLOGIES
ENHANCING KEYWORD SEARCH OVER RELATIONAL DATABASES USING ONTOLOGIES
cscpconf
 

Similar to Icde2019 improving rdf query performance using in-memory virtual columns in oracle database (20)

Dremel Paper Review
Dremel Paper ReviewDremel Paper Review
Dremel Paper Review
 
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMINGEVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
 
IEEE 2014 DOTNET DATA MINING PROJECTS A novel model for mining association ru...
IEEE 2014 DOTNET DATA MINING PROJECTS A novel model for mining association ru...IEEE 2014 DOTNET DATA MINING PROJECTS A novel model for mining association ru...
IEEE 2014 DOTNET DATA MINING PROJECTS A novel model for mining association ru...
 
2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...
2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...
2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...
 
QUERY OPTIMIZATION FOR BIG DATA ANALYTICS
QUERY OPTIMIZATION FOR BIG DATA ANALYTICSQUERY OPTIMIZATION FOR BIG DATA ANALYTICS
QUERY OPTIMIZATION FOR BIG DATA ANALYTICS
 
Query Optimization for Big Data Analytics
Query Optimization for Big Data AnalyticsQuery Optimization for Big Data Analytics
Query Optimization for Big Data Analytics
 
Column store databases approaches and optimization techniques
Column store databases  approaches and optimization techniquesColumn store databases  approaches and optimization techniques
Column store databases approaches and optimization techniques
 
IRJET- Data Retrieval using Master Resource Description Framework
IRJET- Data Retrieval using Master Resource Description FrameworkIRJET- Data Retrieval using Master Resource Description Framework
IRJET- Data Retrieval using Master Resource Description Framework
 
Performance Benchmarking of Key-Value Store NoSQL Databases
Performance Benchmarking of Key-Value Store NoSQL Databases Performance Benchmarking of Key-Value Store NoSQL Databases
Performance Benchmarking of Key-Value Store NoSQL Databases
 
Evaluation criteria for nosql databases
Evaluation criteria for nosql databasesEvaluation criteria for nosql databases
Evaluation criteria for nosql databases
 
A STUDY ON GRAPH STORAGE DATABASE OF NOSQL
A STUDY ON GRAPH STORAGE DATABASE OF NOSQLA STUDY ON GRAPH STORAGE DATABASE OF NOSQL
A STUDY ON GRAPH STORAGE DATABASE OF NOSQL
 
A Study on Graph Storage Database of NOSQL
A Study on Graph Storage Database of NOSQLA Study on Graph Storage Database of NOSQL
A Study on Graph Storage Database of NOSQL
 
A STUDY ON GRAPH STORAGE DATABASE OF NOSQL
A STUDY ON GRAPH STORAGE DATABASE OF NOSQLA STUDY ON GRAPH STORAGE DATABASE OF NOSQL
A STUDY ON GRAPH STORAGE DATABASE OF NOSQL
 
A Study on Graph Storage Database of NOSQL
A Study on Graph Storage Database of NOSQLA Study on Graph Storage Database of NOSQL
A Study on Graph Storage Database of NOSQL
 
Improving performance of decision support queries in columnar cloud database ...
Improving performance of decision support queries in columnar cloud database ...Improving performance of decision support queries in columnar cloud database ...
Improving performance of decision support queries in columnar cloud database ...
 
Efficient Log Management using Oozie, Parquet and Hive
Efficient Log Management using Oozie, Parquet and HiveEfficient Log Management using Oozie, Parquet and Hive
Efficient Log Management using Oozie, Parquet and Hive
 
Comparative study of relational and non relations database performances using...
Comparative study of relational and non relations database performances using...Comparative study of relational and non relations database performances using...
Comparative study of relational and non relations database performances using...
 
De-duplicated Refined Zone in Healthcare Data Lake Using Big Data Processing ...
De-duplicated Refined Zone in Healthcare Data Lake Using Big Data Processing ...De-duplicated Refined Zone in Healthcare Data Lake Using Big Data Processing ...
De-duplicated Refined Zone in Healthcare Data Lake Using Big Data Processing ...
 
Enhancing keyword search over relational databases using ontologies
Enhancing keyword search over relational databases using ontologiesEnhancing keyword search over relational databases using ontologies
Enhancing keyword search over relational databases using ontologies
 
ENHANCING KEYWORD SEARCH OVER RELATIONAL DATABASES USING ONTOLOGIES
ENHANCING KEYWORD SEARCH OVER RELATIONAL DATABASES USING ONTOLOGIES ENHANCING KEYWORD SEARCH OVER RELATIONAL DATABASES USING ONTOLOGIES
ENHANCING KEYWORD SEARCH OVER RELATIONAL DATABASES USING ONTOLOGIES
 

More from Jean Ihm

Oracle Spatial Studio: Fast and Easy Spatial Analytics and Maps
Oracle Spatial Studio:  Fast and Easy Spatial Analytics and MapsOracle Spatial Studio:  Fast and Easy Spatial Analytics and Maps
Oracle Spatial Studio: Fast and Easy Spatial Analytics and Maps
Jean Ihm
 
Build Knowledge Graphs with Oracle RDF to Extract More Value from Your Data
Build Knowledge Graphs with Oracle RDF to Extract More Value from Your DataBuild Knowledge Graphs with Oracle RDF to Extract More Value from Your Data
Build Knowledge Graphs with Oracle RDF to Extract More Value from Your Data
Jean Ihm
 
Powerful Spatial Features You Never Knew Existed in Oracle Spatial and Graph ...
Powerful Spatial Features You Never Knew Existed in Oracle Spatial and Graph ...Powerful Spatial Features You Never Knew Existed in Oracle Spatial and Graph ...
Powerful Spatial Features You Never Knew Existed in Oracle Spatial and Graph ...
Jean Ihm
 
When Graphs Meet Machine Learning
When Graphs Meet Machine LearningWhen Graphs Meet Machine Learning
When Graphs Meet Machine Learning
Jean Ihm
 
PGQL: A Language for Graphs
PGQL: A Language for GraphsPGQL: A Language for Graphs
PGQL: A Language for Graphs
Jean Ihm
 
How To Visualize Graphs
How To Visualize GraphsHow To Visualize Graphs
How To Visualize Graphs
Jean Ihm
 
Gain Insights with Graph Analytics
Gain Insights with Graph Analytics Gain Insights with Graph Analytics
Gain Insights with Graph Analytics
Jean Ihm
 
How To Model and Construct Graphs with Oracle Database (AskTOM Office Hours p...
How To Model and Construct Graphs with Oracle Database (AskTOM Office Hours p...How To Model and Construct Graphs with Oracle Database (AskTOM Office Hours p...
How To Model and Construct Graphs with Oracle Database (AskTOM Office Hours p...
Jean Ihm
 
Introduction to Property Graph Features (AskTOM Office Hours part 1)
Introduction to Property Graph Features (AskTOM Office Hours part 1) Introduction to Property Graph Features (AskTOM Office Hours part 1)
Introduction to Property Graph Features (AskTOM Office Hours part 1)
Jean Ihm
 
An Introduction to Graph: Database, Analytics, and Cloud Services
An Introduction to Graph:  Database, Analytics, and Cloud ServicesAn Introduction to Graph:  Database, Analytics, and Cloud Services
An Introduction to Graph: Database, Analytics, and Cloud Services
Jean Ihm
 

More from Jean Ihm (10)

Oracle Spatial Studio: Fast and Easy Spatial Analytics and Maps
Oracle Spatial Studio:  Fast and Easy Spatial Analytics and MapsOracle Spatial Studio:  Fast and Easy Spatial Analytics and Maps
Oracle Spatial Studio: Fast and Easy Spatial Analytics and Maps
 
Build Knowledge Graphs with Oracle RDF to Extract More Value from Your Data
Build Knowledge Graphs with Oracle RDF to Extract More Value from Your DataBuild Knowledge Graphs with Oracle RDF to Extract More Value from Your Data
Build Knowledge Graphs with Oracle RDF to Extract More Value from Your Data
 
Powerful Spatial Features You Never Knew Existed in Oracle Spatial and Graph ...
Powerful Spatial Features You Never Knew Existed in Oracle Spatial and Graph ...Powerful Spatial Features You Never Knew Existed in Oracle Spatial and Graph ...
Powerful Spatial Features You Never Knew Existed in Oracle Spatial and Graph ...
 
When Graphs Meet Machine Learning
When Graphs Meet Machine LearningWhen Graphs Meet Machine Learning
When Graphs Meet Machine Learning
 
PGQL: A Language for Graphs
PGQL: A Language for GraphsPGQL: A Language for Graphs
PGQL: A Language for Graphs
 
How To Visualize Graphs
How To Visualize GraphsHow To Visualize Graphs
How To Visualize Graphs
 
Gain Insights with Graph Analytics
Gain Insights with Graph Analytics Gain Insights with Graph Analytics
Gain Insights with Graph Analytics
 
How To Model and Construct Graphs with Oracle Database (AskTOM Office Hours p...
How To Model and Construct Graphs with Oracle Database (AskTOM Office Hours p...How To Model and Construct Graphs with Oracle Database (AskTOM Office Hours p...
How To Model and Construct Graphs with Oracle Database (AskTOM Office Hours p...
 
Introduction to Property Graph Features (AskTOM Office Hours part 1)
Introduction to Property Graph Features (AskTOM Office Hours part 1) Introduction to Property Graph Features (AskTOM Office Hours part 1)
Introduction to Property Graph Features (AskTOM Office Hours part 1)
 
An Introduction to Graph: Database, Analytics, and Cloud Services
An Introduction to Graph:  Database, Analytics, and Cloud ServicesAn Introduction to Graph:  Database, Analytics, and Cloud Services
An Introduction to Graph: Database, Analytics, and Cloud Services
 

Recently uploaded

一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
v3tuleee
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
Subhajit Sahu
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
mbawufebxi
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
ahzuo
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
slg6lamcq
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
dwreak4tg
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTESAdjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
Subhajit Sahu
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
manishkhaire30
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
rwarrenll
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
NABLAS株式会社
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
javier ramirez
 
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdfUnleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Enterprise Wired
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
AbhimanyuSinha9
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
balafet
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Subhajit Sahu
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
John Andrews
 

Recently uploaded (20)

一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTESAdjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
 
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdfUnleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
 

Icde2019 improving rdf query performance using in-memory virtual columns in oracle database

formulate a query using SPARQL, processing that query is challenging, as it frequently requires a large number of self-joins. Many researchers have investigated how to process RDF queries efficiently [4,5,6,7]. Because URIs are large, self-joins are often executed over small IDs to reduce the size of intermediate join results and improve query performance. The RDF triples table is therefore usually normalized: the triples table contains IDs, and the value table (also known as the dictionary table or symbol table) is kept separately. Many RDF stores [8,13,15,19] adopt this approach. In Oracle, the underlying RDF tables are normalized into a triples table, which stores the subject, predicate, object, and named-graph IDs, and a value table, which stores all information relevant to those IDs, such as value type, literal type, and string values [3]. Typically, RDF queries are processed using only IDs to compute the self-join results, and the resulting IDs are then joined with the value table to produce the final results. However, when a query needs to evaluate aggregate, filter, or order-by expressions, or when it returns a large number of results, joining the triples table with the value table can incur a significant performance hit, because the join is performed once for every variable that is returned or used in an expression: the more variables a query projects, the more joins are performed. For some queries, the value table join can consume more than 90% of the query processing time. We observed this behavior when experimenting with customer data, by measuring execution time before and after the value table join. These joins could be removed by fully de-normalizing the values, but that would incur large persistent storage requirements as well as the anomalies associated with data redundancy, such as integrity and consistency problems.
De-normalization also inflates intermediate join results, because URI and literal values would have to be carried through all the join operations. To eliminate the join between the triples table and the value table, we instead propose in-memory materialization of values using in-memory virtual columns. In-memory virtual columns can speed up query performance without increasing disk storage requirements. The values corresponding to the IDs in the triples table are materialized in memory. All of these values come from the value table, and there are many duplicates, because the same IDs appear in many places in the triples table. The values are materialized in columnar format, and duplicate values are compressed away. Given a triples/quads table (GID, SID, PID, OID), where GID is a graph ID, SID a subject ID, PID a predicate ID, and OID an object ID, we materialize in memory the virtual columns defined by the functions GetVal(GID), GetVal(SID), GetVal(PID), and GetVal(OID), where GetVal(ID) looks up the value table and returns the corresponding RDF resource value. Our early prototype [4] showed promising results, so we implemented the approach in Oracle Database 18c. Our experiments on the LUBM [9] benchmark show up to two orders of magnitude query performance improvement.

2019 IEEE 35th International Conference on Data Engineering (ICDE), 2375-026X/19/$31.00 ©2019 IEEE, DOI 10.1109/ICDE.2019.00197

The RDF in-memory virtual column approach can also be applied to other application areas, such as data mart / data warehouse star- and snowflake-schema queries, where joins with dimension tables are frequent, so that the join between the fact table and a dimension table can be eliminated. In fact, the approach can be applied to any application in which there is a one-to-one mapping between an ID and its value. For example, the following snowflake-schema query [11] from Wikipedia can be reduced to one without joins by defining in-memory virtual columns Year, Country, Brand, and Product_Category on the Fact_Sales table:

  SELECT B.Brand, G.Country, SUM(F.Units_Sold)
  FROM Fact_Sales F
    INNER JOIN Dim_Date D ON F.Date_Id = D.Id
    INNER JOIN Dim_Store S ON F.Store_Id = S.Id
    INNER JOIN Dim_Geography G ON S.Geography_Id = G.Id
    INNER JOIN Dim_Product P ON F.Product_Id = P.Id
    INNER JOIN Dim_Brand B ON P.Brand_Id = B.Id
    INNER JOIN Dim_Product_Category C ON P.Product_Category_Id = C.Id
  WHERE D.Year = 1997 AND C.Product_Category = 'tv'
  GROUP BY B.Brand, G.Country;

Using the virtual columns, this query translates to:

  SELECT F.Brand, F.Country, SUM(F.Units_Sold)
  FROM Fact_Sales F
  WHERE F.Year = 1997 AND F.Product_Category = 'tv'
  GROUP BY F.Brand, F.Country;

Our approach is not specific to Oracle; it could be applied in any system with a similar configuration. Section II discusses related work, and Section III describes our RDF in-memory processing. In-memory virtual column processing is described in Section IV. Section V describes the SPARQL-to-SQL translation, Section VI discusses the memory footprint, and Section VII presents our experimental study. We conclude in Section VIII.
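The effect of this rewrite can be sketched with a toy snowflake schema (Python with sqlite3; the schema is deliberately simplified, and a SQL view stands in for Oracle's in-memory virtual columns, which sqlite does not have). Both forms return the same aggregate.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Dim_Date (Id INTEGER PRIMARY KEY, Year INTEGER);
CREATE TABLE Dim_Brand (Id INTEGER PRIMARY KEY, Brand TEXT);
CREATE TABLE Dim_Product (Id INTEGER PRIMARY KEY, Brand_Id INTEGER,
                          Product_Category TEXT);
CREATE TABLE Fact_Sales (Date_Id INTEGER, Product_Id INTEGER,
                         Units_Sold INTEGER);
INSERT INTO Dim_Date VALUES (1, 1997), (2, 1998);
INSERT INTO Dim_Brand VALUES (1, 'Acme'), (2, 'Zenith');
INSERT INTO Dim_Product VALUES (1, 1, 'tv'), (2, 2, 'tv'), (3, 1, 'radio');
INSERT INTO Fact_Sales VALUES (1, 1, 10), (1, 2, 5), (1, 3, 7), (2, 1, 3);

-- This view plays the role of the in-memory virtual columns
-- Year, Brand, Product_Category on Fact_Sales.
CREATE VIEW Fact_Sales_V AS
SELECT F.Units_Sold,
       (SELECT Year FROM Dim_Date WHERE Id = F.Date_Id) AS Year,
       (SELECT Brand FROM Dim_Brand WHERE Id =
          (SELECT Brand_Id FROM Dim_Product WHERE Id = F.Product_Id)) AS Brand,
       (SELECT Product_Category FROM Dim_Product
         WHERE Id = F.Product_Id) AS Product_Category
FROM Fact_Sales F;
""")

# Original form: fact table joined with the dimension tables.
joined = con.execute("""
  SELECT B.Brand, SUM(F.Units_Sold)
  FROM Fact_Sales F
  JOIN Dim_Date D ON F.Date_Id = D.Id
  JOIN Dim_Product P ON F.Product_Id = P.Id
  JOIN Dim_Brand B ON P.Brand_Id = B.Id
  WHERE D.Year = 1997 AND P.Product_Category = 'tv'
  GROUP BY B.Brand ORDER BY B.Brand
""").fetchall()

# Rewritten form: no dimension joins, filters on the "virtual" columns.
joinless = con.execute("""
  SELECT Brand, SUM(Units_Sold) FROM Fact_Sales_V
  WHERE Year = 1997 AND Product_Category = 'tv'
  GROUP BY Brand ORDER BY Brand
""").fetchall()

print(joined == joinless, joined)  # True [('Acme', 10), ('Zenith', 5)]
```

In Oracle the rewritten query additionally benefits from the columnar in-memory representation; the sketch only demonstrates the semantic equivalence of the two forms.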
II. RELATED WORK

RDF in-memory processing uses two different kinds of in-memory structures: one embeds pointers (memory addresses) in the data structure so that traversal requires no joins, as in [14]; the other uses IDs in a relational table structure so that traversal is done via joins, as in [8]. Some systems [13] avoid memory addresses so that the in-memory structure can be reloaded from disk. Both approaches have pros and cons. The first method, which mimics the graph structure, works well for path queries but is not very efficient for set-oriented operations such as aggregates. The second approach is cumbersome for path queries, as it requires joins, and the intermediate join results can sometimes be large, slowing down query performance. HDT files [19] use adjacency lists to alleviate some of these problems, but they are read-only. Much research [3,5,6,7,16] has been published on efficiently processing self-joins using indexes, column stores, and auxiliary structures such as materialized views. Typical RDF data has a small number of distinct predicates compared to subjects or objects, and many RDF queries have constant predicates; hence, the data is sometimes partitioned on the predicate so that only relevant data is accessed [5]. Whatever the underlying data structure, a store usually maintains a separate dictionary of the strings representing URIs and literals. A join is therefore required to obtain the values to present to users or to process aggregate, filter, or order-by queries. This paper focuses on removing that join to accelerate query processing. Systems that use sequence numbers or plain numbers as IDs have a small memory footprint and fast load times, but integrating new data from other sources is difficult, because the dictionary table must be consulted to generate or look up the ID for a resource.
Oracle uses hash IDs, so a unique ID can be obtained by applying a function to the resource value. This makes data integration more efficient, because unique IDs for resources can be generated quickly without consulting the dictionary table. However, the 8-byte ID entails a bigger footprint and more processing during load, and it also burdens join processing, because bigger IDs produce bigger intermediate results. Eliminating the joins used to obtain resource values helps overall query processing.

III. RDF IN-MEMORY PROCESSING

In-memory processing is increasingly popular as the cost of memory drops and as users seek performance improvements across different workloads without much tuning. Our RDF in-memory processing uses the Oracle Database In-Memory Column Store (IMC) [10,18]. Frequently accessed columns from the triples table and the value table are loaded into memory. RDF queries often perform hash joins, which require full scans of the triples and value tables; the in-memory column store accelerates these table scans. In addition, the column store compresses the data and stores 4-byte dictionary codes instead of values. It also performs smart scans using an in-memory storage index, which records the minimum and maximum value of each column in each in-memory segment unit, called an IMCU (in-memory compression unit). It further uses Bloom filter [12] joins and SIMD (Single Instruction, Multiple Data) vector processing for queries with filters; SIMD evaluates a filter over a number of rows in a single instruction. If there is insufficient memory to load all the requested data, Oracle IMC loads the data partially. While it is ideal for all data to fit in memory, partial in-memory population still delivers some performance improvement [4].
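Oracle's actual hash function is internal; purely as an illustration of the idea, the sketch below derives an 8-byte ID by truncating a SHA-1 digest of a resource's lexical form, so that any loader can compute the same ID for the same resource without consulting the dictionary table.

```python
import hashlib

def resource_id(lexical_form: str) -> int:
    """Illustrative 8-byte hash ID: truncated SHA-1 of the resource's
    lexical form (NOT Oracle's real hash function)."""
    digest = hashlib.sha1(lexical_form.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

# Two independent loaders assign the same ID to the same resource
# without any dictionary lookup, which eases data integration.
a = resource_id("<http://example.org/s1>")
b = resource_id("<http://example.org/s1>")
c = resource_id("<http://example.org/s2>")
print(a == b, a != c, a.bit_length() <= 64)  # True True True
```

The trade-off discussed in the text is visible here: the ID is a fixed 8 bytes regardless of how small a sequence number would have been, so intermediate join results grow, but ID generation is dictionary-free.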
In Oracle Database 18c, RDF in-memory population is enabled and disabled with the following PL/SQL APIs:

  EXEC SEM_APIS.ENABLE_INMEMORY(TRUE);
  EXEC SEM_APIS.DISABLE_INMEMORY;

The argument TRUE means that the call waits until the data is populated in memory. When on-disk data changes due to inserts, deletes, or updates, background processes automatically update the in-memory data by creating a new IMCU.
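The min/max storage index mentioned above can be sketched as follows (illustrative only; this is not Oracle's IMCU format): the column is split into fixed-size units, each unit keeps the min and max of its values, and an equality filter skips every unit whose range cannot contain the target.

```python
# IMCU-like chunking with a per-unit min/max storage index.
CHUNK = 4  # rows per unit; real IMCUs are much larger

def build_units(column):
    """Split a column into units, recording min/max per unit."""
    units = []
    for i in range(0, len(column), CHUNK):
        chunk = column[i:i + CHUNK]
        units.append({"min": min(chunk), "max": max(chunk), "rows": chunk})
    return units

def scan_equal(units, target):
    """Scan only the units whose [min, max] range may contain target."""
    hits, scanned = [], 0
    for u in units:
        if u["min"] <= target <= u["max"]:  # unit may contain the value
            scanned += 1
            hits += [v for v in u["rows"] if v == target]
        # else: unit pruned via min/max, its rows are never touched
    return hits, scanned

units = build_units([1, 3, 2, 4, 10, 12, 11, 13, 20, 22, 21, 23])
hits, scanned = scan_equal(units, 12)
print(hits, scanned, len(units))  # [12] 1 3
```

Only one of the three units is scanned for the filter value 12; the other two are eliminated by the storage index without any row access, which is the essence of the smart scan.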
IV. ELIMINATION OF VALUE JOIN USING RDF IN-MEMORY VIRTUAL COLUMN

RDF query execution spends significant time joining with the value table to get column values. Materializing values can avoid these joins. However, on-disk materialization violates the normalization principle, and string value materialization in particular becomes prohibitively expensive due to space requirements. Therefore, instead of materializing on disk, we do it in memory. By populating the column values in memory as virtual columns [17], we can retrieve values without joining with the value table. We add virtual columns to the triples table, and the values for these virtual columns are materialized only in memory. We need value columns for the subject ID (SID), predicate ID (PID), object ID (OID), and graph ID (GID). For example, the value for the subject, SVAL, is obtained by the function GetVal(SID). These values are organized in columnar format and compressed. Table 3 shows the in-memory structure at a conceptual level for the triples table (Table 1) and the value table (Table 2). A 4-byte dictionary code is actually stored in memory, and a separate in-memory symbol table maps each dictionary code to its value. The virtual columns are stored in the in-memory segment called an IMEU (in-memory expression unit). The many duplicates in SVAL, PVAL, OVAL, and GVAL are compressed away. All queries then work on the triples table only. Note that this kind of materialization is possible only if there is a one-to-one mapping between an ID and its value.

Here is one of the virtual column functions. It extracts the value from the value table given an ID:

  FUNCTION GetVal (i_id NUMBER) RETURN VARCHAR2 DETERMINISTIC IS
    r_val VARCHAR2(4000);
  BEGIN
    EXECUTE IMMEDIATE
      'SELECT /*+ index(m C_PK_VID) */ VAL FROM VALUE$ m WHERE ID = :1'
      INTO r_val USING i_id;
    RETURN r_val;
  END;

Here is how the virtual column SVAL is defined on the triples table (shown here with the placeholder name TRIPLES) using GetVal():

  EXECUTE IMMEDIATE
    'ALTER TABLE TRIPLES ADD SVAL
     GENERATED ALWAYS AS (GetVal(SID)) VIRTUAL INMEMORY';

Once the virtual columns are defined, the virtual column name and its defining function can be used interchangeably in a query to retrieve the value from memory. In other words, if a query contains GetVal(SID), the subject value is fetched directly from memory instead of by executing the virtual column function; either SVAL or GetVal(SID) retrieves the value. In general, any application that utilizes in-memory virtual columns can identify the columns that are essential for fast query performance and materialize only those in memory; the columns to materialize can be determined from the query workload.

Table 1: Quads/Triples Table

  GID  SID  PID  OID
  101  201  402  611
  101  302  403  723
  101  302  402  612

Table 2: Value Table (VALUE$)

  ID   VAL
  101  <ns:g1>
  201  <ns:s1>
  302  <ns:s2>
  402  <ns:p1>
  403  <ns:p2>
  611  "100"^^xsd:decimal
  612  "200"^^xsd:decimal
  723  "2000-01-02T01:00:01"^^xsd:dateTime

Table 3: Quads/Triples Table in Memory

  GID  SID  PID  OID  GVAL     SVAL     PVAL     OVAL
  101  201  402  611  <ns:g1>  <ns:s1>  <ns:p1>  "100"^^xsd:decimal
  101  302  403  723  <ns:g1>  <ns:s2>  <ns:p2>  "2000-01-02T01:00:01"^^xsd:dateTime
  101  302  402  612  <ns:g1>  <ns:s2>  <ns:p1>  "200"^^xsd:decimal
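The dictionary encoding that compresses away the duplicates in Table 3 can be illustrated with a small Python sketch. The EncodedColumn class and its layout are invented for illustration and do not reflect the actual IMEU format, which adds further compression tiers.

```python
# Illustrative sketch (not Oracle internals): a column of repeated values is
# stored as a small symbol table plus one fixed-width dictionary code per
# row, so duplicate SVAL/PVAL/GVAL strings are stored only once.

class EncodedColumn:
    def __init__(self, values):
        self.symbols = []                 # code -> value (symbol table)
        self.index = {}                   # value -> code
        self.codes = []                   # one small integer code per row
        for v in values:
            if v not in self.index:
                self.index[v] = len(self.symbols)
                self.symbols.append(v)
            self.codes.append(self.index[v])

    def decode(self, row):
        """Materialize the full value for a given row."""
        return self.symbols[self.codes[row]]

# SVAL column from Table 3: three rows, only two distinct subjects
sval = EncodedColumn(["<ns:s1>", "<ns:s2>", "<ns:s2>"])
print(len(sval.symbols))   # -> 2 (distinct values stored once)
print(sval.decode(2))      # -> <ns:s2>
```

Late materialization falls out naturally from this layout: joins and filters can operate on the small codes, calling decode() only when the full value must be returned.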
V. SPARQL TO SQL TRANSLATION

As the underlying triples table and value table are stored in the relational database, all SPARQL queries are translated into equivalent SQL queries against the triples table and the value table. Typically, an RDF query is processed first via self-joins on IDs, followed by joins with the value table. The in-memory virtual column employs late materialization: the 4-byte dictionary code is used for interim processing until the full value is needed. All value table joins are replaced with fetches of virtual columns from the triples table. The SPARQL-to-SQL translation routines maintain a few HashMaps to map the SPARQL query variables to virtual columns and to triple patterns in the SPARQL query. Because the same variable can appear in more than one triple pattern, the HashMap keeps each variable along with its position in the triple pattern so that the correct value is fetched. For example, in the following triple pattern:

  { ?s <p1> ?o . ?t <p2> ?s }

the value of the variable ?s in the first triple is fetched from SVAL, while in the second triple it is fetched from OVAL.

VI. DEALING WITH MEMORY REQUIREMENT

The RDF data sets of typical user applications that we have observed contain a few hundred million triples, which fits in memory easily. The LUBM triples table used in our experiments, with 242,297,052 triples, requires 8.99 GB (8,991,866,880 bytes) of memory including the in-memory virtual columns; its size on disk is 5.55 GB (5,557,166,080 bytes). The actual memory requirement depends on data characteristics such as the extent of value repetition in the triples. Because the in-memory columnar representation compresses better than the on-disk row format, the data size in memory can be smaller than the on-disk size in some cases.
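The variable-to-column bookkeeping described in Section V can be sketched as follows. The data layout and function names here are assumptions for illustration, not the paper's actual translation routines.

```python
# Illustrative sketch: record, for each SPARQL variable, the triple-pattern
# index and position where it occurs, so translation can fetch the value
# from the correct virtual column (SVAL/PVAL/OVAL) per occurrence.

POSITION_TO_COLUMN = {"s": "SVAL", "p": "PVAL", "o": "OVAL"}

def map_variables(patterns):
    """patterns: list of (subject, predicate, object) terms;
    variables are terms starting with '?'."""
    occurrences = {}
    for i, triple in enumerate(patterns):
        for pos, term in zip("spo", triple):
            if term.startswith("?"):
                occurrences.setdefault(term, []).append(
                    (i, POSITION_TO_COLUMN[pos]))
    return occurrences

# { ?s <p1> ?o . ?t <p2> ?s }
m = map_variables([("?s", "<p1>", "?o"), ("?t", "<p2>", "?s")])
print(m["?s"])   # -> [(0, 'SVAL'), (1, 'OVAL')]
```

The output matches the example above: ?s is fetched from SVAL for the first pattern and from OVAL for the second.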
With the memory sizes available these days, fitting billions of triples in memory will not be a problem, especially on server machines. In-memory data is fetched from memory, while out-of-memory data is fetched from disk; for out-of-memory data, the virtual columns are automatically assembled from the data on disk. If a large amount of data resides on disk, query performance may deteriorate. However, in Oracle Database the RDF data is partitioned into separate datasets based on user-defined groupings [3], and in-memory population is controlled at the partition/subpartition level so that only the relevant datasets are populated in memory. If a query suffers significant performance degradation from on-disk virtual column fetches, it can resort to the in-memory real columns only, using the option DISABLE_IM_VIRTUAL_COL, so that the query is processed without the virtual columns.

VII. EXPERIMENTS

A. Hardware Setup

The RDF in-memory virtual column experiments were conducted on a virtual machine with 32 CPUs, 256 GB of memory, and 2 TB of disk space, running the Oracle Linux Server 6 operating system. The database used is Oracle Database 18c Enterprise Edition - 64bit Production. The timing values are an average of three runs for each query, and the timing resolution is 10 ms, the Linux default.

B. RDF In-Memory Virtual Columns Performance

This experiment compares the performance of the RDF in-memory virtual columns (IMVC) against the RDF non-in-memory (non-IM) configuration. The LUBM1K benchmark is used; the LUBM1K data set contains a total of 242,297,052 triples including entailment. The LUBM benchmark queries are used for evaluation. Because the server is shared, the maximum SGA we can have is 140 GB, and INMEMORY_SIZE is set to 60 GB. The numbers in parentheses in Figures 1-4 represent the number of rows in the result set.
Figures 1 and 2 show the execution times of sequential runs, in logarithmic scale, for both configurations. In the warm run, some non-IM queries with a small result set run faster because IMVC still performs a full scan in memory; however, some non-IM queries require tuning. The timing values for Q3 and Q10 are 0.00 for both configurations, and Q1 shows 0.00 for IMVC, because timings were measured only to hundredths of a second. Figures 3 and 4 show the execution times, in logarithmic scale, for parallel runs with degree 32. The timing values for Q1, Q3, and Q10 in IMVC show 0.00 for the warm run. The performance improvement of the in-memory virtual columns over non-IM reaches a 43x gain (cold) and a 50x gain (warm) for Q8 in the sequential run, and a 20x gain (Q8) and a 144x gain (Q12) in the parallel run. The parallel run with degree 32 requires a lot of memory, as more inter-process communication is needed. Because in-memory virtual columns require more memory than the non-IM configuration, some queries ran out of memory and spilled data to disk, causing performance degradation.

Figure 1: Sequential execution time (in sec, log scale) for LUBM benchmark queries (cold run)
Figure 2: Sequential execution time (in sec, log scale) for LUBM benchmark queries (warm run)

Figure 3: Parallel execution time (in sec, log scale) for LUBM benchmark queries (cold run)

Figure 4: Parallel execution time (in sec, log scale) for LUBM benchmark queries (warm run)

As more values are fetched and the number of variables increases, a bigger performance gain is achieved. However, IMVC does not control the self-joins of the triples table; if a non-IM query produces a better execution plan for the self-joins using indexes, it can outperform IMVC, as seen for Q2, Q9, and Q13 above. In general, in-memory query processing provides consistently good performance without tuning and does not show erratic behavior across workloads.

Figure 5: Execution time (in sec, log scale) for fetching all values
We fetched all values from the triples table to check the impact as more values are fetched. Figure 5 shows the execution time for fetching all values: a 41x improvement for the sequential run and a 436x gain (986.05 vs. 2.26 sec) for the parallel run against non-IM.

VIII. CONCLUSION AND FUTURE WORK

Efficient materialization of RDF data in memory significantly improves query performance. In-memory materialization using virtual columns does not increase persistent storage requirements, and its columnar format also compresses well. We have shown that this approach yields a significant performance enhancement. Though we have applied the scheme to RDF data, it can potentially be applied to any area where a one-to-one mapping is maintained between an ID and its value. In sum, by materializing one-to-one join operations in memory, we have achieved up to two orders of magnitude performance improvement. While this paper provides a viable solution to value table joins in RDF query processing, and reduces the possibility of generating poor execution plans by reducing the overall number of joins in a query, it does not propose a solution to speed up or reduce the number of self-joins on the triples table. It could be interesting to develop a new scheme along the same lines that handles self-joins by eliminating actual joins.

REFERENCES

[1] RDF 1.1 Concepts and Abstract Syntax. https://www.w3.org/TR/2014/REC-rdf11-concepts-20140225/, Feb. 2014.
[2] SPARQL 1.1 Query Language. https://www.w3.org/TR/sparql11-query/, Mar. 2013.
[3] E. I. Chong, S. Das, G. Eadon, and J. Srinivasan. An Efficient SQL-based RDF Querying Scheme. In Proc. of VLDB Conference, 1216-1227, 2005.
[4] E. I. Chong. Balancing Act to Improve RDF Query Performance in Oracle Database. Invited Talk, LDBC 8th TUC Meeting, Jun. 2016.
[5] D. J. Abadi, A. Marcus, S. Madden, and K. J. Hollenbach. Scalable Semantic Web Data Management Using Vertical Partitioning. In Proc. of VLDB Conference, 411-422, 2007.
[6] C. Weiss, P. Karras, and A. Bernstein. Hexastore: Sextuple Indexing for Semantic Web Data Management. In Proc. of VLDB Conference, 1008-1019, 2008.
[7] T. Neumann and G. Weikum. RDF-3X: a RISC-style Engine for RDF. In Proc. of VLDB Conference, 647-659, 2008.
[8] O. Erling. Virtuoso, a Hybrid RDBMS/Graph Column Store. Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, 35(1), 3-8, 2012.
[9] Lehigh University Benchmark. http://swat.cse.lehigh.edu/projects/lubm/, Jul. 2005.
[10] Oracle Database In-Memory Guide. http://docs.oracle.com/database/122/INMEM/title.htm, Jan. 2017.
[11] Snowflake Schema. https://en.wikipedia.org/wiki/Snowflake_schema
[12] B. H. Bloom. Space/Time Trade-Offs in Hash Coding with Allowable Errors. CACM, 13(7), 422-426, 1970.
[13] M. Janik and K. Kochut. BRAHMS: A WorkBench RDF Store and High Performance Memory System for Semantic Association Discovery. In Proc. of ISWC Conference, 2005.
[14] R. Binna, W. Gassler, E. Zangerle, D. Pacher, and G. Specht. SpiderStore: Exploiting Main Memory for Efficient RDF Graph Representation and Fast Querying. In Workshop on Semantic Data Management, 2010.
[15] B. Motik, Y. Nenov, R. Piro, I. Horrocks, and D. Olteanu. Parallel Materialisation of Datalog Programs in Centralised, Main-Memory RDF Systems. In Proc. of AAAI Conference, 129-137, 2014.
[16] T. Neumann and G. Weikum. Scalable Join Processing on Very Large RDF Graphs. In Proc. of SIGMOD Conference, 627-640, 2009.
[17] A. Mishra, et al. Accelerating Analytics with Dynamic In-Memory Expressions. In Proc. of VLDB Conference, 1437-1448, 2016.
[18] T. Lahiri, et al. Oracle Database In-Memory: A Dual Format In-Memory Database. In Proc. of ICDE Conference, 1253-1258, 2015.
[19] J. D. Fernández, M. A. Martínez-Prieto, and C. Gutierrez. Compact Representation of Large RDF Data Sets for Publishing and Exchange. In Proc.
of ISWC Conference, 193-208, 2010.