Actian Matrix is a columnar database that scales out horizontally on commodity hardware. It uses a leader node and a set of compute nodes, and slices data dynamically to distribute query processing across the nodes, so queries on very large datasets run in parallel. Matrix has shown strong performance and scalability on the TPC-H benchmark, processing a 100GB dataset on 16 nodes in a very short time. Its near-linear scale-out as nodes and disks are added is the key to its performance on big data workloads.
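As a rough illustration of how a distribution key can spread rows across slices so that each compute node works on only its own share, here is a minimal C++ sketch. The names `slice_for_key` and `rows_per_slice` are hypothetical, and the hash-modulo placement is an assumption for illustration only; the slides do not describe the placement scheme Matrix actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical helper: choose a slice for a row from its distribution key.
// Assumes simple hash-modulo placement; num_slices must be greater than 0.
std::size_t slice_for_key(std::int64_t dist_key, std::size_t num_slices) {
    return std::hash<std::int64_t>{}(dist_key) % num_slices;
}

// Count how many rows land on each slice for a given list of key values.
std::vector<std::size_t> rows_per_slice(const std::vector<std::int64_t>& keys,
                                        std::size_t num_slices) {
    std::vector<std::size_t> counts(num_slices, 0);
    for (std::int64_t key : keys) {
        ++counts[slice_for_key(key, num_slices)];
    }
    return counts;
}
```

Rows with the same key value always land on the same slice, which is what lets a join on the distribution key run locally on each node in this simplified model.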
4. Amazon Redshift adopts Matrix
“Actian has an industry leading solution and is rethinking database cloud – we’re excited to back such a strong team.”
- Jeff Blackburn, SVP of Business Development for Amazon
• Amazon Redshift is the fastest growing service in their portfolio
• Selected after deep evaluation against all competitors based on price-performance value proposition of Actian’s platform
• Amazon Redshift service has over 1000 new customers since launch in Feb ’13
• Actian complements Redshift with on-premise, high-scale analytics suite and support
Actian Analytics Platform Underpins AMAZON REDSHIFT
Created New Cloud Service driving $50M+ revenue annually.
5. Does it really scale?
CentOS 6.4 64bit
Intel Xeon L5640 2.26GHz (2 cores only)
8GB
SATA SSD × 2 (RAID 0, shared by all nodes)
Actian Matrix 5.1
Virtual
Gigabit Ethernet
x N = ?
12. Query execution example
• TPC-H Q16
select
    p_brand, p_type, p_size, count(distinct ps_suppkey) as supplier_cnt
from
    partsupp, part
where
    p_partkey = ps_partkey
    and p_brand <> 'Brand#15'
    and p_type not like 'STANDARD POLISHED%'
    and p_size in (3, 8, 49, 19, 29, 9, 47, 32)
    and ps_suppkey not in (
        select s_suppkey
        from supplier
        where s_comment like '%Customer%Complaints%'
    )
group by
    p_brand, p_type, p_size
order by
    supplier_cnt desc, p_brand, p_type, p_size;
17. Smart compression: less than half the size!
-- With per-column compression encodings
create table lineitem (
l_orderkey int8 not null encode delta sortkey distkey,
l_partkey int4 not null,
l_suppkey int4 not null encode mostly16,
l_linenumber int4 not null encode mostly8,
l_quantity numeric(19,2) not null encode bytedict,
l_extendedprice numeric(19,2) not null encode mostly32,
l_discount numeric(19,2) not null encode mostly8,
l_tax numeric(19,2) not null encode mostly8,
l_returnflag char(1) not null encode runlength,
l_linestatus char(1) not null encode runlength,
l_shipdate date not null encode delta,
l_commitdate date not null encode delta,
l_receiptdate date not null encode delta,
l_shipinstruct char(25) not null encode bytedict,
l_shipmode char(10) not null encode bytedict,
l_comment varchar(44) not null
);
-- Without compression encodings
create table lineitem (
l_orderkey int8 not null sortkey distkey,
l_partkey int4 not null,
l_suppkey int4 not null,
l_linenumber int4 not null,
l_quantity numeric(19,2) not null,
l_extendedprice numeric(19,2) not null,
l_discount numeric(19,2) not null,
l_tax numeric(19,2) not null,
l_returnflag char(1) not null,
l_linestatus char(1) not null,
l_shipdate date not null,
l_commitdate date not null,
l_receiptdate date not null,
l_shipinstruct char(25) not null,
l_shipmode char(10) not null,
l_comment varchar(44) not null
);
Row count: 600,037,902
Uncompressed: 40,204 MB → compressed: 18,900 MB!
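To give a feel for why `encode runlength` suits low-cardinality columns such as `l_returnflag`, here is a minimal C++ sketch of run-length encoding over a column of single-character values. The function name `run_length_encode` is hypothetical, and this only illustrates the general technique; it does not describe Matrix's on-disk format.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: collapse consecutive repeated values into
// (value, run length) pairs. Each character stands for one row's value.
std::vector<std::pair<char, std::size_t>> run_length_encode(const std::string& column) {
    std::vector<std::pair<char, std::size_t>> runs;
    for (char value : column) {
        if (!runs.empty() && runs.back().first == value) {
            ++runs.back().second;        // extend the current run
        } else {
            runs.emplace_back(value, 1); // start a new run
        }
    }
    return runs;
}
```

The longer the runs of identical values, the fewer pairs need to be stored, which is why this kind of encoding tends to pay off on flag and status columns.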
18. The secret ingredient: UDFs (user-defined functions)
• PL/pgSQL example
CREATE OR REPLACE FUNCTION f_echo(_text varchar)
RETURNS varchar AS
$$
BEGIN
return _text;
END;
$$
LANGUAGE plpgsql;
• C++ example
#include "padb_udf.hpp"
PADB_UDF_VERSION(charcount);
extern "C"
{
    // Counts how many times the single character in tst occurs in target.
    padb_udf::int_t charcount ( padb_udf::ScalarArg &aux,
        padb_udf::varchar_t *target, padb_udf::varchar_t *tst )
    {
        padb_udf::int_t ret = 0;
        // The probe argument must be exactly one character long.
        if ( tst->len != 1 )
        {
            aux.throwError( __func__,"probe length must be = 1" );
        }
        char ch = tst->str[0];
        for ( padb_udf::len_t ix = 0; ix < target->len; ix++ )
        {
            if ( target->str[ix] == ch )
            {
                ret++;
            }
        }
        return aux.retIntVal( ret );
    }
}
CREATE OR REPLACE FUNCTION charcount
(target_string varchar, search_character varchar)
RETURNS int
AS '/tmp/scalar_charcount.o'
LANGUAGE C STABLE;
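Because `padb_udf.hpp` ships only with a Matrix installation, the counting logic can be checked on its own first. The following is a minimal standalone sketch using `std::string` in place of `padb_udf::varchar_t`; the name `count_char` and the use of an exception in place of `aux.throwError` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical standalone version of the charcount logic:
// counts occurrences of the single character in probe within target.
int count_char(const std::string& target, const std::string& probe) {
    if (probe.size() != 1) {
        // Mirrors the UDF's "probe length must be = 1" check.
        throw std::invalid_argument("probe length must be = 1");
    }
    int count = 0;
    for (char c : target) {
        if (c == probe[0]) {
            ++count;
        }
    }
    return count;
}
```

For example, `count_char("banana", "a")` returns 3, and a probe of any other length throws.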
• Embed the user's own business logic
• Third-party analytic functions
• Enables ODI (On-Demand Integration)
25. You can write queries like this
select word, sum(wordcount) from odi_hadoop_import(
with jobname('googlebooks_job')
masternode('hadoop01.com')
inputdir('/user/dhirama/googlebooksModifyDemo')
padb_schema('googlebooksWords')
delimiter('\t')
)
where word like 'another%'
group by word;