Speaker: Pavel Pontryagin, Senior Engineer at Peter-Service
Video: http://www.youtube.com/watch?v=Sf81x5V8xKY&list=PLqcm6qE9lgKLoYaakl3YwIWP4hmGsHm5e&index=7
Data volumes keep growing, and in telecommunications it is painful to support and scale RDBMS systems. This presentation shows how we switched from SQL to NoSQL. It gives an overview of the following aspects:
* how we model the schema for call data in NoSQL vs SQL
* what hardware architecture we use
* NoSQL vs SQL insert-select performance
* how we store graph data using C*
C* Summit EU 2013: Using Cassandra in a Telco Storage System
1. Using Cassandra in a Telco Storage System
Pavel Pontryagin, senior developer
#CASSANDRAEU CASSANDRASUMMITEU
2. What we will discuss today is how we switched from SQL to NoSQL
* What hardware architecture we use
* How we model schema for call, client and
graph data using NoSQL vs SQL
* NoSQL vs SQL insert-select performance
3. SQL
What and how much is Telco data
4. What is a CDR (call detail record)? Under the hood.
5. Scaling data in a traditional schema
500 million/day
100 million/day
6. Unanswered questions. How to:
* Speed up processing
* Increase reliability
* Decrease costs
* Unify units
* Add new functionality
The solution for us: switch to Cassandra
9. Data modeling. Calls
SQL data
started         numA       numB       Event type  Lac   Cell   IMSI   IMEI
20131015210458  765-23-14  765-23-18  VOICE       2901  35140  IMSI1  IMEI1
Query
SELECT
    started, numA, numB
FROM
    events_1003_main
WHERE
    started > sysdate - 1 AND numA = '765-23-14';
10. noSQL data
“raw” data CF CALLS_201301
  row key uuid1 -> value 20131015210458,765-23-14,765-23-18,VOICE,2901,35140
index CF CALLS_MSISDN_201301
  row key 201301.765-23-14 -> column {2013-10-15 21/04/58:uuid1} -> value numB
  row key 201301.765-23-18 -> column {2013-10-15 21/04/58:uuid1} -> value numA.IMSI1.IMEI1
Query
SliceQuery<String, Composite, String> sliceQuery; // define query
sliceQuery.setColumnFamily("CALLS_MSISDN_201301"); // define CF
sliceQuery.setKey("201301.765-23-18"); // query the given MSISDN for the given month
Composite start = new Composite();
start.addComponent(0, "201301", AbstractComposite.ComponentEquality.EQUAL); // from date
Composite end = new Composite();
end.addComponent(0, "201303", AbstractComposite.ComponentEquality.GREATER_THAN_EQUAL); // to date
sliceQuery.setRange(start, end, false, 1000); // not reversed, at most 1000 columns
QueryResult<ColumnSlice<Composite, String>> qr = sliceQuery.execute(); // get the result
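The raw-plus-index layout on this slide can be tried out without a cluster. What follows is a minimal in-memory sketch, not the speaker's code: a TreeMap stands in for the sorted columns of one index row, and the names CallIndexSketch, insert and callsFor are hypothetical. It assumes the month bucket is taken from the first six digits of the timestamp, and it stores only the other party's number as the index value, which is simpler than what the slide shows.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** In-memory stand-in for the raw CF plus the MSISDN index CF (illustration only). */
final class CallIndexSketch {
    // "raw" CF: row key (uuid) -> CSV record
    final Map<String, String> raw = new HashMap<>();
    // index CF: row key "<yyyyMM>.<msisdn>" -> sorted columns "<timestamp>:<uuid>" -> other party
    final Map<String, TreeMap<String, String>> byMsisdn = new HashMap<>();

    /** Writes one raw row and one index column for each of the two parties. */
    void insert(String uuid, String started, String numA, String numB, String rest) {
        raw.put(uuid, started + "," + numA + "," + numB + "," + rest);
        String month = started.substring(0, 6);      // assumption: bucket derived from the timestamp
        String column = started + ":" + uuid;
        byMsisdn.computeIfAbsent(month + "." + numA, k -> new TreeMap<>()).put(column, numB);
        byMsisdn.computeIfAbsent(month + "." + numB, k -> new TreeMap<>()).put(column, numA);
    }

    /** Slices one index row between two timestamps (inclusive), then fetches raw records by uuid. */
    List<String> callsFor(String month, String msisdn, String from, String to) {
        List<String> out = new ArrayList<>();
        TreeMap<String, String> row = byMsisdn.get(month + "." + msisdn);
        if (row == null) {
            return out;
        }
        // ';' sorts right after ':', so the upper bound covers every uuid at timestamp `to`
        for (String column : row.subMap(from + ":", true, to + ";", false).keySet()) {
            out.add(raw.get(column.substring(column.indexOf(':') + 1)));
        }
        return out;
    }
}
```

The point of the design is that a read touches exactly one index row (one MSISDN in one month) and then does key lookups in the raw data, so no scan over all calls is needed.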
11. Using SolR
… where num like ‘%123’
Clone
Keep using
+SolR indexing
12. noSQL data
“raw” data CF CALLS_201301
  row key uuid1 -> value 20131015210458,765-23-14,765-23-18,VOICE,2901,35140
index CF CALLS_LAC_CELL_201301
  row key 2013.05.01 15:15 -> column {2901:35140:uuid1} -> value 20131015210458
Query
SliceQuery<String, Composite, Date> sliceQuery; // define query
sliceQuery.setColumnFamily("CALLS_LAC_CELL_201301"); // define CF
sliceQuery.setKey("2013.05.01 15:15"); // query the given time slice
Composite start = new Composite();
start.addComponent(0, "2901", AbstractComposite.ComponentEquality.EQUAL); // from LAC
Composite end = new Composite();
end.addComponent(0, "2901", AbstractComposite.ComponentEquality.GREATER_THAN_EQUAL); // to LAC
sliceQuery.setRange(start, end, false, 1000); // not reversed, at most 1000 columns
QueryResult<ColumnSlice<Composite, Date>> qr = sliceQuery.execute(); // get the result
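The query above pins the first component of the composite column name and lets the remaining components vary. A small in-memory sketch of that prefix slice follows. It is an illustration under assumptions, not the Hector implementation: the class LacCellIndexSketch, the nested Col type and the method names are hypothetical, and LAC and cell are treated as integers.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** In-memory stand-in for the LAC/cell index CF (illustration only). */
final class LacCellIndexSketch {
    /** Composite column name {lac : cell : uuid}, compared component by component. */
    static final class Col {
        final int lac;
        final int cell;
        final String uuid;

        Col(int lac, int cell, String uuid) {
            this.lac = lac;
            this.cell = cell;
            this.uuid = uuid;
        }
    }

    static final Comparator<Col> ORDER = Comparator
            .comparingInt((Col c) -> c.lac)
            .thenComparingInt(c -> c.cell)
            .thenComparing(c -> c.uuid);

    // row key = time slice such as "2013.05.01 15:15"; column value = call start timestamp
    final Map<String, TreeMap<Col, String>> rows = new HashMap<>();

    void index(String timeSlice, int lac, int cell, String uuid, String started) {
        rows.computeIfAbsent(timeSlice, k -> new TreeMap<>(ORDER))
            .put(new Col(lac, cell, uuid), started);
    }

    /** All call uuids in `timeSlice` whose first composite component equals `lac`. */
    List<String> uuidsForLac(String timeSlice, int lac) {
        List<String> out = new ArrayList<>();
        TreeMap<Col, String> row = rows.get(timeSlice);
        if (row == null) {
            return out;
        }
        Col from = new Col(lac, Integer.MIN_VALUE, "");    // first possible column of this LAC
        Col to = new Col(lac + 1, Integer.MIN_VALUE, "");  // first possible column of the next LAC
        for (Col c : row.subMap(from, true, to, false).keySet()) {
            out.add(c.uuid);
        }
        return out;
    }
}
```

Because the columns are sorted by (lac, cell, uuid), all calls for one LAC sit next to each other in the row, and the slice returns them ordered by cell.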
13. Data modeling. Subscribers
SQL data
started         numA       numB       Event type  Lac   Cell   IMSI   IMEI
20131015210458  765-23-14  765-23-18  VOICE       2901  35140  IMSI1  IMEI1
Query
SELECT DISTINCT dn.msisdn, tl.name, t.full
FROM SUBS_DATA.DCT_NUMS dn, SUBS_DATA.SUBS_NUMS_HIST snh,
     subs_data.clnt_attrs_hist cah, SUBS_DATA.DCT_NAMES t, dicts.telcos tl
WHERE (dn.num_id = subn_num_id) AND
      (cah.clnt_clnt_id = snh.clnt_clnt_id) AND
      (t.name_id = cah.reg_name_id) AND
      (tl.telco_id = dn.tlco_tlco_id) AND
      (snh.started > to_date('10.09.2012', 'DD.MM.YYYY')) AND
      (snh.started < to_date('17.09.2013', 'DD.MM.YYYY'))
ORDER BY dn.msisdn;
14. noSQL “raw” subscribers data
first slice
  row key uuid1 -> value Pavel, 765-23-14, address1, document1, 2, 21-01-2013, 1(active_status)
  row key uuid2 -> value Vadim, 765-23-16, address2, document2, 2, 21-01-2013, 1(active_status)
update
  row key uuid3 -> value Pavel, 765-23-14, address1, document1, 2, 21-01-2014, 0(active_status)
index cf SUBS_NAME
  row key Pavel -> column {21-01-2013: document1: 1} -> value uuid1
  row key Pavel -> column {21-01-2014: document1: 0} -> value uuid3
  row key Vadim -> column {21-01-2013: document2: 1} -> value uuid2
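On this slide an update does not overwrite the subscriber: it is written as a new raw row (uuid3) plus a new column in the name index. A minimal in-memory sketch of that append-only pattern follows. The names SubscriberHistorySketch, put, latest and versions are hypothetical, the record is shortened to a few fields, and flipping the date to year-first order so that columns sort chronologically is an assumption of this sketch rather than something the slide states.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** In-memory stand-in for the raw subscriber rows plus the SUBS_NAME index (illustration only). */
final class SubscriberHistorySketch {
    // "raw" CF: uuid -> subscriber record; every update is a new row, nothing is overwritten
    final Map<String, String> raw = new HashMap<>();
    // index CF SUBS_NAME: name -> sorted columns "<yyyy-MM-dd>:<document>:<status>" -> uuid
    final Map<String, TreeMap<String, String>> byName = new HashMap<>();

    /** `date` arrives as dd-MM-yyyy (as on the slide) and is flipped so that columns sort by time. */
    void put(String uuid, String name, String msisdn, String document, String date, int status) {
        raw.put(uuid, name + ", " + msisdn + ", " + document + ", " + date + ", " + status);
        String[] p = date.split("-");
        String sortable = p[2] + "-" + p[1] + "-" + p[0];
        byName.computeIfAbsent(name, k -> new TreeMap<>())
              .put(sortable + ":" + document + ":" + status, uuid);
    }

    /** Latest raw record for a name, or null if the name is unknown. */
    String latest(String name) {
        TreeMap<String, String> row = byName.get(name);
        return row == null ? null : raw.get(row.lastEntry().getValue());
    }

    /** Number of stored versions for a name. */
    int versions(String name) {
        TreeMap<String, String> row = byName.get(name);
        return row == null ? 0 : row.size();
    }
}
```

Keeping every version makes the history queryable by date, and the status component in the column name lets a reader see whether the latest version is active without fetching the raw row.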
16. noSQL graph data
First record from the first switch
uuid   started     numA     numB     Event type  subscriber
uuid1  2013/01/12  msisdn1  msisdn2  VOICE       1
Second record from the second switch
uuid   started     numA     numB     Event type  subscriber
uuid2  2013/01/12  msisdn1  msisdn2  VOICE       1
CALLS_EDGES column family
  row key 2013/01/12.msisdn1 -> column {VOICE: msisdn2: 1:uuid1} -> value OUT
  row key 2013/01/12.msisdn1 -> column {VOICE: msisdn2: 2:uuid2} -> value OUT
  row key 2013/01/12.msisdn2 -> column {VOICE: msisdn1: 1:uuid1} -> value IN
  row key 2013/01/12.msisdn2 -> column {VOICE: msisdn1: 2:uuid2} -> value IN
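The edge layout stores every call twice: as an OUT column in the caller's row and as an IN column in the callee's row, so either side of the graph can be read from a single row. A minimal in-memory sketch, assuming string-encoded column names, follows. CallEdgesSketch, addCall and neighbours are hypothetical names, and the third column component (1 and 2 on the slide) is carried through as an opaque integer because the slide does not say what it means.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** In-memory stand-in for the CALLS_EDGES column family (illustration only). */
final class CallEdgesSketch {
    // row key "<day>.<msisdn>" -> sorted columns "<type>:<peer>:<n>:<uuid>" -> "OUT" or "IN"
    final Map<String, TreeMap<String, String>> edges = new HashMap<>();

    /** Writes the OUT edge into the caller's row and the IN edge into the callee's row. */
    void addCall(String uuid, String day, String numA, String numB, String type, int n) {
        row(day, numA).put(type + ":" + numB + ":" + n + ":" + uuid, "OUT");
        row(day, numB).put(type + ":" + numA + ":" + n + ":" + uuid, "IN");
    }

    private TreeMap<String, String> row(String day, String msisdn) {
        return edges.computeIfAbsent(day + "." + msisdn, k -> new TreeMap<>());
    }

    /** Distinct peers of `msisdn` on `day` in the given direction ("OUT" or "IN"). */
    List<String> neighbours(String day, String msisdn, String direction) {
        List<String> out = new ArrayList<>();
        TreeMap<String, String> r = edges.get(day + "." + msisdn);
        if (r == null) {
            return out;
        }
        for (Map.Entry<String, String> e : r.entrySet()) {
            String peer = e.getKey().split(":")[1];   // second component of the column name
            if (e.getValue().equals(direction) && !out.contains(peer)) {
                out.add(peer);
            }
        }
        return out;
    }
}
```

Two records of the same call from different switches produce two columns per row here, as on the slide, while the neighbour list still reports each peer once.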
17. Summary
* Loading speed (write, rec/sec)
    Oracle     10 000
    Cassandra  5 000 per node
* Query speed (seconds)
    Oracle     > 1, depends on plans, statistics etc.
    Cassandra  < 1, stable
* Fun
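To put the per-node write figure next to the daily volumes quoted earlier in the deck, here is a back-of-the-envelope sketch. It assumes the 500 million/day and 100 million/day figures are records per day, that load is spread evenly over the day, and that write throughput scales linearly with node count. It ignores replication, peak hours and read traffic, so a real cluster would need more headroom. IngestSizingSketch and its methods are hypothetical names.

```java
/** Rough ingest sizing from a daily record count and a per-node write rate (illustration only). */
final class IngestSizingSketch {
    static final long SECONDS_PER_DAY = 24L * 60 * 60;

    /** Average records per second needed to absorb `recordsPerDay`. */
    static double averageRate(long recordsPerDay) {
        return (double) recordsPerDay / SECONDS_PER_DAY;
    }

    /** Smallest node count whose combined write rate covers the average load, assuming linear scaling. */
    static long minNodes(long recordsPerDay, long writesPerNodePerSec) {
        long needed = (recordsPerDay + SECONDS_PER_DAY - 1) / SECONDS_PER_DAY;   // rounded up
        return (needed + writesPerNodePerSec - 1) / writesPerNodePerSec;         // rounded up
    }
}
```

Under these assumptions 500 million records per day averages about 5 787 records per second, which needs at least two nodes at 5 000 writes per second each, while 100 million per day fits the average write rate of a single node.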
18. What we discussed today…
* How to model "traditional" data
* How to add full-text search to intensive data load
* A couple of hardware issues