Title:
联动优势 Data Access Layer (DAL) Architecture and Practice, Part 5: Data Sharding
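Description:
How to implement a dalet that accesses sharded databases.
Unlike existing DAL software (such as 许超前's DAL at 手机之家, 陈思儒's Amoeba, and 贺贤懋's Cobar), this design drops JDBC as the front-end access method. Instead, the same dalet data service exposes two interfaces: a custom TCP persistent connection and an HTTP persistent connection.
Dropping JDBC brings several benefits:
1) It reduces the server-side cost of protocol parsing and query analysis.
2) It simplifies client-side programming.
3) Back-end storage is no longer limited to relational databases; it can be any NoSQL store, file, cache, or even an online service such as Tuxedo.
4) The service can be stateless, which makes horizontal scaling easier.
5) The interface itself rules out misuse of keywords such as join, which keeps the server from being overloaded.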
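The routing idea behind a sharded dalet can be sketched as follows. This is a minimal illustration, not the talk's implementation: `SHARDS`, `shard_for`, `put_user`, and `get_user` are hypothetical names, in-memory dicts stand in for the real back ends, and CRC32-modulo is just one possible sharding rule.

```python
import zlib

# Hypothetical back ends: plain dicts stand in for any store (RDB, NoSQL, cache).
SHARDS = [dict() for _ in range(4)]


def shard_for(key: str, shard_count: int = len(SHARDS)) -> int:
    """Map a sharding key to a shard index with a stable hash (CRC32)."""
    return zlib.crc32(key.encode("utf-8")) % shard_count


def put_user(uid: str, record: dict) -> None:
    """Hypothetical dalet write: route by uid and keep no per-client state."""
    SHARDS[shard_for(uid)][uid] = record


def get_user(uid: str):
    """Hypothetical dalet read: a key lookup only, so no join can be expressed."""
    return SHARDS[shard_for(uid)].get(uid)


put_user("u1001", {"name": "alice"})
put_user("u1002", {"name": "bob"})
print(get_user("u1001"))
```

Because each call carries the sharding key and the service holds no session state, any instance can answer any request, which is what makes horizontal scaling straightforward.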
30. Hive Statements
QUERIES
SELECT ... FROM ... WHERE ...
GROUP BY ... ORDER BY ... LIMIT
JOIN
DDL
database, table (location, partitions)
function, index, view
DML
Loading files into tables
Inserting data into Hive Tables from queries
Writing data into filesystem from queries
31. Hive Improvements
Multiple DISTINCT aggregates in one query are not supported
bash:
# assumes uid is column 1 and urs is column 2 of t_mall_buy.txt; adjust to the real layout
awk '{print $1}' t_mall_buy.txt | sort | uniq | wc -l
awk '{print $2}' t_mall_buy.txt | sort | uniq | wc -l
mysql:
SELECT count(distinct uid), count(distinct urs)
FROM t_mall_buy
hive (one query per column):
SELECT count(distinct uid) FROM t_mall_buy
SELECT count(distinct urs) FROM t_mall_buy
Patches adding support:
https://issues.apache.org/jira/browse/HIVE-287
https://issues.apache.org/jira/browse/HIVE-474
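To make concrete what a multi-distinct query returns, here is a small sketch that counts distinct values for several columns in a single pass. It is not Hive code; the function name `count_distinct`, the sample rows, and the column layout (uid first, urs second, whitespace-separated) are assumptions for illustration only.

```python
import io


def count_distinct(lines, columns):
    """Return the number of distinct values seen in each requested column."""
    seen = {col: set() for col in columns}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        for col in columns:
            if col < len(fields):
                seen[col].add(fields[col])
    return {col: len(values) for col, values in seen.items()}


# Made-up sample rows; assumed layout: uid in column 0, urs in column 1.
sample = io.StringIO("1 a@x\n1 b@x\n2 a@x\n")
counts = count_distinct(sample, columns=(0, 1))
print(counts[0], counts[1])
```

Keeping one set per column is the single-scan equivalent of the MySQL query above, whereas the two bash pipelines and the two Hive queries each scan the data separately.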
32. Hive Improvements
EXISTS and IN subqueries are not supported
NOT IN and NOT LIKE syntax is supported as of version 0.8
Indexes are not supported
Not available until the 0.7 release
INSERT INTO is not supported
INSERT INTO syntax is only available starting in version 0.8