my experience in spark tunning. All tests are made in production environment(600+ node hadoop cluster). The tunning result is useful for Spark SQL use case.
MongoDB Background and specifics ,
also I provide how to use Mongod Security .
and Basic MongoDB operation by pymongo
我這份文件有介紹MONGODB的特性及限制,Sharding 及 Replicate 的觀悠,Security怎麼作,怎麼用
Build 1 trillion warehouse based on carbon databoxu42
Apache CarbonData & Spark Meetup
Build 1 trillion warehouse based on CarbonData
Huawei
Apache Spark™ is a unified analytics engine for large-scale data processing.
CarbonData is a high-performance data solution that supports various data analytic scenarios, including BI analysis, ad-hoc SQL query, fast filter lookup on detail record, streaming analytics, and so on. CarbonData has been deployed in many enterprise production environments, in one of the largest scenario it supports queries on single table with 3PB data (more than 5 trillion records) with response time less than 3 seconds!
my experience in spark tunning. All tests are made in production environment(600+ node hadoop cluster). The tunning result is useful for Spark SQL use case.
MongoDB Background and specifics ,
also I provide how to use Mongod Security .
and Basic MongoDB operation by pymongo
我這份文件有介紹MONGODB的特性及限制,Sharding 及 Replicate 的觀悠,Security怎麼作,怎麼用
Build 1 trillion warehouse based on carbon databoxu42
Apache CarbonData & Spark Meetup
Build 1 trillion warehouse based on CarbonData
Huawei
Apache Spark™ is a unified analytics engine for large-scale data processing.
CarbonData is a high-performance data solution that supports various data analytic scenarios, including BI analysis, ad-hoc SQL query, fast filter lookup on detail record, streaming analytics, and so on. CarbonData has been deployed in many enterprise production environments, in one of the largest scenario it supports queries on single table with 3PB data (more than 5 trillion records) with response time less than 3 seconds!
The Construction and Practice of Apache Pegasus in Offline and Online Scenari...acelyc1112009
A presentation in Apache Pegasus meetup in 2022 from Wei Wang.
Apache Pegasus is a horizontally scalable, strongly consistent and high-performance key-value store.
Know more about Pegasus https://pegasus.apache.org, https://github.com/apache/incubator-pegasus
11. MyFOX—数据查询
SELECT IF(INSTR(f.keyword,' ') >
0, UPPER(TRIM(f.keyword)), CONCAT(b.brand_name,'
',UPPER(TRIM(f.keyword)))) AS f0,
SUM(f.search_num) AS f1,
SUM(f.uv) AS f2,
ROUND(SUM(f.search_num) / SUM(f.uv), 2) AS f3,
AVG(f.uv) AS f4
FROM f
INNER JOIN dim_brand b ON f.keyword_brand_id = b.brand_id
WHERE f.keyword_type_id = 1 AND f.keyword != ''
AND keyword_cat_id IN ('50002535')
AND thedate <= '2011-03-10'
AND thedate >= '2011-03-08'
GROUP BY f0
ORDER BY SUM(f.search_num) DESC LIMIT 0, 1500
13. MyFOX路由层—语义理解
WHERE thedate <= '2011-03-10'
AND thedate > '2011-03-07'
AND toprank_id IN (2, 3)
2 3
{"toprank_id":"2", {"toprank_id":“3",
2011-03-08
"thedate":"2011-03-08"} "thedate":"2011-03-08"}
{"toprank_id":"2", {"toprank_id":“3",
2011-03-09
"thedate":"2011-03-09"} "thedate":"2011-03-09"}
{"toprank_id":"2", {"toprank_id":“3",
2011-03-10
"thedate":"2011-03-10"} "thedate":"2011-03-10"}
14. MyFOX路由层—字段改写
SELECT a AS f0,
SUM(f.search_num) AS f1,
SUM(f.uv) AS f2,
ROUND(SUM(f.search_num) / SUM(f.uv), 2) AS
f3,
AVG(f.uv) AS f4
• AVG(a)
• 1 + SUM(a)
• SELECT a FROM … ORDER BY b
• 重复查询列