Date: 2023-01-09
9.
Presto is NOT…
a database:
- Does not store any data
Instead, it employs a ‘Connector’ architecture
10.
Connector Architecture
- Connectors enable reading from external data sources
- Can query data in different formats in the same query
[Diagram: Presto reads each source through its own connector: Text (Text Connector), Parquet (Parquet Connector), MySQL (MySQL Connector), JSON (JSON Connector)]
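The connector idea can be sketched in a few lines of Python. This is only an illustration of the pattern, not Presto's actual connector interface: the `Connector`, `InMemoryConnector`, and `Engine` classes, the `join_on` helper, and the sample tables are all hypothetical names made up for this example.

```python
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, object]


class Connector:
    """Hypothetical interface: a connector only knows how to read rows."""

    def read(self, table: str) -> Iterator[Row]:
        raise NotImplementedError


class InMemoryConnector(Connector):
    """Stand-in for a real source such as MySQL or a set of Parquet files."""

    def __init__(self, tables: Dict[str, List[Row]]) -> None:
        self._tables = tables

    def read(self, table: str) -> Iterator[Row]:
        return iter(self._tables[table])


class Engine:
    """Stores no data itself; it routes reads to the registered connectors."""

    def __init__(self) -> None:
        self._catalogs: Dict[str, Connector] = {}

    def register(self, catalog: str, connector: Connector) -> None:
        self._catalogs[catalog] = connector

    def scan(self, catalog: str, table: str) -> Iterator[Row]:
        return self._catalogs[catalog].read(table)


def join_on(left: Iterable[Row], right: Iterable[Row], key: str) -> Iterator[Row]:
    """Simple hash join, so rows from two different sources can be combined."""
    index: Dict[object, List[Row]] = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    for row in left:
        for match in index.get(row[key], []):
            yield {**row, **match}


engine = Engine()
engine.register("mysql", InMemoryConnector({
    "shops": [{"shop_id": 1, "name": "A"}, {"shop_id": 2, "name": "B"}],
}))
engine.register("parquet", InMemoryConnector({
    "page_views": [
        {"shop_id": 1, "url": "/x"},
        {"shop_id": 1, "url": "/y"},
        {"shop_id": 2, "url": "/z"},
    ],
}))

# One "query" that reads from two differently backed catalogs.
joined = list(join_on(
    engine.scan("parquet", "page_views"),
    engine.scan("mysql", "shops"),
    "shop_id",
))
```

The engine holds only a catalog-to-connector mapping, which mirrors the point of the slide: the data stays in the external sources.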
11.
Presto is NOT…
a transactional query engine:
- Not designed for queries common in application development:
- e.g. point lookups
Instead, designed for analytic queries
- e.g. full table scans and aggregations
- Note: indexes would not speed up these queries
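A toy Python comparison may help show why an index does not help an analytic query. The rows, the `index_by_id` dictionary, and both functions are invented for illustration; they do not model how Presto executes anything.

```python
# Hypothetical table: nine rows spread over three shops.
rows = [{"id": i, "shop_id": i % 3, "amount": i} for i in range(9)]

# An index maps a key straight to its row, which is what a point lookup needs.
index_by_id = {row["id"]: row for row in rows}


def point_lookup(row_id):
    """Transactional-style query: touches exactly one row via the index."""
    return index_by_id[row_id]


def total_by_shop(all_rows):
    """Analytic-style query: every row contributes, so every row is read."""
    totals = {}
    rows_read = 0
    for row in all_rows:
        rows_read += 1
        totals[row["shop_id"]] = totals.get(row["shop_id"], 0) + row["amount"]
    return totals, rows_read


totals, rows_read = total_by_shop(rows)
```

Here `rows_read` equals the table size no matter which index exists, because the aggregation needs every row.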
12.
Presto Architecture
[Diagram: Coordinator (Queue, Plan, Optimize, Schedule) and Result; three Workers, each with a Processor; Workers Read Data from External Data Sources]
29.
Glossary
TableScan - Scans the table's underlying dataset, using partitions (if any).
Project - Selects the specified columns from the scanned data; can also transform the projected columns.
ScanProject - Combines table scans and column projections into one operator
Filter - Filters out data not matching provided predicates
Aggregate (Partial) - Aggregates data on a single worker
Aggregate (Final) - Combines the partial aggregates into the final result
Limit (Partial) - Applies the limit to the data scanned on a single node
Limit (Final) - Applies the limit again to the combined partial limits
LocalExchange (Single) - Used to read data from another stage
LocalExchange (Round Robin) - Used to read data from multiple stages
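The partial/final split in the glossary can be sketched in Python as follows. This is a simplified model, assuming each inner list holds the `shop_id` values one worker scanned; the function names and sample data are hypothetical and the real operators are more involved.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def partial_aggregate(shop_ids: Iterable[int]) -> Counter:
    """Aggregate (Partial): count rows per shop on a single worker."""
    return Counter(shop_ids)


def final_aggregate(partials: Iterable[Counter]) -> Counter:
    """Aggregate (Final): add up the per-worker counts."""
    total: Counter = Counter()
    for partial in partials:
        total.update(partial)
    return total


def partial_limit(rows: List[int], n: int) -> List[int]:
    """Limit (Partial): each worker forwards at most n rows."""
    return rows[:n]


def final_limit(partials: Iterable[List[int]], n: int) -> List[int]:
    """Limit (Final): apply the limit once more to the combined rows."""
    combined = [row for part in partials for row in part]
    return combined[:n]


def top_n(counts: Counter, n: int) -> List[Tuple[int, int]]:
    """ORDER BY count DESC LIMIT n, with shop_id as a tie-breaker."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Hypothetical shop_id values scanned by three workers.
worker_splits = [[1, 1, 2], [2, 3], [1, 3, 3, 3]]

partials = [partial_aggregate(split) for split in worker_splits]
top_shops = top_n(final_aggregate(partials), 2)
limited = final_limit([partial_limit(split, 2) for split in worker_splits], 2)
```

Aggregating per worker first means only small per-shop counts, not raw rows, have to be combined in the final step.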
31.
Reading Presto Query Plans
EXPLAIN (TYPE DISTRIBUTED)
SELECT shop_id, COUNT(1)
FROM hive.sensitive_partitioned_monorail.monorail_shopify_admin_page_view_1
WHERE _partition_yyyy_mm_dd_hh >= '2019-07-25'
GROUP BY 1
ORDER BY 2 DESC
LIMIT 100;
42.
Parquet File Format
• Columnar data format.
• Each Parquet file is made of multiple “row groups”.
• Each “row group” is made of multiple “data pages”.
• Makes queries that only need a subset of columns efficient.
• Metadata on a file and row group level.
Reference: https://parquet.apache.org/documentation/latest/
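A rough Python model shows why a columnar layout makes column subsets cheap. It is not the Parquet format itself: row groups are represented here as plain dictionaries of column name to values, and `read_columns` and the sample data are hypothetical.

```python
from typing import Dict, List, Tuple

# Each row group keeps the values of one column together.
RowGroup = Dict[str, List[object]]

row_groups: List[RowGroup] = [
    {"shop_id": [1, 2, 3], "url": ["/a", "/b", "/c"], "user_agent": ["x", "y", "z"]},
    {"shop_id": [4, 5], "url": ["/d", "/e"], "user_agent": ["x", "x"]},
]


def read_columns(groups: List[RowGroup], wanted: List[str]) -> Tuple[RowGroup, int]:
    """Read only the wanted columns and report how many values were touched."""
    values_read = 0
    result: RowGroup = {name: [] for name in wanted}
    for group in groups:
        for name in wanted:
            result[name].extend(group[name])
            values_read += len(group[name])
    return result, values_read


one_column, cost_one = read_columns(row_groups, ["shop_id"])
all_columns, cost_all = read_columns(row_groups, ["shop_id", "url", "user_agent"])
```

Reading one of three columns touches a third of the values, which is the saving the slide refers to.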
49.
Partitioning
• Data is stored and separated into different folders called
“partitions” on disk.
• ex. partition_key=value
• There can be multiple layers of partitioning *
• ex. partition_key_1=value_1/partition_key_2=value_2/etc.
• To see the partitions for a table
• SELECT * FROM catalog.schema."table_name$partitions"
Caveat:
• Too many partitions can lead to sub-optimal performance.
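As a small illustration of how `key=value` folder names let an engine skip data, the Python sketch below parses partition paths and keeps only those matching a predicate. The paths, the key names, and the `parse_partition` and `prune` helpers are assumptions made for this example.

```python
from typing import Callable, Dict, List


def parse_partition(path: str) -> Dict[str, str]:
    """Turn 'key_1=value_1/key_2=value_2' into a dictionary of keys to values."""
    segments = [segment for segment in path.split("/") if "=" in segment]
    return dict(segment.split("=", 1) for segment in segments)


def prune(paths: List[str], predicate: Callable[[Dict[str, str]], bool]) -> List[str]:
    """Keep only the partitions whose key values satisfy the predicate."""
    return [path for path in paths if predicate(parse_partition(path))]


# Hypothetical partition folders.
paths = [
    "year=2019/month=01/day=02/hour=03",
    "year=2019/month=07/day=25/hour=00",
    "year=2018/month=12/day=31/hour=23",
]

# Zero-padded strings compare correctly as text, so no date parsing is needed.
kept = prune(paths, lambda p: (p["year"], p["month"]) >= ("2019", "07"))
```

Only the folder names are inspected here; files in the pruned partitions are never opened.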
50.
Partitioning
• We store our monorail data with partitions year, month, day, hour
• e.g. path_to_data/year=2019/month=01/day=02/hour=03
• Partitioning by minute as well would be bad partitioning: it would create too many partitions.
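The arithmetic behind the minute-level warning is easy to check in Python. The function below is only a counting sketch, assuming a 365-day year and one partition per time bucket.

```python
HOURS_PER_DAY = 24
MINUTES_PER_HOUR = 60


def partitions_per_year(days: int = 365, by_minute: bool = False) -> int:
    """Count leaf partitions for year/month/day/hour, optionally with minute."""
    hourly = days * HOURS_PER_DAY
    return hourly * MINUTES_PER_HOUR if by_minute else hourly


hourly_partitions = partitions_per_year()
minute_partitions = partitions_per_year(by_minute=True)
```

Adding the minute level multiplies the partition count by 60 for the same amount of data, so each partition holds correspondingly less.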
51.
File Sizes
• Number of files == number of initial splits
• Find a balance for reading metadata and data
• If files are too small, your query will be degraded by I/O overhead, reading more metadata than data
• The ideal file size matches the HDFS block size (128 MB).
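A back-of-the-envelope Python calculation makes the trade-off concrete. The 64 GiB dataset and the 64 KiB of metadata per file are assumed figures chosen for illustration, not measurements.

```python
MB = 1024 * 1024


def file_count(dataset_bytes: int, file_bytes: int) -> int:
    """Number of files (and so initial splits) needed, rounded up."""
    return -(-dataset_bytes // file_bytes)


def metadata_ratio(file_bytes: int, metadata_bytes: int) -> float:
    """Metadata read per byte of file; higher means more overhead."""
    return metadata_bytes / file_bytes


dataset_bytes = 64 * 1024 * MB   # assumed dataset size: 64 GiB
metadata_bytes = 64 * 1024       # assumed metadata per file: 64 KiB

tiny_files = file_count(dataset_bytes, 1 * MB)
block_files = file_count(dataset_bytes, 128 * MB)

tiny_overhead = metadata_ratio(1 * MB, metadata_bytes)
block_overhead = metadata_ratio(128 * MB, metadata_bytes)
```

Under these assumptions the same dataset needs 128 times as many files, and so initial splits, at 1 MB per file as at 128 MB per file.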
52.
File Sizes
• Buttt what about thick files?
• Bigger row groups (multiple rows).
• More likely to run into memory issues.
53.
Sorted Data
• Presto can read metadata about the row groups.
• This includes min, max, and count stats for each row group.
• Based on the metadata, Presto can skip row groups whose min/max range cannot match the query's filter.
Caveat:
• The initial sorting of the data when writing is costly.
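The effect of sorting on row group skipping can be sketched in Python. This is a simplified model, assuming a range filter on a single integer column; `RowGroupStats`, `stats_for`, and `groups_to_read` are hypothetical names and do not come from Presto or Parquet.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RowGroupStats:
    """The per-row-group metadata mentioned above: min, max, count."""
    min_value: int
    max_value: int
    count: int


def stats_for(values: List[int], group_size: int) -> List[RowGroupStats]:
    """Split values into row groups in write order and record their stats."""
    groups = [values[i:i + group_size] for i in range(0, len(values), group_size)]
    return [RowGroupStats(min(g), max(g), len(g)) for g in groups]


def groups_to_read(stats: List[RowGroupStats], low: int, high: int) -> List[int]:
    """Indexes of row groups whose [min, max] range overlaps [low, high]."""
    return [
        index for index, s in enumerate(stats)
        if s.max_value >= low and s.min_value <= high
    ]


values = [5, 1, 9, 3, 8, 2, 7, 4, 6]

unsorted_stats = stats_for(values, 3)
sorted_stats = stats_for(sorted(values), 3)

# Filter: value BETWEEN 7 AND 9
read_unsorted = groups_to_read(unsorted_stats, 7, 9)
read_sorted = groups_to_read(sorted_stats, 7, 9)
```

With unsorted data every row group's range overlaps the filter, so nothing is skipped; after sorting, only the last row group has to be read.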
54.
Sorted Data
• Can only sort:
• Within bucketed tables.
• On a file level.