Testing tools
Wojciech.Biela@teradata.com
Łukasz.Osipiuk@teradata.com
Karol.Sobczak@teradata.com
Why we need them
● Certified distro
● Enterprise support
● Quarterly releases
● Product testing - Tempto
● Performance testing - Benchto
Tempto
Product test framework
github.com/prestodb/tempto
Łukasz Osipiuk
lukasz.osipiuk@teradata.com
What is Tempto?
● End-to-end product testing framework
● Targeted to software engineers
● For automation
● Tests easy to define
● Focus on test code
● Focus on database systems
● So far used for testing
○ Presto
○ internal projects
How is a test defined?
● Java
● SQL convention based
Example – Java based test
public class SimpleQueryTest extends ProductTest {
    private static class SimpleTestRequirements implements RequirementsProvider {
        public Requirement getRequirements(Configuration config) {
            return new ImmutableHiveTableRequirement(NATION);
        }
    }

    @Inject
    Configuration configuration;

    @Test(groups = {"smoke", "query"})
    @Requires(SimpleTestRequirements.class)
    public void selectCountFromNation() {
        assertThat(query("select count(*) from nation"))
            .hasRowsCount(1)
            .hasRows(row(25));
    }
}
Example – Convention based test
allRows.sql:
-- database: hive; tables: blah
SELECT * FROM sample_table
allRows.result:
-- delimiter: |; ignoreOrder: false; types: BIGINT,VARCHAR
1|A|
2|B|
3|C|
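The header comments in the .sql and .result files carry the test metadata (database, delimiter, expected types, and so on). A minimal sketch of how such a header line could be parsed into key/value options; the parsing details are an illustration, not Tempto's actual implementation:

```python
def parse_header(line: str) -> dict:
    """Parse a convention-test header comment like
    '-- delimiter: |; ignoreOrder: false; types: BIGINT,VARCHAR'
    into a dict of options (illustrative sketch, not Tempto's code)."""
    body = line.lstrip("-").strip()          # drop the SQL comment marker
    options = {}
    for pair in body.split(";"):
        key, _, value = pair.partition(":")
        options[key.strip()] = value.strip()
    return options

header = parse_header("-- delimiter: |; ignoreOrder: false; types: BIGINT,VARCHAR")
```

With the header parsed, the framework knows how to split each line of the .result file and which types to coerce the cells to before comparing against the query output.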
Tempto architecture
[Architecture diagram: user-provided tests and requirements build on library-provided components: the TestNG engine, TestNG listeners, utils, and requirement fulfillers]
Tempto architecture: TestNG
● Works well
● Extensible
● Well known
Tempto architecture: TestNG listeners
● Tempto-specific extension of the TestNG execution framework
● Requirements management
● Test filtering
● Dependency injection
● Extended logging
Tempto architecture: tests
● Test code :)
○ Java
○ SQL-convention based
Tempto architecture: requirements
● Declarative requirements
● Fulfilled by the test framework via pluggable fulfillers
● e.g. mutableTable(Tpch.NATION, LOADED, "hive")
● Test level and suite level
● Cleanup
Tempto architecture: utils
● Extra assertions
● Various tools
○ HDFS client
○ SSH client
○ JDBC query executor
Executable runner
java -jar target/presto-product-tests-0.120-SNAPSHOT-executable.jar --help
usage: Presto product tests
--config-local <arg> URI to Test local configuration YAML file.
--report-dir <arg> Test reports directory
--groups <arg> Test groups to be run
--excluded-groups <arg> Test groups to be excluded
--tests <arg> Test patterns to be included
-h,--help Shows help message
● All dependencies embedded
● User provides cluster details through a YAML config
Configuration
hdfs:
  username: hdfs
  webhdfs:
    host: master
    port: 50070
tests:
  hdfs:
    path: /product-test
databases:
  default:
    alias: presto
  hive:
    jdbc_driver_class: org.apache.hive.jdbc.HiveDriver
    jdbc_url: jdbc:hive2://master:10000
    jdbc_user: hdfs
    jdbc_password: na
    jdbc_pooling: false
    jdbc_jar: test-framework-hive-jdbc-all.jar
  presto:
    jdbc_driver_class: com.facebook.presto.jdbc.PrestoDriver
    jdbc_url: jdbc:presto://localhost:8080/hive/default
    jdbc_user: hdfs
    jdbc_password: na
    jdbc_pooling: false
Benchto
macro benchmarking framework
github.com/teradata/benchto (very soon)
Karol Sobczak
karol.sobczak@teradata.com
Goals
● Easy and manageable way to define benchmarks
● Run and analyze macro benchmarks in clustered environment
● Repeatable benchmarking of Hadoop SQL engines, most importantly Presto
○ also used for Hive, Teradata components
● Transparent, trusted framework for benchmarking
Benchmarks - model
[Model diagram: a BenchmarkRun has n QueryExecutions; each QueryExecution has n Measurements; a BenchmarkRun also has n AggregatedMeasurements computed across its executions]
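The relationships above (a benchmark run holds many query executions, with measurements hanging off both levels) can be sketched as plain data classes; the class names follow the diagram, while the fields are illustrative assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class Measurement:
    name: str      # e.g. "duration" (illustrative field names)
    value: float
    unit: str      # e.g. "MILLISECONDS"

@dataclass
class QueryExecution:
    sequence_id: int
    measurements: list = field(default_factory=list)   # n measurements per execution

@dataclass
class BenchmarkRun:
    name: str
    executions: list = field(default_factory=list)             # n query executions per run
    aggregated_measurements: list = field(default_factory=list)  # e.g. mean duration over executions

# build a tiny run with one execution and one measurement
run = BenchmarkRun("tpch-q14")
run.executions.append(QueryExecution(0, [Measurement("duration", 1234.0, "MILLISECONDS")]))
```

The aggregated measurements let the GUI compare whole runs without re-scanning every per-execution measurement.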
Benchmarks - execution
● before-benchmark-macros
● prewarm
● benchmark
○ execution-0
○ execution-1
○ ...
○ execution-n
● after-benchmark-macros
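The lifecycle above can be sketched as a driver loop. Macro names and run counts come from the descriptor; the function and key names here are hypothetical, not Benchto's actual API:

```python
def run_benchmark(benchmark, executors):
    """Hypothetical sketch of the execution lifecycle:
    before-benchmark macros -> prewarm runs (discarded) ->
    measured executions -> after-benchmark macros."""
    results = []
    for macro in benchmark.get("before-benchmark", []):
        executors["macro"](macro)                  # e.g. drop-caches
    for _ in range(benchmark.get("prewarm-runs", 0)):
        executors["query"](benchmark["query"])     # warm caches, result discarded
    for _ in range(benchmark["runs"]):
        results.append(executors["query"](benchmark["query"]))  # execution-0 .. execution-n
    for macro in benchmark.get("after-benchmark", []):
        executors["macro"](macro)
    return results

# dry run with stub executors that just record the calls
calls = []
executors = {
    "macro": lambda name: calls.append(("macro", name)),
    "query": lambda sql: calls.append(("query", sql)) or len(calls),
}
bench = {"before-benchmark": ["drop-caches"], "prewarm-runs": 1,
         "runs": 2, "query": "SELECT 1"}
results = run_benchmark(bench, executors)
```

Separating prewarm runs from measured executions is what makes the measurements repeatable: caches are in a known state before execution-0 starts.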
Defining benchmarks - structure
● Benchmarks are defined by convention through YAML descriptors and SQL query files
$ tree .
.
├── application-presto-devenv.yaml
├── application-td-hdp.yaml
├── benchmarks
│ ├── presto
│ │ ├── concurrency-insert-multi-table.yaml
│ │ ├── concurrency.yaml
│ │ ├── linear-scan.yaml
│ │ ├── tpch.yaml
│ │ └── types.yaml
│ └── querygrid-presto-ansi
│ └── concurrency.yaml
└── sql
├── presto
│ ├── dev-zero
│ │ ├── create-alltypes.sql
│ │ └── create-lineitem.sql
│ ├── linear-scan
│ │ ├── selectivity-0.sql
│ │ ├── selectivity-100.sql
...
Defining benchmarks - descriptor
● A descriptor is a YAML configuration file with various properties and user-defined
variables
$ cat benchmarks/presto/concurrency.yaml
datasource: presto
query-names: presto/linear-scan/selectivity-${selectivity}.sql
schema: tpch_100gb_orc
database: hive
concurrency: ${concurrency_level}
runs: ${concurrency_level}
prewarm-runs: 3
before-benchmark: drop-caches
variables:
  1:
    selectivity: 10, 100
    concurrency_level: 10
  2:
    selectivity: 10, 100
    concurrency_level: 20
  3:
    selectivity: 10, 100
    concurrency_level: 50
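Each numbered variables block yields a set of benchmark variants. Assuming comma-separated values multiply out as a cartesian product within each block (an inference from the descriptor above, not confirmed Benchto behavior), the expansion can be sketched as:

```python
from itertools import product

def expand_variables(variable_sets: dict) -> list:
    """Expand descriptor variable blocks into concrete variants.
    ASSUMPTION: comma-separated values within a block combine
    as a cartesian product."""
    variants = []
    for _, variables in sorted(variable_sets.items()):
        keys = list(variables)
        value_lists = [[v.strip() for v in str(variables[k]).split(",")]
                       for k in keys]
        for combo in product(*value_lists):
            variants.append(dict(zip(keys, combo)))
    return variants

# blocks 1 and 2 from the descriptor above
variants = expand_variables({
    1: {"selectivity": "10, 100", "concurrency_level": "10"},
    2: {"selectivity": "10, 100", "concurrency_level": "20"},
})
```

Under that assumption, each block with two selectivity values produces two variants, so the two blocks above expand to four benchmark runs in total.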
Defining benchmarks – SQL file templating
● SQL files can use keys defined in the YAML configuration file; templates are
based on FreeMarker
$ cat sql/presto/tpch/q14.sql
SELECT 100.00 * sum(CASE
WHEN p.type LIKE 'PROMO%'
THEN l.extendedprice * (1 - l.discount)
ELSE 0
END) / sum(l.extendedprice * (1 - l.discount)) AS promo_revenue
FROM
"${database}"."${schema}"."lineitem" AS l,
"${database}"."${schema}"."part" AS p
WHERE
l.partkey = p.partkey
AND l.shipdate >= DATE '1995-09-01'
AND l.shipdate < DATE '1995-09-01' + INTERVAL '1' MONTH
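Benchto uses FreeMarker for the templating itself; Python's standard-library string.Template happens to share the same ${name} placeholder syntax, so the substitution step can be sketched as:

```python
from string import Template

# same ${database}/${schema} placeholders as the q14.sql excerpt above
sql = Template('SELECT * FROM "${database}"."${schema}"."lineitem"')
rendered = sql.substitute(database="hive", schema="tpch_100gb_orc")
```

This is why one SQL file can serve every schema/database combination the descriptors define: the variables are bound per benchmark run, not baked into the query.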
Future work
● (Tempto) Support for complex concurrent tests execution
● (Benchto) Automatic regression detection
● (Benchto) Customized dashboards (e.g. overall performance analysis)
● (Benchto) Hardware and configuration awareness
● (Benchto) More complex benchmarking scenarios
● (Benchto) Support for complex concurrency scenarios
● (Benchto) Scheduling mechanism
Questions?
Benchto GUI
● Visualization of benchmarks results
● Linking between tools (Grafana, Presto UI)
● Comparison of multiple benchmarks
Grafana monitoring
● We use Grafana dashboards with Graphite
● Benchmark/execution life-cycle events are shown on dashboards
● Provides good visibility into the state of the cluster

Presto Testing Tools: Benchto & Tempto (Presto Boston Meetup 10062015)