Testing tools
Wojciech.Biela@teradata.com
Łukasz.Osipiuk@teradata.com
Karol.Sobczak@teradata.com
Why we need them
● Certified distro
● Enterprise support
● Quarterly releases
● Product testing - Tempto
● Performance testing - Benchto
Tempto
Product test framework
github.com/prestodb/tempto
Łukasz Osipiuk
lukasz.osipiuk@teradata.com
What is Tempto?
● End-to-end product testing framework
● Targeted to software engineers
● For automation
● Tests easy to define
● Focus on test code
● Focus on database systems
● So far used for testing
○ Presto
○ internal projects
How is a test defined?
● Java
● SQL convention based
Example – Java based test
public class SimpleQueryTest extends ProductTest {
    private static class SimpleTestRequirements implements RequirementsProvider {
        public Requirement getRequirements(Configuration config) {
            return new ImmutableHiveTableRequirement(NATION);
        }
    }

    @Inject
    Configuration configuration;

    @Test(groups = {"smoke", "query"})
    @Requires(SimpleTestRequirements.class)
    public void selectCountFromNation() {
        assertThat(query("select count(*) from nation"))
            .hasRowsCount(1)
            .hasRows(row(25));
    }
}
Example – Convention based test
allRows.sql:
-- database: hive; tables: blah
SELECT * FROM sample_table
allRows.result:
-- delimiter: |; ignoreOrder: false; types: BIGINT,VARCHAR
1|A|
2|B|
3|C|
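The header comments in the .sql and .result files carry the test metadata (database, delimiter, expected types, and so on). A minimal sketch of how such a header line could be parsed into key/value options; the parsing details are an illustration, not Tempto's actual implementation:

```python
def parse_header(line: str) -> dict:
    """Parse a convention-test header comment like
    '-- delimiter: |; ignoreOrder: false; types: BIGINT,VARCHAR'
    into a dict of options (illustrative sketch, not Tempto's code)."""
    body = line.lstrip("-").strip()          # drop the SQL comment marker
    options = {}
    for pair in body.split(";"):
        key, _, value = pair.partition(":")
        options[key.strip()] = value.strip()
    return options

header = parse_header("-- delimiter: |; ignoreOrder: false; types: BIGINT,VARCHAR")
```

With the header parsed, the framework knows how to split each line of the .result file and which types to coerce the cells to before comparing against the query output.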
Tempto architecture
[Architecture diagram: user-provided tests and requirements build on library-provided components: the TestNG engine, TestNG listeners, utils, and requirement fulfillers]
Tempto architecture: TestNG
● Works well
● Extensible
● Well known
Tempto architecture: TestNG listeners
● Tempto-specific extension of the TestNG execution framework
● Requirements management
● Test filtering
● Dependency injection
● Extended logging
Tempto architecture: tests
● Test code :)
○ Java
○ SQL-convention based
Tempto architecture: requirements
● Declarative requirements
● Fulfilled by the test framework via pluggable fulfillers
● e.g. mutableTable(Tpch.NATION, LOADED, "hive")
● Test level and suite level
● Cleanup
Tempto architecture: utils
● Extra assertions
● Various tools
○ HDFS client
○ SSH client
○ JDBC query executor
Executable runner
java -jar target/presto-product-tests-0.120-SNAPSHOT-executable.jar --help
usage: Presto product tests
--config-local <arg> URI to Test local configuration YAML file.
--report-dir <arg> Test reports directory
--groups <arg> Test groups to be run
--excluded-groups <arg> Test groups to be excluded
--tests <arg> Test patterns to be included
-h,--help Shows help message
● All dependencies embedded
● User provides cluster details through a YAML config
Configuration
hdfs:
  username: hdfs
  webhdfs:
    host: master
    port: 50070
tests:
  hdfs:
    path: /product-test
databases:
  default:
    alias: presto
  hive:
    jdbc_driver_class: org.apache.hive.jdbc.HiveDriver
    jdbc_url: jdbc:hive2://master:10000
    jdbc_user: hdfs
    jdbc_password: na
    jdbc_pooling: false
    jdbc_jar: test-framework-hive-jdbc-all.jar
  presto:
    jdbc_driver_class: com.facebook.presto.jdbc.PrestoDriver
    jdbc_url: jdbc:presto://localhost:8080/hive/default
    jdbc_user: hdfs
    jdbc_password: na
    jdbc_pooling: false
Benchto
macro benchmarking framework
github.com/teradata/benchto (very soon)
Karol Sobczak
karol.sobczak@teradata.com
Goals
● Easy and manageable way to define benchmarks
● Run and analyze macro benchmarks in clustered environment
● Repeatable benchmarking of Hadoop SQL engines, most importantly Presto
○ also used for Hive, Teradata components
● Transparent, trusted framework for benchmarking
Benchmarks - model
[Model diagram: a BenchmarkRun has n QueryExecutions; each QueryExecution has n Measurements; a BenchmarkRun also has n AggregatedMeasurements computed across its executions]
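The relationships above (a benchmark run holds many query executions, with measurements hanging off both levels) can be sketched as plain data classes; the class names follow the diagram, while the fields are illustrative assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class Measurement:
    name: str      # e.g. "duration" (illustrative field names)
    value: float
    unit: str      # e.g. "MILLISECONDS"

@dataclass
class QueryExecution:
    sequence_id: int
    measurements: list = field(default_factory=list)   # n measurements per execution

@dataclass
class BenchmarkRun:
    name: str
    executions: list = field(default_factory=list)             # n query executions per run
    aggregated_measurements: list = field(default_factory=list)  # e.g. mean duration over executions

# build a tiny run with one execution and one measurement
run = BenchmarkRun("tpch-q14")
run.executions.append(QueryExecution(0, [Measurement("duration", 1234.0, "MILLISECONDS")]))
```

The aggregated measurements let the GUI compare whole runs without re-scanning every per-execution measurement.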
Benchmarks - execution
● before-benchmark-macros
● prewarm
● benchmark
○ execution-0
○ execution-1
○ ...
○ execution-n
● after-benchmark-macros
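The lifecycle above can be sketched as a driver loop. Macro names and run counts come from the descriptor; the function and key names here are hypothetical, not Benchto's actual API:

```python
def run_benchmark(benchmark, executors):
    """Hypothetical sketch of the execution lifecycle:
    before-benchmark macros -> prewarm runs (discarded) ->
    measured executions -> after-benchmark macros."""
    results = []
    for macro in benchmark.get("before-benchmark", []):
        executors["macro"](macro)                  # e.g. drop-caches
    for _ in range(benchmark.get("prewarm-runs", 0)):
        executors["query"](benchmark["query"])     # warm caches, result discarded
    for _ in range(benchmark["runs"]):
        results.append(executors["query"](benchmark["query"]))  # execution-0 .. execution-n
    for macro in benchmark.get("after-benchmark", []):
        executors["macro"](macro)
    return results

# dry run with stub executors that just record the calls
calls = []
executors = {
    "macro": lambda name: calls.append(("macro", name)),
    "query": lambda sql: calls.append(("query", sql)) or len(calls),
}
bench = {"before-benchmark": ["drop-caches"], "prewarm-runs": 1,
         "runs": 2, "query": "SELECT 1"}
results = run_benchmark(bench, executors)
```

Separating prewarm runs from measured executions is what makes the measurements repeatable: caches are in a known state before execution-0 starts.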
Defining benchmarks - structure
● Benchmarks are defined by convention through YAML descriptors and SQL query files
$ tree .
.
├── application-presto-devenv.yaml
├── application-td-hdp.yaml
├── benchmarks
│ ├── presto
│ │ ├── concurrency-insert-multi-table.yaml
│ │ ├── concurrency.yaml
│ │ ├── linear-scan.yaml
│ │ ├── tpch.yaml
│ │ └── types.yaml
│ └── querygrid-presto-ansi
│ └── concurrency.yaml
└── sql
├── presto
│ ├── dev-zero
│ │ ├── create-alltypes.sql
│ │ └── create-lineitem.sql
│ ├── linear-scan
│ │ ├── selectivity-0.sql
│ │ ├── selectivity-100.sql
...
Defining benchmarks - descriptor
● A descriptor is a YAML configuration file with various properties and user-defined
variables
$ cat benchmarks/presto/concurrency.yaml
datasource: presto
query-names: presto/linear-scan/selectivity-${selectivity}.sql
schema: tpch_100gb_orc
database: hive
concurrency: ${concurrency_level}
runs: ${concurrency_level}
prewarm-runs: 3
before-benchmark: drop-caches
variables:
  1:
    selectivity: 10, 100
    concurrency_level: 10
  2:
    selectivity: 10, 100
    concurrency_level: 20
  3:
    selectivity: 10, 100
    concurrency_level: 50
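Each numbered variables block yields a set of benchmark variants. Assuming comma-separated values multiply out as a cartesian product within each block (an inference from the descriptor above, not confirmed Benchto behavior), the expansion can be sketched as:

```python
from itertools import product

def expand_variables(variable_sets: dict) -> list:
    """Expand descriptor variable blocks into concrete variants.
    ASSUMPTION: comma-separated values within a block combine
    as a cartesian product."""
    variants = []
    for _, variables in sorted(variable_sets.items()):
        keys = list(variables)
        value_lists = [[v.strip() for v in str(variables[k]).split(",")]
                       for k in keys]
        for combo in product(*value_lists):
            variants.append(dict(zip(keys, combo)))
    return variants

# blocks 1 and 2 from the descriptor above
variants = expand_variables({
    1: {"selectivity": "10, 100", "concurrency_level": "10"},
    2: {"selectivity": "10, 100", "concurrency_level": "20"},
})
```

Under that assumption, each block with two selectivity values produces two variants, so the two blocks above expand to four benchmark runs in total.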
Defining benchmarks – SQL file templating
● SQL files can use keys defined in the YAML configuration file; templates are
based on FreeMarker
$ cat sql/presto/tpch/q14.sql
SELECT 100.00 * sum(CASE
WHEN p.type LIKE 'PROMO%'
THEN l.extendedprice * (1 - l.discount)
ELSE 0
END) / sum(l.extendedprice * (1 - l.discount)) AS promo_revenue
FROM
"${database}"."${schema}"."lineitem" AS l,
"${database}"."${schema}"."part" AS p
WHERE
l.partkey = p.partkey
AND l.shipdate >= DATE '1995-09-01'
AND l.shipdate < DATE '1995-09-01' + INTERVAL '1' MONTH
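Benchto uses FreeMarker for the templating itself; Python's standard-library string.Template happens to share the same ${name} placeholder syntax, so the substitution step can be sketched as:

```python
from string import Template

# same ${database}/${schema} placeholders as the q14.sql excerpt above
sql = Template('SELECT * FROM "${database}"."${schema}"."lineitem"')
rendered = sql.substitute(database="hive", schema="tpch_100gb_orc")
```

This is why one SQL file can serve every schema/database combination the descriptors define: the variables are bound per benchmark run, not baked into the query.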
Future work
● (Tempto) Support for complex concurrent tests execution
● (Benchto) Automatic regression detection
● (Benchto) Customized dashboards (e.g. overall performance analysis)
● (Benchto) Hardware and configuration awareness
● (Benchto) More complex benchmarking scenarios
● (Benchto) Support for complex concurrency scenarios
● (Benchto) Scheduling mechanism
Questions?
Benchto GUI
● Visualization of benchmarks results
● Linking between tools (Grafana, Presto UI)
● Comparison of multiple benchmarks
Grafana monitoring
● We use Grafana dashboards with Graphite
● Benchmark/execution life-cycle events are shown on dashboards
● Provides good visibility into the state of the cluster

Presto Testing Tools: Benchto & Tempto (Presto Boston Meetup 10062015)