sales@chartio.com
(855) 232-0320
Using the PostgreSQL Extension Ecosystem for Advanced Analytics
Agenda
- The problem
  - The prevailing view vs. the practical reality
- A possible solution
  - Or just building blocks?
- Nearness
  - Near at hand, near to our skill set, near to our capabilities
- A more complete solution
  - The PostgreSQL extension ecosystem
The Problem
The Prevailing View vs. The Practical Reality
The Prevailing View - Logical
- Schema objects
  - Relational: structured rows and columns; schema on write; referential integrity; painful migrations
  - Non-relational: unstructured files, docs, etc.; schema on read; no referential integrity; no migrations
- Query languages
  - Relational: SQL; declarative; easy enough for non-tech users
  - Non-relational: various; procedural; requires some programming skills
- Exploratory analysis
  - Relational: native support for joins; interactive/low execution overhead
  - Non-relational: no native support for joins; OLAP - batch processing
- Data science and ML
  - Relational: only descriptive statistics; requires exporting dumps/samples
  - Non-relational: robust ecosystem; does not require exports
The Prevailing View - Physical
- Parallel query processing
  - Relational: single node system; single process per query
  - Non-relational: multiple node system; multiple processes per query
- Concurrency
  - Relational: high concurrency; single process per connection
  - Non-relational: OLAP - low concurrency/high scheduling overhead
- High availability & replication
  - Relational: async and sync replication; HA may not be native
  - Non-relational: async and sync replication; HA likely to be native
- Sharding
  - Relational: sharding may not be native; difficult to manage
  - Non-relational: sharding likely to be native; easy to manage
The Prevailing View - Summary
- RDBMSs have nice properties for producing rich data
  - ACID, referential integrity, constraints, strong data types
  - Easier for non-tech users and exploratory analysis
- Probably don’t meet the needs of today’s analysts
  - Data science & machine learning
  - Parallel processing
- Definitely don’t meet the needs of today’s apps
  - Schema migrations
  - Replication and sharding
The Practical Reality
But we still want more advanced functionality.
A Possible Solution
Or Just Building Blocks?
Modern SQL
- Many people still think of SQL in terms of SQL-92
- Since then we’ve had: SQL:1999, SQL:2003, SQL:2006, SQL:2008, SQL:2011
- http://use-the-index-luke.com/blog/2015-02/modern-sql
- Common Table Expressions (CTEs) / Recursive CTEs
- Window Functions
- Ordered-set Aggregates
- Lateral joins
- Temporal support
- The list goes on...
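Two of the features listed above, window functions and recursive CTEs, can be tried without a PostgreSQL server. The sketch below is only an illustration: it uses Python's bundled sqlite3 module (SQLite 3.25 or later) as a stand-in for PostgreSQL, and the sales table and its rows are made up for the example. The SQL itself is written so that it should also be accepted by PostgreSQL.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; the "sales" table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, month INTEGER, amount INTEGER);
    INSERT INTO sales VALUES
        ('east', 1, 100), ('east', 2, 150), ('east', 3, 120),
        ('west', 1, 200), ('west', 2, 180);
""")

# Window function: a running total per region that keeps every input row.
running_totals = conn.execute("""
    SELECT region, month, amount,
           SUM(amount) OVER (PARTITION BY region ORDER BY month) AS running_total
    FROM sales
    ORDER BY region, month
""").fetchall()

# Recursive CTE: generate the integers 1..5 without a helper table.
series = [row[0] for row in conn.execute("""
    WITH RECURSIVE counter(n) AS (
        SELECT 1
        UNION ALL
        SELECT n + 1 FROM counter WHERE n < 5
    )
    SELECT n FROM counter
""")]

for row in running_totals:
    print(row)
print(series)
```

Neither query can be expressed in plain SQL-92 without self-joins or a procedural loop, which is the gap the later standards closed.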
Procedural Languages
- Native: PL/pgSQL, Tcl, Perl, Python
- Community: Java, PHP, R, JavaScript, Ruby, Scheme, sh
Building Blocks
These solve some problems.
For others, they are just building blocks.
Nearness
Near at Hand
Near to Our Skill Set
Near to Our Capabilities
Nearness
- http://www.infoq.com/presentations/Simple-Made-Easy
Nearness Drives Adoption
- Near at hand
  - Easily installable
- Near to our skill set
  - Familiar tool/language/abstraction
  - Modular and composable
- Near to our capabilities
  - Capable of solving a problem in our domain
A More Complete Solution
The PostgreSQL Extension Ecosystem
Postgres Extension Ecosystem Examples
- PostgreSQL Extension Network: http://pgxn.org/
- UDFs & operators: https://github.com/eulerto/pg_similarity
- UDAs & data types: https://github.com/aggregateknowledge/postgresql-hll
- Foreign Data Wrappers: http://multicorn.org/, https://github.com/shish/pgosquery
- Indexes: https://github.com/zombodb/zombodb
- Composing Extension Methods: http://doc.madlib.net/
- MPP: https://www.citusdata.com/, https://github.com/greenplum-db/gpdb
- Composing Extensions
- Custom Background Workers: https://github.com/no0p/alps
- Record linking: http://no0p.github.io/2015/10/20/record_linking.html#/
The PostgreSQL Extension Network
- Package manager: pgxn
- Index/network: http://pgxn.org/
- Comparable to PyPI, RubyGems, CPAN, CRAN
The PostgreSQL Extension Network
- Near at hand
  - With pgxn
    - pgxn search semver
    - pgxn info semver
    - pgxn install semver
    - pgxn load -d somedb semver
    - pgxn unload -d somedb semver
    - pgxn uninstall semver
  - Without pgxn
    - Search GitHub? Google? Mailing list?
    - GitHub README?
    - git clone; make; make install;
    - psql -c "CREATE EXTENSION IF NOT EXISTS"
    - psql -c "DROP EXTENSION IF EXISTS"
    - make uninstall?
UDFs & Operators: pg_similarity
- Near to our capabilities
- Similarity coefficient algorithms
- L1 Distance
- Cosine Distance
- Dice Coefficient
- Euclidean Distance
- Hamming Distance
- Jaccard Coefficient
- Jaro Distance
- Jaro-Winkler Distance
- Levenshtein Distance
- Matching Coefficient
- Monge-Elkan Coefficient
- Needleman-Wunsch Coefficient
- Overlap Coefficient
- Q-Gram Distance
- Smith-Waterman Coefficient
- Smith-Waterman-Gotoh Coefficient
- Soundex Distance
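To make two of the measures in this list concrete, here is a minimal Python sketch of the Jaccard coefficient and the Levenshtein distance. It is not pg_similarity's implementation: the function names are hypothetical, and the Jaccard version assumes lowercase, whitespace-separated tokens, which may differ from how the extension tokenizes its input.

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard coefficient: shared tokens divided by all distinct tokens."""
    # Assumption: tokens are lowercase words split on whitespace.
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 1.0  # two empty strings are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def levenshtein(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character edits turning a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if char_a == char_b else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1]


print(jaccard("big data tools", "big data systems"))
print(levenshtein("kitten", "sitting"))
```

Having measures like these available as SQL functions and operators is what keeps fuzzy matching inside the database instead of in an export script.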
UDFs & Operators: pg_similarity
- Near to our skill set
UDFs & Operators: pg_similarity
- Implementation
UDAs & Data Types: postgresql-hll
- Near to our capabilities & near to our skill set
- Data type
- Estimate count distinct with tunable precision
- 1280 bytes can estimate tens of billions of distinct values with a few percent error
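As a rough illustration of how a HyperLogLog-style estimator trades a small, fixed amount of memory for an approximate distinct count, here is a minimal Python sketch. It is not the postgresql-hll implementation: the class name, the SHA-1 hash, and the choice of 2048 registers are assumptions made for this example. For reference, 2048 registers of 5 bits each would occupy 1280 bytes, consistent with the figure above, though this sketch keeps its registers as plain Python integers and does not pack them.

```python
import hashlib
import math


class HyperLogLogSketch:
    """Toy HyperLogLog estimator (hypothetical, for illustration only)."""

    def __init__(self, p: int = 11):
        self.p = p                      # number of index bits
        self.m = 1 << p                 # number of registers (2048 when p = 11)
        self.registers = [0] * self.m
        self.alpha = 0.7213 / (1 + 1.079 / self.m)  # bias constant for m >= 128

    def add(self, item: str) -> None:
        # Take 64 bits of a SHA-1 digest as the hash value.
        h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
        index = h >> (64 - self.p)                   # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)        # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def count(self) -> float:
        estimate = self.alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            estimate = self.m * math.log(self.m / zeros)
        return estimate


sketch = HyperLogLogSketch()
for i in range(100_000):
    sketch.add(f"user-{i}")
    sketch.add(f"user-{i}")  # duplicates do not change the registers
print(round(sketch.count()))
```

With 2048 registers the expected relative error is roughly 1.04 / sqrt(2048), a little over 2 percent, regardless of how many values are added.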
UDAs & Data Types: postgresql-hll
- Implementation
Foreign Data Wrappers: API
Foreign Data Wrappers: multicorn
- Near to our skill set
Foreign Data Wrappers: pgosquery
- Near at hand
Indexes: ZomboDB
- Index Access Method API
- http://www.postgresql.org/docs/9.4/static/indexam.html
Composing Extension Methods: MADlib
- Near to our capabilities
Composing Extension Methods: MADlib
- Near to our skill set
Parallel Processing
- Parallel sequential scan
- http://rhaas.blogspot.com/2015/11/parallel-sequential-scan-is-committed.html
- Columnar FDW:
- https://github.com/citusdata/cstore_fdw
Composing Extensions: Alps
Composing Extensions: Record Linking
Beyond Analytics
- Web app framework
- http://blog.aquameta.com/
- REST API
- https://github.com/begriffs/postgrest
- Unit testing framework
- http://pgtap.org/
- Firewall
- https://github.com/uptimejp/sql_firewall
- More every week!
Conclusion
- With PostgreSQL, you get
- more than rows and columns
- more than SELECT, FROM, WHERE, GROUP BY, ORDER BY
- more than a single machine
- Make sure you get the full return on your investment!
Get your Chartio free trial!

 
How To Drive Exponential Growth Using Unconventional Data Sources
How To Drive Exponential Growth Using Unconventional Data SourcesHow To Drive Exponential Growth Using Unconventional Data Sources
How To Drive Exponential Growth Using Unconventional Data Sources
 
7 Reasons You Haven't Reached Hyper-Growth
7 Reasons You Haven't Reached Hyper-Growth7 Reasons You Haven't Reached Hyper-Growth
7 Reasons You Haven't Reached Hyper-Growth
 
Redshift Chartio Event Presentation
Redshift Chartio Event PresentationRedshift Chartio Event Presentation
Redshift Chartio Event Presentation
 
The Vital Metrics Every Sales Team Should Be Measuring
The Vital Metrics Every Sales Team Should Be MeasuringThe Vital Metrics Every Sales Team Should Be Measuring
The Vital Metrics Every Sales Team Should Be Measuring
 
CSV and XLS Best Practices
CSV and XLS Best PracticesCSV and XLS Best Practices
CSV and XLS Best Practices
 

Recently uploaded

Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationkaushalgiri8080
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 

Recently uploaded (20)

Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 

Using the PostgreSQL Extension Ecosystem for Advanced Analytics

  • 1. Using the PostgreSQL Extension Ecosystem for Advanced Analytics (Chartio, sales@chartio.com, (855) 232-0320)
  • 2. Agenda
      - The problem
          - The prevailing view vs. the practical reality
      - A possible solution
          - Or just building blocks?
      - Nearness
          - Near at hand, near to our skill set, near to our capabilities
      - A more complete solution
          - The PostgreSQL extension ecosystem
  • 3. The Problem: The Prevailing View vs. The Practical Reality
  • 4. The Prevailing View - Logical
      - Schema objects
          - Relational: structured rows and columns; schema on write; referential integrity; painful migrations
          - Non-relational: unstructured files, docs, etc.; schema on read; no referential integrity; no migrations
      - Query languages
          - Relational: SQL; declarative; easy enough for non-tech users
          - Non-relational: various; procedural; requires some programming skills
      - Exploratory analysis
          - Relational: native support for joins; interactive/low execution overhead
          - Non-relational: no native support for joins; OLAP - batch processing
      - Data science and ML
          - Relational: only descriptive statistics; requires exporting dumps/samples
          - Non-relational: robust ecosystem; does not require exports
  • 5. The Prevailing View - Physical
      - Parallel query processing
          - Relational: single node system; single process per query
          - Non-relational: multiple node system; multiple processes per query
      - Concurrency
          - Relational: high concurrency; single process per connection
          - Non-relational: OLAP - low concurrency/high scheduling overhead
      - High Availability & Replication
          - Relational: async and sync replication; HA may not be native
          - Non-relational: async and sync replication; HA likely to be native
      - Sharding
          - Relational: sharding may not be native; difficult to manage
          - Non-relational: sharding likely to be native; easy to manage
  • 6. The Prevailing View - Summary
      - RDBMS have nice properties for producing rich data
          - ACID, relational integrity, constraints, strong data types
      - Easier for non-tech users and exploratory analysis
      - Probably don't meet the needs of today's analysts
          - Data science & machine learning
          - Parallel processing
      - Definitely don't meet the needs of today's apps
          - Schema migrations
          - Replication and sharding
  • 8. The Practical Reality: But we still want more advanced functionality.
  • 9. A Possible Solution, or Just Building Blocks?
  • 10. Modern SQL
      - Many people still think of SQL in terms of SQL-92
      - Since then we've had: SQL:1999, SQL:2003, SQL:2006, SQL:2008, SQL:2011
          - http://use-the-index-luke.com/blog/2015-02/modern-sql
      - Common Table Expressions (CTEs) / Recursive CTEs
      - Window functions
      - Ordered-set aggregates
      - Lateral joins
      - Temporal support
      - The list goes on...
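
To make the recursive CTE item concrete, here is a minimal sketch. It is not from the slides: the `employees` table and its rows are hypothetical, and it runs against SQLite through Python's standard `sqlite3` module purely as a stand-in so it can be executed without a server. The `WITH RECURSIVE` query text itself uses only standard syntax that PostgreSQL also accepts.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
    INSERT INTO employees VALUES
        (1, 'Ada', NULL),
        (2, 'Ben', 1),
        (3, 'Cy', 2),
        (4, 'Di', 1);
""")

# Recursive CTE: the anchor member selects the root (no manager); the recursive
# member joins each employee to the already-found manager and adds one to depth.
ORG_CHART_SQL = """
WITH RECURSIVE chain(id, name, depth) AS (
    SELECT id, name, 0 FROM employees WHERE manager_id IS NULL
    UNION ALL
    SELECT e.id, e.name, c.depth + 1
    FROM employees AS e
    JOIN chain AS c ON e.manager_id = c.id
)
SELECT name, depth FROM chain ORDER BY depth, name
"""

org_chart = conn.execute(ORG_CHART_SQL).fetchall()
for name, depth in org_chart:
    print("  " * depth + name)  # indent each name by its depth in the hierarchy
```

The point of the sketch is that a hierarchy walk, which SQL-92 could only express with a fixed number of self-joins, becomes a single declarative query.
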
  • 11. Procedural Languages
      - Native: PL/pgSQL, Tcl, Perl, Python
      - Community: Java, PHP, R, JavaScript, Ruby, Scheme, sh
  • 12. Building Blocks: These solve some problems. For others, they are just building blocks.
  • 13. Nearness: Near at Hand, Near to Our Skill Set, Near to Our Capabilities
  • 15. Nearness Drives Adoption
      - Near at hand
          - Easily installable
      - Near to our skill set
          - Familiar tool/language/abstraction
          - Modular and composable
      - Near to our capabilities
          - Capable of solving a problem in our domain
  • 16. A More Complete Solution: The PostgreSQL Extension Ecosystem
  • 17. Postgres Extension Ecosystem Examples
      - PostgreSQL Extension Network: http://pgxn.org/
      - UDFs & operators: https://github.com/eulerto/pg_similarity
      - UDAs & data types: https://github.com/aggregateknowledge/postgresql-hll
      - Foreign Data Wrappers: http://multicorn.org/, https://github.com/shish/pgosquery
      - Indexes: https://github.com/zombodb/zombodb
      - Composing Extension Methods: http://doc.madlib.net/
      - MPP: https://www.citusdata.com/, https://github.com/greenplum-db/gpdb
      - Composing Extensions
          - Custom Background Workers: https://github.com/no0p/alps
          - Record linking: http://no0p.github.io/2015/10/20/record_linking.html#/
  • 19. The PostgreSQL Extension Network
      - Package manager: pgxn
      - Index/network: http://pgxn.org/
      - Comparable to PyPI, RubyGems, CPAN, CRAN
  • 20. The PostgreSQL Extension Network - Near at hand
      - With pgxn
          - pgxn search semver
          - pgxn info semver
          - pgxn install semver
          - pgxn load -d somedb semver
          - pgxn unload -d somedb semver
          - pgxn uninstall semver
      - Without pgxn
          - Search GitHub? Google? Mailing list?
          - GitHub README?
          - git clone; make; make install;
          - psql -c "CREATE EXTENSION IF NOT EXISTS"
          - psql -c "DROP EXTENSION IF EXISTS"
          - make uninstall?
  • 22. UDFs & Operators: pg_similarity - Near to our capabilities
      - Similarity coefficient algorithms
          - L1 Distance
          - Cosine Distance
          - Dice Coefficient
          - Euclidean Distance
          - Hamming Distance
          - Jaccard Coefficient
          - Jaro Distance
          - Jaro-Winkler Distance
          - Levenshtein Distance
          - Matching Coefficient
          - Monge-Elkan Coefficient
          - Needleman-Wunsch Coefficient
          - Overlap Coefficient
          - Q-Gram Distance
          - Smith-Waterman Coefficient
          - Smith-Waterman-Gotoh Coefficient
          - Soundex Distance
  • 23. UDFs & Operators: pg_similarity - Near to our skill set
  • 24. UDFs & Operators: pg_similarity - Implementation
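
The slides for this part survive only as titles, so as an illustration of what two of the listed measures compute, here is a small plain-Python sketch of Levenshtein distance and the Jaccard coefficient. It is not pg_similarity's code: the function names are hypothetical, and splitting on whitespace for the Jaccard tokens is an assumption made for brevity.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep when equal)
            ))
        previous = current
    return previous[-1]


def jaccard(a: str, b: str) -> float:
    """Jaccard coefficient |A & B| / |A | B| over lowercase whitespace tokens."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty inputs are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


print(levenshtein("kitten", "sitting"))
print(jaccard("big data tools", "big data platform"))
```

Having such measures available as functions and operators inside the database is what lets them be used directly in `WHERE` and `JOIN` clauses instead of in application code.
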
  • 26. UDAs & Data Types: postgresql-hll - Near to our capabilities & near to our skill set
      - Data type
      - Estimate count distinct with tunable precision
      - 1280 bytes estimates tens of billions of distinct values with a few percent error
  • 27. UDAs & Data Types: postgresql-hll
  • 28. UDAs & Data Types: postgresql-hll - Implementation
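
To show the idea behind a HyperLogLog data type, here is a toy sketch of the algorithm in plain Python. It is an illustration only and makes several assumptions: the class and method names are hypothetical, the 64-bit hash is taken from SHA-1, the precision `p = 12` is arbitrary, and registers are kept in a Python list rather than in the extension's packed storage format.

```python
import hashlib
import math


class HyperLogLog:
    """Toy HyperLogLog sketch with 2**p registers (not the extension's format)."""

    def __init__(self, p: int = 12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, value: str) -> None:
        # 64-bit hash from SHA-1 (an assumption; any well-mixed hash would do).
        h = int.from_bytes(hashlib.sha1(value.encode("utf-8")).digest()[:8], "big")
        index = h >> (64 - self.p)                     # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)          # remaining 64 - p bits
        rank = (64 - self.p) - rest.bit_length() + 1   # 1-based position of first 1-bit
        self.registers[index] = max(self.registers[index], rank)

    def merge(self, other: "HyperLogLog") -> "HyperLogLog":
        # Union of two sketches is the register-wise maximum.
        if self.p != other.p:
            raise ValueError("sketches must use the same precision")
        merged = HyperLogLog(self.p)
        merged.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return merged

    def count(self) -> float:
        alpha = 0.7213 / (1 + 1.079 / self.m)          # bias constant for m >= 128
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            return self.m * math.log(self.m / zeros)   # small-range correction
        return raw


sketch = HyperLogLog()
for i in range(50_000):
    sketch.add("user-%d" % i)
estimate = sketch.count()
print(round(estimate))  # an estimate near 50000, not an exact count
```

Two properties matter for analytics: the sketch has a fixed size regardless of how many values are added, and sketches for separate time periods or segments can be merged without rescanning the underlying rows.
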
  • 31. Foreign Data Wrappers: multicorn - Near to our skill set
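
Multicorn lets a foreign data wrapper be written as a Python class, which is what makes it "near to our skill set". The slide's own example is not preserved in this transcript, so the following is a hedged sketch: `CountdownFdw`, its `start` option, and its columns `n` and `label` are hypothetical, and the fallback base class exists only so the file can also be run outside a PostgreSQL server where the `multicorn` module is unavailable.

```python
try:
    # Importable only inside a PostgreSQL server with Multicorn installed.
    from multicorn import ForeignDataWrapper
except ImportError:
    # Stand-in base class so this sketch also runs standalone (assumption).
    class ForeignDataWrapper:
        def __init__(self, options, columns):
            pass


class CountdownFdw(ForeignDataWrapper):
    """Hypothetical wrapper exposing the integers start..1 as table rows."""

    def __init__(self, options, columns):
        super().__init__(options, columns)
        # Options arrive as strings from the foreign table definition.
        self.start = int(options.get("start", 3))

    def execute(self, quals, columns):
        # Called for each scan of the foreign table. Every yielded dict is one
        # row keyed by column name. The quals (filter conditions) are ignored
        # in this sketch for simplicity.
        for n in range(self.start, 0, -1):
            yield {"n": n, "label": "T-minus %d" % n}


# Standalone usage, simulating what a scan would produce.
rows = list(CountdownFdw({"start": "2"}, {"n": None, "label": None}).execute([], ["n", "label"]))
print(rows)
```

Inside PostgreSQL, a class like this would be referenced by its dotted Python path in the `wrapper` option of a `CREATE SERVER ... FOREIGN DATA WRAPPER multicorn` statement, as described in the Multicorn documentation.
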
  • 32. Foreign Data Wrappers: pgosquery - Near at hand
  • 34. Indexes: ZomboDB
      - Index Access Method API
          - http://www.postgresql.org/docs/9.4/static/indexam.html
  • 36. Composing Extension Methods: MADlib - Near to our capabilities
  • 37. Composing Extension Methods: MADlib - Near to our skill set
  • 40. Parallel Processing
      - Parallel sequential scan
          - http://rhaas.blogspot.com/2015/11/parallel-sequential-scan-is-committed.html
      - Columnar FDW
          - https://github.com/citusdata/cstore_fdw
  • 44. Beyond Analytics
      - Web app framework: http://blog.aquameta.com/
      - REST API: https://github.com/begriffs/postgrest
      - Unit testing framework: http://pgtap.org/
      - Firewall: https://github.com/uptimejp/sql_firewall
      - More every week!
  • 45. Conclusion
      - With PostgreSQL, you get
          - more than rows and columns
          - more than SELECT, FROM, WHERE, GROUP BY, ORDER BY
          - more than a single machine
      - Make sure you get the full return on your investment!
      - Get your Chartio free trial! sales@chartio.com (855) 232-0320