Presentation from PostgreSQL PGCon 2015 in Ottawa, ON.
Integrating NoSQL data into PostgreSQL's relational model, plus techniques to present NoSQL data as a traditional relation (a table or view).
The supporting *.sql file and data are at the conference website
https://www.pgcon.org/2015/schedule/events/780.en.html
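The talk's central technique, presenting JSON documents through a traditional relational view, can be sketched in a few lines. The talk itself targets PostgreSQL's json/jsonb support; the snippet below uses Python's stdlib sqlite3 with SQLite's JSON1 functions purely so it is self-contained, and the table, view, and field names are invented for illustration.

```python
import sqlite3

# Schema-less JSON documents stored in a single-column table
# (a stand-in for a PostgreSQL jsonb column).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (body TEXT)")
conn.executemany(
    "INSERT INTO docs VALUES (?)",
    [('{"name": "alice", "age": 30}',), ('{"name": "bob", "age": 25}',)],
)

# Present the NoSQL documents as a traditional relation: a view whose
# columns are extracted from the JSON bodies.
conn.execute("""
    CREATE VIEW people AS
    SELECT json_extract(body, '$.name') AS name,
           json_extract(body, '$.age')  AS age
    FROM docs
""")
rows = conn.execute("SELECT name, age FROM people ORDER BY age").fetchall()
print(rows)  # [('bob', 25), ('alice', 30)]
```

From here, ordinary SQL (joins, aggregates, WHERE clauses) works against `people` even though the underlying storage is document-shaped.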
Semantic Technologies and Triplestores for Business Intelligence (Marin Dimitrov)
This document provides an introduction to semantic technologies and triplestores. It discusses the Semantic Web vision of making data on the web more accessible and linked. Key concepts covered include RDF, ontologies, OWL, SPARQL and Linked Data. It also introduces triplestores as RDF databases for storing and querying semantic data and compares their features to traditional databases.
UMLtoGraphDB: Mapping Conceptual Schemas to Graph Databases (Gwendal Daniel)
UMLtoGraphDB presentation at ER2016. Related article available online at https://hal.archives-ouvertes.fr/hal-01344015/document
Related post on modeling-languages.com: http://modeling-languages.com/uml-to-nosql-graph-database/
This document discusses using graphs and graph databases for machine learning. It provides an overview of graph analytics algorithms that can be used to solve problems with graph data, including recommendations, fraud detection, and network analysis. It also discusses using graph embeddings and graph neural networks for tasks like node classification and link prediction. Finally, it discusses how graphs can be used for machine learning infrastructure and metadata tasks like data provenance, audit trails, and privacy.
The document compares the Neo4j, Titan, and Cassandra graph databases. It provides details on each database, such as Neo4j using the Cypher query language, Cassandra being highly distributed and able to scale linearly, and Titan running on Cassandra or HBase but not supporting Cypher queries. It also gives a 15-point comparison of Cassandra vs. Neo4j and examples of querying the same data in Gremlin, Cypher, and SQL. The conclusion recommends a graph database like Neo4j for recommendation queries, reserving Titan for very large graphs or high loads.
The document discusses linked data, ontologies, and inference. It provides examples of using RDFS and OWL to infer new facts from schemas and ontologies. Key points include:
- Linked Data uses URIs and HTTP to identify things and provide useful information about them via standards like RDF and SPARQL.
- Projects like LOD aim to develop best practices for publishing interlinked open datasets. FactForge and LinkedLifeData are examples that contain billions of statements across life science and general knowledge datasets.
- RDFS and OWL allow defining schemas and ontologies that enable inferring new facts through reasoning. Rules like rdfs:domain and rdfs:range allow inferring type information.
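The rdfs:domain/rdfs:range rules mentioned above can be illustrated with a toy forward-chaining step over triples; the mini-vocabulary below is invented, and real reasoners implement many more entailment rules than this single one.

```python
# Toy RDFS-style reasoning: rdfs:domain says "subjects of this property have
# this type"; rdfs:range says the same about objects. Vocabulary is invented.
triples = {
    ("ex:wrote", "rdfs:domain", "ex:Author"),
    ("ex:wrote", "rdfs:range", "ex:Book"),
    ("ex:tolkien", "ex:wrote", "ex:lotr"),
}

def infer_types(triples):
    domains = {s: o for s, p, o in triples if p == "rdfs:domain"}
    ranges = {s: o for s, p, o in triples if p == "rdfs:range"}
    inferred = set()
    for s, p, o in triples:
        if p in domains:
            inferred.add((s, "rdf:type", domains[p]))  # subject gets the domain class
        if p in ranges:
            inferred.add((o, "rdf:type", ranges[p]))   # object gets the range class
    return inferred

print(infer_types(triples))
# contains ('ex:tolkien', 'rdf:type', 'ex:Author') and ('ex:lotr', 'rdf:type', 'ex:Book')
```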
This document discusses GraphQL and DGraph with Go. It begins by introducing GraphQL and some popular GraphQL implementations in Go, like graphql-go. It then discusses DGraph, describing it as a distributed, high-performance graph database written in Go. It provides examples of using the DGraph Go client to perform CRUD operations, query for single and multiple objects, commit transactions, and more.
These are the slides to the webinar about Custom Pregel algorithms in ArangoDB https://youtu.be/DWJ-nWUxsO8. It provides a brief introduction to the capabilities and use cases for Pregel.
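As a rough, single-process rendering of the vertex-centric Pregel model the webinar introduces: in each superstep, vertices with pending messages update their state and message their neighbors, and computation halts when no messages remain. The algorithm (connected components via minimum-label propagation), graph, and halting logic below are simplified stand-ins, not ArangoDB's actual Pregel API.

```python
# Adjacency list; undirected edges are listed in both directions. Vertex 4 is isolated.
edges = {1: [2], 2: [1, 3], 3: [2], 4: []}

def connected_components(edges):
    value = {v: v for v in edges}  # superstep 0: every vertex labels itself
    # each vertex's initial inbox: the labels of its neighbors
    messages = {v: [value[u] for u in edges if v in edges[u]] for v in edges}
    while any(messages.values()):  # run supersteps until every vertex has halted
        new_messages = {v: [] for v in edges}
        for v, inbox in messages.items():
            if inbox and min(inbox) < value[v]:       # only an improvement reactivates a vertex
                value[v] = min(inbox)
                for n in edges[v]:
                    new_messages[n].append(value[v])  # propagate the smaller label
        messages = new_messages
    return value

print(connected_components(edges))  # {1: 1, 2: 1, 3: 1, 4: 4}
```

Each connected component ends up labeled with its smallest vertex id, which is essentially how label-propagation algorithms run under Pregel at scale.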
Running complex data queries in a distributed system (ArangoDB)
With the ever-growing amount of data, it is getting increasingly hard to store data and retrieve it efficiently. While the first generations of distributed databases put all the burden of sharding on the application code, there are now smarter solutions that handle most of the data distribution and resilience tasks inside the database.
This poses some interesting questions, e.g.:
- how are queries other than by-primary-key lookups organized and executed in a distributed system so that they run as efficiently as possible?
- how do contemporary distributed databases achieve transactional semantics for non-trivial operations that span multiple shards/servers?
This talk gives an overview of these challenges and of the solutions that some open-source distributed databases have chosen.
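The first question above can be sketched with a toy in-memory model: a by-primary-key read hashes to exactly one shard, while any other predicate must scatter to all shards and gather the partial results. The shard count, routing function, and data are invented; real systems add replication, routing tables, and distributed merge/sort on top.

```python
# Toy sharded store: routing by hash of the primary key.
NUM_SHARDS = 3
shards = [dict() for _ in range(NUM_SHARDS)]

def shard_for(key):
    return hash(key) % NUM_SHARDS          # deterministic routing within one process

def put(key, doc):
    shards[shard_for(key)][key] = doc

def get(key):                              # by-primary-key: exactly one shard is contacted
    return shards[shard_for(key)].get(key)

def query(predicate):                      # any other predicate: scatter to all shards, gather
    return sorted(doc for s in shards for doc in s.values() if predicate(doc))

for i in range(10):
    put(f"user{i}", i)

print(get("user3"))                 # 3
print(query(lambda age: age >= 7))  # [7, 8, 9]
```

The asymmetry is the point: `get` costs one hop regardless of cluster size, while `query` touches every shard unless a secondary index narrows the fan-out.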
A Graph Database That Scales - ArangoDB 3.7 Release Webinar (ArangoDB)
Jörg Schad (Head of Engineering and ML) and Chris Woodward (Developer Relations Engineer) introduce the new capabilities for working with graphs in a distributed setting. In addition, they explain and showcase the new fuzzy search in ArangoDB's search engine as well as JSON schema validation.
Get started with ArangoDB: https://www.arangodb.com/arangodb-tra...
Explore ArangoDB Cloud for free with 1-click demos: https://cloud.arangodb.com/home
ArangoDB is a native multi-model database written in C++ supporting graph, document, and key/value needs with one engine and one query language. Full-text search and ranking are supported via ArangoSearch, the fully integrated C++-based search engine in ArangoDB.
Introduction to DGraph - A Graph Database (Knoldus Inc.)
The slides introduce the world of graph databases and explain what they are all about. The focus then shifts to one of the newest graph databases, DGraph.
Overview of the RDF graph database-as-a-service (based on GraphDB) in the Self-Service Semantic Suite (S4)
http://s4.ontotext.com
Presentation for the AKSW group of the University of Leipzig
Geospatial querying in Apache Marmotta - ApacheCon Big Data Europe 2015 (Sergio Fernández)
This document summarizes a presentation about querying geospatial data in Apache Marmotta. It introduces Apache Marmotta as an open source linked data platform, describes linked data and RDF, and explains how Marmotta supports the GeoSPARQL standard for representing and querying geospatial data on the semantic web through materialization of geospatial data and PostGIS. It provides examples of GeoSPARQL queries in Marmotta and outlines the topological relations and functions supported.
The document discusses taming the size and cardinality of OLAP data cubes over big data. It presents an overview of data warehouse systems and architectures, OLAP cubes, and decision support system benchmarks. It then introduces the TPC-H*d benchmark for evaluating multi-dimensional databases and the AutoMDB tool for automating multi-dimensional database design. Lastly, it discusses application scenarios for benchmarking data servers, multi-dimensional database schemas, and parallel OLAP servers.
We all know good training data is crucial for data scientists to build quality machine learning models. But when productionizing Machine Learning, Metadata is equally important. Consider for example:
- Provenance of model allowing for reproducible builds
- Context to comply with GDPR, CCPA requirements
- Identifying data shift in your production data
This is the reason we built ArangoML Pipeline, a flexible Metadata store which can be used with your existing ML Pipeline.
Today we are happy to announce a release of ArangoML Pipeline Cloud. Now you can start using ArangoML Pipeline without having to even start a separate docker container.
In this webinar, we will show how to leverage ArangoML Pipeline Cloud with your Machine Learning Pipeline by using an example notebook from the TensorFlow tutorial.
Find the video here: https://www.arangodb.com/arangodb-events/arangoml-pipeline-cloud/
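The provenance use cases listed above boil down to recording, for every pipeline run, what went in and what came out. The library-free sketch below only illustrates that idea; the record layout, function name, and fields are hypothetical and are not ArangoML Pipeline's actual schema or API.

```python
import hashlib
import json

metadata_store = []   # stand-in for the metadata database

def log_run(dataset, params, metrics):
    # Fingerprint the training data so a model can be traced back to it
    # (reproducible builds, GDPR/CCPA context, data-shift detection).
    fingerprint = hashlib.sha256(
        json.dumps(dataset, sort_keys=True).encode()
    ).hexdigest()
    record = {"dataset_sha256": fingerprint, "params": params, "metrics": metrics}
    metadata_store.append(record)
    return record

run = log_run(dataset=[1, 2, 3],
              params={"lr": 0.01, "epochs": 5},
              metrics={"accuracy": 0.93})

# Provenance query: which logged runs were trained on this exact dataset?
same_data = [r for r in metadata_store
             if r["dataset_sha256"] == run["dataset_sha256"]]
print(len(same_data))  # 1
```

A real metadata store would persist these records and link them as a graph (dataset → featureset → model → deployment), which is where a database like ArangoDB comes in.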
Considerations for using NoSQL technology on your next IT project (Akmal Chaudhri)
The SlideShare view is not great, but the downloadable PDF file is just fine.
Originally presented at:
British Computer Society (BCS) SPA-270, London, UK, 6 February 2013
http://www.bcs-spa.org/cgi-bin/view/SPA/NoSqlDatabasesForBigData
Generating Executable Mappings from RDF Data Cube Data Structure Definitions (Christophe Debruyne)
Data processing is increasingly the subject of various internal and external regulations, such as GDPR which has recently come into effect. Instead of assuming that such processes avail of data sources (such as files and relational databases), we approach the problem in a more abstract manner and view these processes as taking datasets as input. These datasets are then created by pulling data from various data sources. Taking a W3C Recommendation for prescribing the structure of and for describing datasets, we investigate an extension of that vocabulary for the generation of executable R2RML mappings. This results in a top-down approach where one prescribes the dataset to be used by a data process and where to find the data, and where that prescription is subsequently used to retrieve the data for the creation of the dataset “just in time”. We argue that this approach to the generation of an R2RML mapping from a dataset description is the first step towards policy-aware mappings, where the generation takes into account regulations to generate mappings that are compliant. In this paper, we describe how one can obtain an R2RML mapping from a data structure definition in a declarative manner using SPARQL CONSTRUCT queries, and demonstrate it using a running example. Some of the more technical aspects are also described.
Reference: Christophe Debruyne, Dave Lewis, Declan O'Sullivan: Generating Executable Mappings from RDF Data Cube Data Structure Definitions. OTM Conferences (2) 2018: 333-350
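The flavor of the mapping-generation step can be sketched outside RDF. The paper derives R2RML from a data structure definition declaratively with SPARQL CONSTRUCT queries; the Python sketch below only mirrors the shape of that transformation, emitting a fragment of (real) R2RML vocabulary from an invented, dictionary-encoded definition with made-up table, property, and column names.

```python
# Hypothetical dictionary encoding of a data structure definition:
# a target table and the component properties bound to its columns.
dsd = {
    "table": "sales",
    "components": [
        {"property": "ex:region", "column": "region"},
        {"property": "ex:amount", "column": "amount"},
    ],
}

def to_r2rml(dsd):
    # Emit a Turtle-like R2RML fragment: one triples map with a logical table
    # and one predicate-object map per component.
    lines = [f'<#Map> rr:logicalTable [ rr:tableName "{dsd["table"]}" ];']
    for c in dsd["components"]:
        lines.append(
            f'  rr:predicateObjectMap [ rr:predicate {c["property"]}; '
            f'rr:objectMap [ rr:column "{c["column"]}" ] ];'
        )
    return "\n".join(lines)

print(to_r2rml(dsd))
```

The key property, as in the paper, is that the mapping is *generated* from the dataset description rather than written by hand, so a policy-aware generator could refuse or rewrite components before the mapping is executed.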
Graph Databases, The Web of Data Storage Engines (Pere Urbón-Bayes)
Graph databases are a type of database that uses graph structures with nodes, edges and properties to represent and store information. They are distinct from specialized graph databases like triple stores and network databases. Some key graph database vendors include Neo4j, InfiniteGraph and OrientDB. Graph databases are well suited for applications that involve relationships, like recommendations, social networks, knowledge graphs and location-based services.
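A small sketch of why relationship-heavy workloads such as recommendations suit the graph model: with nodes and edges held as an adjacency structure, a "people you may know" suggestion is a two-hop traversal rather than a multi-way relational join. The social graph below is invented.

```python
# Adjacency representation of a tiny undirected social graph.
friends = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "dan"},
    "cat": {"ann", "dan"},
    "dan": {"bob", "cat", "eve"},
    "eve": {"dan"},
}

def recommend(person):
    # Two-hop traversal: friends of friends, minus existing friends and oneself.
    candidates = set().union(*(friends[f] for f in friends[person]))
    return sorted(candidates - friends[person] - {person})

print(recommend("ann"))  # ['dan']
```

In a graph database the same traversal is a one-line query (e.g. a MATCH pattern in Cypher), and the engine follows edges directly instead of re-joining tables.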
Out of the box, Accumulo's strengths are difficult to appreciate without first building an application that showcases its capabilities to handle massive amounts of data. Unfortunately, building such an application is non-trivial for many would-be users, which affects Accumulo's adoption.
In this talk, we introduce Datawave, a complete ingest, query, and analytic framework for Accumulo. Datawave, recently open-sourced by the National Security Agency, capitalizes on Accumulo's capabilities, provides an API for working with structured and unstructured data, and boasts a robust, flexible, and scalable backend.
We'll do a deep dive into Datawave's project layout, table structures, and APIs in addition to demonstrating the Datawave quickstart—a tool that makes it incredibly easy to hit the ground running with Accumulo and Datawave without having to develop a complete application.
IEEE IRI 16 - Clustering Web Pages based on Structure and Style Similarity (Thamme Gowda)
The structural similarity of HTML pages is measured using a tree edit distance on their DOM trees. The stylistic similarity is measured using Jaccard similarity on CSS class names. An aggregated similarity measure combines the structural and stylistic measures, and a clustering method is then applied to it to group the documents.
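The aggregated measure can be sketched compactly. The paper's structural component is a tree edit distance on DOM trees; the sketch below substitutes a much cruder Jaccard over DOM tag multiset stand-ins for that component, while the CSS-class Jaccard matches the description above. The weight and the two toy pages are invented.

```python
def jaccard(a, b):
    # Jaccard similarity of two collections treated as sets.
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0

def similarity(page1, page2, w_struct=0.5):
    struct = jaccard(page1["tags"], page2["tags"])  # crude stand-in for tree edit distance
    style = jaccard(page1["css"], page2["css"])     # stylistic similarity, as in the paper
    return w_struct * struct + (1 - w_struct) * style  # aggregated measure

p1 = {"tags": ["html", "body", "div", "p"], "css": ["nav", "article", "footer"]}
p2 = {"tags": ["html", "body", "div", "ul"], "css": ["nav", "article", "sidebar"]}
print(similarity(p1, p2))  # ~0.55
```

A pairwise matrix of such scores is exactly the input the clustering stage (shared near neighbor clustering, in the follow-up talk) consumes.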
Clustering output of Apache Nutch using Apache Spark (Thamme Gowda)
This document discusses clustering the output of Apache Nutch web pages using Apache Spark. It presents structural and style similarity measures to group similar web pages based on their DOM structure and CSS styles. Shared near neighbor clustering is implemented on the Spark GraphX library to cluster the web pages based on a similarity matrix without prior knowledge of cluster sizes or shapes. A demo is provided to visualize the clustered results.
ROI in Linking Content to CRM by Applying the Linked Data Stack (Martin Voigt)
Today, decision makers in enterprises have to rely more and more on a variety of data sets that are available internally but also externally, in heterogeneous formats. Therefore, intelligent processes are required to build an integrated knowledge base. Unfortunately, the adoption of the Linked Data lifecycle within enterprises, which targets the extraction, interlinking, publishing, and analytics of distributed data, lags behind the public domain due to missing frameworks that are efficient to deploy and easy to use. In this paper, we present our adoption of the lifecycle through our generic, enterprise-ready Linked Data workbench. To judge its benefits, we describe its application within a real-world Customer Relationship Management scenario. It shows (1) that sales employees could significantly reduce their workload and (2) that the integration of sophisticated Linked Data tools comes with a clearly positive Return on Investment.
Considerations for using NoSQL technology on your next IT project (Akmal Chaudhri)
The SlideShare view is not great, but the downloadable PDF file is just fine.
Originally presented at:
NoSQL Roadshow Amsterdam, The Netherlands, 29 November 2012
http://nosqlroadshow.com/nosql-amsterdam-2012/speaker/Akmal+B.+Chaudhri
The document discusses MySQL Document Store, which allows users to work with both SQL relational tables and schema-less JSON collections using CRUD operations. It provides the flexibility of NoSQL with the consistency of a relational database management system (RDBMS). Key components include the MySQL Shell, X DevAPI, X Protocol, and X Plugin. The X Plugin enables communication using the X Protocol for both relational and document operations, mapping CRUD operations to tables.
Elephants vs. Dolphins: Comparing PostgreSQL and MySQL for use in the DoD (Jamey Hanson)
- PostgreSQL and MySQL are both relational database management systems that store data in tables and views and allow users to interact with data using SQL. However, PostgreSQL supports additional features like native NoSQL capabilities, geospatial extensions, and a wider range of programming languages.
- While PostgreSQL and MySQL have similar basic functionality, PostgreSQL has a broader mission beyond the relational model and aims to support multiple data models and a fuller implementation of the SQL standard.
- For projects that may utilize more advanced features over their lifespan, such as NoSQL, geospatial analysis, or custom data types, PostgreSQL may be the better long-term choice compared to MySQL.
The document discusses taming the size and cardinality of OLAP data cubes over big data. It introduces OLAP cubes and data warehouse architectures. It also discusses benchmarks like TPC-H and how TPC-H*d was created to turn TPC-H into a multi-dimensional benchmark by making some schema changes and adding MDX workloads. AutoMDB is presented as an open source tool that can parse multi-dimensional schemas, compare and merge cubes, and generate new schemas.
If NoSQL is your answer, you are probably asking the wrong question (Lukas Smith)
This session is not about bad-mouthing MongoDB, CouchDB, big data, map-reduce, or any of the other more recent additions to the database buzzword bingo. Instead it is about looking at how NoSQL is a confusing term and offering a more realistic assessment of how old and new approaches in databases impact today's architectures...
The document provides an overview of SQL Saturday in San Diego (#909) and introduces SSIS and third-party components. It begins with welcoming attendees and providing background on the speaker. An agenda is then outlined covering introductions to ETL, SSIS, creating a simple SSIS package, new features in SSIS 2017, and testing third-party SSIS components Melissa and COZYROC. The document demonstrates concepts through examples and provides information on SSIS architecture, tools, and comparisons to other technologies.
This document discusses GraphDB connectors, which implement advanced search scenarios for GraphDB. It describes how the connectors allow real-time synchronization with external data stores, support for property chains and full-text search, and integration with SPARQL queries. The connectors are designed to optimize the filtering of large, complex models to retrieve results fast.
NoSQL Database: Classification, Characteristics and Comparison (Mayuree Srikulwong)
My students' presentation of a paper "NoSQL Database: New Era of Databases for Big Data Analytics - Classification, Characteristics and Comparison" by Moniruzzaman, A.B.M. and Hossain, S.A. (2013).
PostgreSQL versus MySQL - What Are The Real Differences (All Things Open)
PostgreSQL versus MySQL differences
- PostgreSQL has better SQL standard support and is governed by mailing list consensus while MySQL is governed by Oracle. Both have active communities.
- While both are popular open-source relational database management systems, their differences in features like transactions, indexing, and replication can drastically impact how projects operate.
- This presentation will provide an overview of the similarities between PostgreSQL and MySQL, highlight important areas where they differ, and help attendees decide which database is better suited for their needs.
DBMS market trends, focused on graph DBMSs: the benefits of graph databases, their forecasted growth rate, and advice from a renowned market research institute.
MySQL 5.7 is GA. Here is the news about our NoSQL features in MySQL and MySQL Cluster, with a lot of emphasize on the new JSON features that make MySQL suitable as a document store.
The document discusses next generation data warehousing and business intelligence (BI) analytics. It outlines some of the challenges with scaling traditional BI systems to handle large and growing volumes of data. It then proposes using a massively parallel processing (MPP) database like Greenplum to enable scalable dataflow and embed analytics processing directly into the data warehouse. This would help address issues of data volume, processing time, and refreshing aggregated data for analytics servers. It presents an application profile for typical BI systems and discusses Greenplum's scaling technology using parallel queries and data streams. Finally, it introduces the draft gNet API for implementing parallel dataflows and analytics procedures directly in the MPP database.
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021StreamNative
You may be familiar with the Presto plugin used to run fast interactive queries over Pulsar using ANSI SQL and can be joined with other data sources. This plugin will soon get a rename to align with the rename of the PrestoSQL project to Trino. What is the purpose of this rename and what does it mean for those using the Presto plugin? We cover the history of the community shift from PrestoDB to PrestoSQL, as well as, the future plans for the Pulsar community to donate this plugin to the Trino project. One of the connector maintainers will then demo the connector and show what is possible when using Trino and Pulsar!
The document discusses .NET development options for SQL Server developers. It provides an overview of .NET and C# capabilities and tools. It then tells the story of a SQL Server developer exploring .NET solutions to expand his abilities. These include using Entity Framework, ASP.NET MVC, and accessing SQL Server via libraries. The document encourages SQL developers to look beyond just SQL to leverage the full capabilities of .NET.
1) The document discusses the differences between SQL and NoSQL databases in terms of scalability, data modeling, and indexing. SQL databases are less scalable but ensure consistency and transactions, while NoSQL databases are more scalable through replication and sharding.
2) Complex applications may require a hybrid approach using both SQL and NoSQL databases. For example, storing product data in a NoSQL database and customer relationship management data in a SQL database.
3) There is no single best approach - the optimal solution depends on the specific business needs and data usage patterns. Both SQL and NoSQL databases each have their own advantages, and either can be suitable depending on the context.
This document provides an overview of IoT databases and time series data. It discusses different database types and popular IoT database solutions like InfluxDB and TimescaleDB. Implementations and demos of these databases are shown, including writing and querying time series data. Challenges of IoT databases are also mentioned.
The Best of Both Worlds: Unlocking the Power of (big) Knowledge Graphs with S...Gezim Sejdiu
Over the past decade, vast amounts of machine-readable structured information have become available through the automation of research processes as well as the increasing popularity of knowledge graphs and semantic technologies.
A major and yet unsolved challenge that research faces today is to perform scalable analysis of large scale knowledge graphs in order to facilitate applications like link prediction, knowledge base completion, and question answering.
Most machine learning approaches, which scale horizontally (i.e. can be executed in a distributed environment) work on simpler feature vector based input rather than more expressive knowledge structures.
On the other hand, the learning methods which exploit the expressive structures, e.g. Statistical Relational Learning and Inductive Logic Programming approaches, usually do not scale well to very large knowledge bases owing to their working complexity.
This talk gives an overview of the ongoing project Semantic Analytics Stack (SANSA) which aims to bridge this research gap by creating an out of the box library for scalable, in-memory, structured learning.
A comparison of relational and graph model theories, with an eye towards DataStax's implementation of Graph. Note: I'm working on a concise, formal mathematical definition of relational, based on Codd's 1970 paper. (Thanks to Artem Chebotko for suggesting this.)
Micro-architectural Characterization of Apache Spark on Batch and Stream Proc...Ahsan Javed Awan
Micro-architectural performance is generally consistent between batch and stream processing workloads in Spark if they only differ in micro-batching. DataFrames show improved instruction retirement and reduced stalls compared to RDDs. Higher data velocities can improve CPU utilization and reduce stalls, while increasing bandwidth consumption and instruction retirement. The size of micro-batches in stream workloads determines their micro-architectural behavior.
What is NoSQL? How does it come to the picture? What are the types of NoSQL? Some basics of different NoSQL types? Differences between RDBMS and NoSQL. Pros and Cons of NoSQL.
What is MongoDB? What are the features of MongoDB? Nexus architecture of MongoDB. Data model and query model of MongoDB? Various MongoDB data management techniques. Indexing in MongoDB. A working example using MongoDB Java driver on Mac OSX.
Analysis insight about a Flyball dog competition team's performanceroli9797
Insight of my analysis about a Flyball dog competition team's last year performance. Find more: https://github.com/rolandnagy-ds/flyball_race_analysis/tree/main
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Discussion on Vector Databases, Unstructured Data and AI
https://www.meetup.com/unstructured-data-meetup-new-york/
This meetup is for people working in unstructured data. Speakers will come present about related topics such as vector databases, LLMs, and managing data at scale. The intended audience of this group includes roles like machine learning engineers, data scientists, data engineers, software engineers, and PMs.This meetup was formerly Milvus Meetup, and is sponsored by Zilliz maintainers of Milvus.
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataKiwi Creative
Harness the power of AI-backed reports, benchmarking and data analysis to predict trends and detect anomalies in your marketing efforts.
Peter Caputa, CEO at Databox, reveals how you can discover the strategies and tools to increase your growth rate (and margins!).
From metrics to track to data habits to pick up, enhance your reporting for powerful insights to improve your B2B tech company's marketing.
- - -
This is the webinar recording from the June 2024 HubSpot User Group (HUG) for B2B Technology USA.
Watch the video recording at https://youtu.be/5vjwGfPN9lw
Sign up for future HUG events at https://events.hubspot.com/b2b-technology-usa/
Global Situational Awareness of A.I. and where its headedvikram sood
You can see the future first in San Francisco.
Over the past year, the talk of the town has shifted from $10 billion compute clusters to $100 billion clusters to trillion-dollar clusters. Every six months another zero is added to the boardroom plans. Behind the scenes, there’s a fierce scramble to secure every power contract still available for the rest of the decade, every voltage transformer that can possibly be procured. American big business is gearing up to pour trillions of dollars into a long-unseen mobilization of American industrial might. By the end of the decade, American electricity production will have grown tens of percent; from the shale fields of Pennsylvania to the solar farms of Nevada, hundreds of millions of GPUs will hum.
The AGI race has begun. We are building machines that can think and reason. By 2025/26, these machines will outpace college graduates. By the end of the decade, they will be smarter than you or I; we will have superintelligence, in the true sense of the word. Along the way, national security forces not seen in half a century will be un-leashed, and before long, The Project will be on. If we’re lucky, we’ll be in an all-out race with the CCP; if we’re unlucky, an all-out war.
Everyone is now talking about AI, but few have the faintest glimmer of what is about to hit them. Nvidia analysts still think 2024 might be close to the peak. Mainstream pundits are stuck on the wilful blindness of “it’s just predicting the next word”. They see only hype and business-as-usual; at most they entertain another internet-scale technological change.
Before long, the world will wake up. But right now, there are perhaps a few hundred people, most of them in San Francisco and the AI labs, that have situational awareness. Through whatever peculiar forces of fate, I have found myself amongst them. A few years ago, these people were derided as crazy—but they trusted the trendlines, which allowed them to correctly predict the AI advances of the past few years. Whether these people are also right about the next few years remains to be seen. But these are very smart people—the smartest people I have ever met—and they are the ones building this technology. Perhaps they will be an odd footnote in history, or perhaps they will go down in history like Szilard and Oppenheimer and Teller. If they are seeing the future even close to correctly, we are in for a wild ride.
Let me tell you what we see.
Codeless Generative AI Pipelines
(GenAI with Milvus)
https://ml.dssconf.pl/user.html#!/lecture/DSSML24-041a/rate
Discover the potential of real-time streaming in the context of GenAI as we delve into the intricacies of Apache NiFi and its capabilities. Learn how this tool can significantly simplify the data engineering workflow for GenAI applications, allowing you to focus on the creative aspects rather than the technical complexities. I will guide you through practical examples and use cases, showing the impact of automation on prompt building. From data ingestion to transformation and delivery, witness how Apache NiFi streamlines the entire pipeline, ensuring a smooth and hassle-free experience.
Timothy Spann
https://www.youtube.com/@FLaNK-Stack
https://medium.com/@tspann
https://www.datainmotion.dev/
milvus, unstructured data, vector database, zilliz, cloud, vectors, python, deep learning, generative ai, genai, nifi, kafka, flink, streaming, iot, edge
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Aggregage
This webinar will explore cutting-edge, less familiar but powerful experimentation methodologies which address well-known limitations of standard A/B Testing. Designed for data and product leaders, this session aims to inspire the embrace of innovative approaches and provide insights into the frontiers of experimentation!
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeWalaa Eldin Moustafa
Dynamic policy enforcement is becoming an increasingly important topic in today’s world where data privacy and compliance is a top priority for companies, individuals, and regulators alike. In these slides, we discuss how LinkedIn implements a powerful dynamic policy enforcement engine, called ViewShift, and integrates it within its data lake. We show the query engine architecture and how catalog implementations can automatically route table resolutions to compliance-enforcing SQL views. Such views have a set of very interesting properties: (1) They are auto-generated from declarative data annotations. (2) They respect user-level consent and preferences (3) They are context-aware, encoding a different set of transformations for different use cases (4) They are portable; while the SQL logic is only implemented in one SQL dialect, it is accessible in all engines.
#SQL #Views #Privacy #Compliance #DataLake
State of Artificial intelligence Report 2023kuntobimo2016
Artificial intelligence (AI) is a multidisciplinary field of science and engineering whose goal is to create intelligent machines.
We believe that AI will be a force multiplier on technological progress in our increasingly digital, data-driven world. This is because everything around us today, ranging from culture to consumer products, is a product of intelligence.
The State of AI Report is now in its sixth year. Consider this report as a compilation of the most interesting things we’ve seen with a goal of triggering an informed conversation about the state of AI and its implication for the future.
We consider the following key dimensions in our report:
Research: Technology breakthroughs and their capabilities.
Industry: Areas of commercial application for AI and its business impact.
Politics: Regulation of AI, its economic implications and the evolving geopolitics of AI.
Safety: Identifying and mitigating catastrophic risks that highly-capable future AI systems could pose to us.
Predictions: What we believe will happen in the next 12 months and a 2022 performance review to keep us honest.
A presentation that explain the Power BI Licensing
Pg no sql_beatemjoinem_v10
1. If you can't Beat 'em, Join 'em!
Integrating NoSQL data elements into a relational model
This presentation and SQL file are on SlideShare and PGCon2015
http://www.slideshare.net/jamesphanson/pg-no-sqlbeatemjoinemv10sql
Jamey Hanson
jhanson@freedomconsultinggroup.com
jamesphanson@yahoo.com
@jamey_hanson
Freedom Consulting Group
http://www.freedomconsultinggroup.com
PGCon 2015,
Ottawa, ON CA
19-Jun-2015
2. NoSQL is not magic, not a panacea, not 42, and not a hot mess.
It is really useful in some situations, not applicable in
other situations - and here to stay.
Q: How can PostgreSQL thrive in a mixed
NoSQL environment?
A: By integrating NoSQL data types and features
plus understanding where PostgreSQL is - and is
not - a good fit.
NoSQL hype check
3. Stages of NoSQL acceptance …
NoSQL
RDBMS
4. Given that I have PostgreSQL,
how can I leverage NoSQL data?
Given that I have NoSQL data,
how can I leverage PostgreSQL?
Reverse the traditional approach
5. The framework and approach come from
Martin Fowler's book NoSQL Distilled
and his term
Polyglot Persistence
6. Jamey Hanson
jhanson@freedomconsultinggroup.com
jamesphanson@yahoo.com
Manage a team for Freedom Consulting Group migrating
applications from Oracle to Postgres Plus Advanced Server and
PostgreSQL in the government space. We are subcontracting to
EnterpriseDB
Overly certified: PMP, CISSP, CSEP, OCP in 5 versions of Oracle,
Cloudera developer & admin. Used to be NetApp admin and
MCSE. I teach PMP and CISSP at the Univ. MD training center
Alumnus of multiple schools and was C-130 aircrew
About the author
PGConf US, NYC 26-Mar-2015
7. } Document store (MongoDB)
} Wide column store (Cassandra)*
a.k.a. Column family database
} Key-value store (Redis)
} Graph DBMS (Neo4j)
} Search engine (Solr)**
Categories adapted from db-engines.com and
Martin Fowler, martinfowler.com/nosql.html
* Not covered in this presentation.
** See PGConf 2015 NYC "Full Text Search with Ranked Results"
What is NoSQL … for the next 45 min?
8. } Too much data for a single machine to process.
RDBMS is a "single-or-few" machine architecture.*
NoSQL has expectations of sharding across a large cluster
while RDBMS does not.*
} Shard – divide database into aggregates of related business data
and spread the entire database across a cluster.
Sharding requires changes to application design.
* Sharded RDBMSs have been developed but they are more
difficult than NoSQL sharding and have not been as successful.
Why NoSQL (vs. RDBMS/PostgreSQL)? 1
9. } Because RDBMSs store data in very small pieces in lots of
different places, which does not match object-oriented
methodology and is inconvenient for developers.
} A.k.a. Object-relational Impedance Mismatch, which is handled
by Object Relational Mapping (ORM) tools such as Hibernate.
Why NoSQL (vs. RDBMS/PostgreSQL)? 2
RDBMS
NoSQL
10. } Because RDBMS data models are difficult to modify as
data structures and business needs change.
} RDBMS models must be consistent for all the data in a table. It
is not possible to have legacy data use one structure, new data
use a different structure and keep them all in the same table(s).
Why NoSQL (vs. RDBMS/PostgreSQL)? 3
11. } Because RDBMS is the incumbent.
} Installed everywhere, widely understood, mature technology
that still has active development – such as this conference.
} Because RDBMS have transactions and a consistent view
of the data.
} Sometimes you need to change a small piece of data and you
need every connection to see that change instantly.
} Because most data sets are not Google-sized.
} A single machine easily can process terabytes of data.
} Because RDBMS are better at finding relationships* and
enforcing data integrity. (*Graph databases are an exception.)
} Sometimes you want to stop bad-data from loading.
Why RDBMS/PostgreSQL (vs. NoSQL)?
12. } A: With Polyglot Persistence*.
Organizations will have relational and NoSQL databases
… our job is to match the business needs + data to the
technology.
* Polyglot Persistence was coined by Martin Fowler.
It refers to using multiple database tools and architectures.
} This presentation is about identifying where PostgreSQL
is a great fit and demonstrating how to integrate NoSQL
data into PostgreSQL's relational model.
PostgreSQL can thrive – not just survive –
in a world that includes NoSQL.
Q: Where does this leave us?
13. PostgreSQL NoSQL data sweet spot
[Chart: x-axis = Amount of data ("does not fit on one machine" … "does not fit on a few machines"); y-axis = Size of business transactions ("need to change individual data elements", "document sized", "geographic region"). The smaller-data, smaller-transaction region is PostgreSQL's NoSQL sweet spot! The rest belongs to NoSQL tools.]
14. } Scenario: You are given ~ 1 million JSON files and need to
decide how to handle them. (i.e. Do I need MongoDB?)
} Q: Can you process this data on a single/few servers?
A: Easily!
} Q: How large are the business transactions?
(element-level, document-level or other?)
A: I have no idea.*
} Perfect for PostgreSQL integrating NoSQL data.
* PostgreSQL can handle most answers.
NoSQL can (generally) only handle document-level or larger transactions.
On to the technical parts …
15. } JavaScript Object Notation:
A widely accepted, human readable open standard for
transmitting optionally-nested key-value pairs.
{
"firstName": "John",
"lastName": "Smith",
"isAlive": true,
"age": 25,
"address": {
"streetAddress": "21 2nd Street",
"city": "New York",
"state": "NY",
"postalCode": "10021-3100"
}...
What is JSON?
16. Also applies to XML … but it's not as cool
17. } 10,000 files from the Million Song Database (MSD)
} http://labrosa.ee.columbia.edu/millionsong/lastfm
} http://labrosa.ee.columbia.edu/millionsong/sites/default/files/
lastfm/lastfm_subset.zip
} Each file includes the song's:
} track_id
} artist
} title
} 0-N key-value pairs of similar track_id's and weights.
} 0-N key-value pairs of song tags and weights.
What's in our JSON files?
18. } Create a table with JSONB* data type.
CREATE TABLE j_songs (
id SERIAL PRIMARY KEY,
song JSONB
);
} Use COPY command to load each file.
COPY j_songs (song) FROM '/NoSQL/TRAAFD.json'
CSV QUOTE e'\x01' DELIMITER e'\x02';
NOTE: The e'\x01' parameter handles embedded quotes.
*There are very few reasons to use JSON over JSONB
How do I load JSON files into PostgreSQL?
19. } Requires some Linux work … but not too bad.
} Extract JSON files into OS postgres's ~/NoSQL
$ unzip ~/NoSQL/lastfm_subset.zip
} Create a symbolic link for each JSON file from ~/NoSQL
to $PGDATA/ExtFiles
$ find ~/NoSQL/lastfm_subset -name '*.json' |
xargs -I {} ln -s {} $PGDATA/ExtFiles/
How do I load JSON files into PostgreSQL?
20. } Use pg_ls_dir* to generate the file-loading SQL
SELECT 'COPY nosql.j_songs(song) FROM
''ExtFiles/' || pg_ls_dir('ExtFiles') ||
''' CSV QUOTE e''\x01''
DELIMITER e''\x02'';';
* pg_ls_dir can list directory contents under $PGDATA.
This is why we created symbolic links under $PGDATA.
We could also have run COPY against the files in their original location.
How do I load JSON files into PostgreSQL?
21. Switch to SQL interactive
Preparing and loading JSON files.
NOTE: The SQL statements are in the file PG_NOSQL_BeatEmJoin_vXX.sql,
which is posted on SlideShare and the PGCon web page.
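The generate-and-run step can be sketched in one pass. This is a sketch, not the deck's companion .sql file: it assumes the symbolic links exist under $PGDATA/ExtFiles and uses psql's \gexec meta-command (added in psql 9.6, after this talk) to execute each generated COPY statement.

```sql
-- Build one COPY statement per linked file, then let psql run them.
-- format(%L) safely quotes the per-file path as a SQL literal.
SELECT format(
    'COPY j_songs (song) FROM %L CSV QUOTE e''\x01'' DELIMITER e''\x02'';',
    'ExtFiles/' || fname)
FROM pg_ls_dir('ExtFiles') AS fname
\gexec
```

On a 2015-era psql, one would instead \o the generated statements to a file and \i it back.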
22. } Return tags
SELECT DISTINCT jsonb_object_keys(song)...
} Return values
SELECT
song ->> 'title' AS title, -- returns TEXT
song -> 'artist' AS artist -- returns JSONB
FROM j_songs ...
} Match tags
WHERE song @> '{"artist":"Arctic Monkeys"}'::JSONB
Exploring JSON data
23. Switch to SQL interactive
Exploring JSON data and indexing
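The indexing half of that session can be sketched as below (the index name is illustrative). jsonb_path_ops is a GIN operator class specialized for the @> containment operator.

```sql
-- GIN index that accelerates JSONB containment (@>) searches
CREATE INDEX idx_j_songs_song ON j_songs
    USING GIN (song jsonb_path_ops);

-- With the index in place, containment queries need not scan the table
SELECT song ->> 'title' AS title
FROM j_songs
WHERE song @> '{"artist": "Arctic Monkeys"}'::JSONB;
```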
24. } Some interfaces – and a lot of existing code – require an
RDBMS structure.
} JPA (a.k.a. Hibernate) SQL cannot interact with non-RDBMS
structures.
} Present JSON data as a view or materialized view.
CREATE OR REPLACE VIEW v_songs AS
SELECT
song ->> 'track_id' AS track_id,
song ->> 'artist' AS artist,
song ->> 'title' AS title
FROM j_songs;
Present JSON as an RDBMS relation
25. Switch to SQL interactive
Presenting JSON as a view and/or
materialized view
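A minimal sketch of the materialized-view flavor (the name mv_songs is illustrative): unlike a plain view, it stores the extracted rows, so it can be indexed, and it must be refreshed to pick up newly loaded songs.

```sql
CREATE MATERIALIZED VIEW mv_songs AS
SELECT
    song ->> 'track_id' AS track_id,
    song ->> 'artist'   AS artist,
    song ->> 'title'    AS title
FROM j_songs;

-- Index it like any table; refresh after loading more JSON files
CREATE UNIQUE INDEX ON mv_songs (track_id);
REFRESH MATERIALIZED VIEW mv_songs;
```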
26. } The tags and similars JSON elements contain
arrays of key-value-pairs.
} Convert them to HSTORE
} track_id, artist and title are present in every
JSON file – so we can turn them into columns.
CREATE TABLE h_songs (
track_id TEXT PRIMARY KEY,
artist TEXT,
title TEXT,
tags HSTORE,
similars HSTORE);
Transform tags and similars to HSTORE
27. } Return elements in the arrays with
jsonb_array_elements
jsonb_array_elements (song -> 'tags') ->> 0
AS tag_key,
} Build the HSTORE column with HSTORE and
array_agg operators.
HSTORE (array_agg(tag_key), array_agg(tag_value))
Use JSON and HSTORE operators to convert
28. Switch to SQL interactive
Convert from JSON to HSTORE
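Putting the pieces from the last two slides together, the conversion might look like this sketch. It assumes each element of tags and similars is a two-element JSON array of [key, value], as in the MSD files, and that the hstore extension is installed.

```sql
-- CREATE EXTENSION IF NOT EXISTS hstore;  -- needed once per database
INSERT INTO h_songs (track_id, artist, title, tags, similars)
SELECT
    song ->> 'track_id',
    song ->> 'artist',
    song ->> 'title',
    -- aggregate the [key, value] arrays into parallel text arrays,
    -- then zip them into an HSTORE
    (SELECT HSTORE(array_agg(t ->> 0), array_agg(t ->> 1))
       FROM jsonb_array_elements(song -> 'tags') AS t),
    (SELECT HSTORE(array_agg(s ->> 0), array_agg(s ->> 1))
       FROM jsonb_array_elements(song -> 'similars') AS s)
FROM j_songs;
```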
29. } Select records with a specific tag and value.
SELECT
artist,
title,
tags -> 'latin'
FROM h_songs
WHERE
tags ? 'latin'
and (tags -> 'latin')::INTEGER > 67;
HSTORE also has operators and indexes
30. Switch to SQL interactive
Explore and index HSTORE key-value pairs
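The index for that session can be sketched as follows (index name illustrative). The default GIN operator class for HSTORE supports both the ? key-existence and @> containment operators.

```sql
CREATE INDEX idx_h_songs_tags ON h_songs USING GIN (tags);

-- The key-existence test (?) can use the index;
-- the weight comparison is applied to the matching rows
SELECT artist, title, (tags -> 'latin')::INTEGER AS latin_weight
FROM h_songs
WHERE tags ? 'latin'
  AND (tags -> 'latin')::INTEGER > 67;
```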
31. } The similars element (key-value pairs of
similar_song => weight) ~ edges on a graph.
} Neo4j is the leading NoSQL graph database.
Can also explore songs data as a graph
[Images: a Neo4j-sized graph vs. a PostgreSQL-sized graph]
32. } Example adapted from PostgreSQL documentation.
} Transform songs data format into:
} track_id
} similar_track_id (a.k.a. link)
} weight
} Filter our data set to 'rock' songs with relatively strong links
… because recursive queries are expensive.
} Answer the burning question:
Is there a path of related songs from Lady Gaga's "Poker Face" to
Justin Timberlake's "What Goes Around … Comes Around"?
Use recursive query to find path
33. Switch to SQL interactive
Explore recursive query and graphing
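The path search can be sketched with WITH RECURSIVE. This assumes an edge table rock_links(track_id, similar_track_id, weight) built from the filtered data as described on the previous slide; the start and target ids are placeholders, not the real MSD track ids.

```sql
WITH RECURSIVE walk (track_id, path, depth) AS (
    -- start at the placeholder Poker Face track
    SELECT 'TR_START'::TEXT, ARRAY['TR_START'::TEXT], 1
  UNION ALL
    -- follow similarity edges, avoiding cycles and capping depth
    SELECT l.similar_track_id,
           w.path || l.similar_track_id,
           w.depth + 1
    FROM walk AS w
    JOIN rock_links AS l ON l.track_id = w.track_id
    WHERE l.similar_track_id <> ALL (w.path)
      AND w.depth < 6
)
SELECT path
FROM walk
WHERE track_id = 'TR_TARGET'   -- placeholder target id
LIMIT 1;
```

The path array doubles as the cycle check, which is the standard pattern for graph walks in the PostgreSQL documentation.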
35. } Assertion: NoSQL is here to stay.
Our task is to thrive in a Polyglot Persistence world.
} 5-ish types of NoSQL databases.
PostgreSQL plays nice with 4 of them:
} Document store (MongoDB)
} Wide column store (Cassandra)
} Key-value store (Redis)
} Graph DBMS (Neo4j)
} Search engine (Solr)*
*See "Full Text Search with Ranked Results" from PGConf 2015.
Summary (1)
36. } PostgreSQL can load, interact with and present
NoSQL data in a relational structure.
} PostgreSQL's sweet spot:
} Data volume that fits well on one or a few servers.
} Transaction boundaries from element to document level.
} Want to enforce (some) referential integrity.
} Want to find relations within data.
NOTE: This is what most organizations call "my real data".
} Leave edge cases to NoSQL tools.
Summary (2)
37. } Database architecture rules are more subtle and
complex now.
} It is too simplistic to think If it's not in at least 3NF –
it's wrong.
} Know your business needs.
} Know your data.
} Know the strengths and limitations of the relational
model plus NoSQL.
Seek to thrive in a world of Polyglot Persistence.
Summary (3)
39. LIFE. LIBERTY. TECHNOLOGY.
Freedom Consulting Group is a talented, hard-working,
and committed partner, providing hardware, software
and database development and integration services
to a diverse set of clients.
“
”