This document summarizes different approaches to data warehousing including Inmon's 3NF model, Kimball's conformed dimensions model, Linstedt's data vault model, and Rönnbäck's anchor model. It discusses the challenges of data warehousing and provides examples of open source software that can be used to implement each approach including MySQL, PostgreSQL, Greenplum, Infobright, and Hadoop. Cautions are also noted for each methodology.
Open Source Data Warehousing Overview
1. Open Source Data Warehousing:
MySQL and Beyond
Alex Meadows
Twitter: @DBA_Alex
Percona MySQL University
Raleigh, NC
1/29/2013
2. What Is Data Warehousing?
● Central repository
● Oriented toward reporting and analysis
● Integrates multiple sources
● Core to Business Intelligence and Advanced Analytics
● Helps keep source systems clean and lean
3. Warehouse Methodologies
● Inmon’s 3NF/Hub and Spoke Model
● Kimball’s Conformed Dimension Model
● Linstedt’s Data Vault Model
● Rönnbäck’s Anchor Model/6NF
9. Kimball’s Conformed Dimensions
● Normal database modeling does not meet the needs of reporting and analysis
● Denormalize data
● Dimensions
    ● How does the data need to be filtered?
● Facts
    ● What do we want to analyze or measure?
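The dimension and fact split above can be sketched as a small star schema. This is only an illustration using Python's built-in sqlite3 module; the table names (dim_date, dim_product, fact_sales), columns, and sample rows are hypothetical and do not come from the slides.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimensions: how the data can be filtered or grouped (hypothetical tables).
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);

    -- Fact: what is measured, keyed by the dimensions.
    CREATE TABLE fact_sales (
        date_key INTEGER REFERENCES dim_date(date_key),
        product_key INTEGER REFERENCES dim_product(product_key),
        quantity INTEGER,
        amount REAL
    );

    INSERT INTO dim_date VALUES (20130101, 2013, 1), (20130201, 2013, 2);
    INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware'), (2, 'Manual', 'Books');
    INSERT INTO fact_sales VALUES
        (20130101, 1, 2, 20.0),
        (20130101, 2, 1, 5.0),
        (20130201, 1, 3, 30.0);
""")

# Filter on one dimension (date), group by another (product category),
# and aggregate a measure from the fact table.
sales_by_category = conn.execute("""
    SELECT p.category, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    JOIN dim_date AS d ON d.date_key = f.date_key
    WHERE d.year = 2013
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()

print(sales_by_category)
```

Each report query follows the same shape: join the fact table to the dimensions it needs, filter on dimension columns, and aggregate the measures.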
11. Open Source Software
● Greenplum (PostgreSQL derivative)
● InfiniDB (MySQL derivative)
● Infobright (MySQL derivative)
● Other columnar data stores
12. Columnar Data Stores
● Designed for conformed dimensions
● High Performance
● Self-indexing based on usage
● High compression of data
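As a rough illustration of why a column layout compresses well, the sketch below pivots a few rows into columns and run-length encodes one of them. Run-length encoding is just one common technique chosen for the example; the slides do not say which methods any particular engine uses, and the sample data is made up.

```python
from itertools import groupby

# Row layout: each record is stored together (hypothetical sample data).
rows = [
    ("2013-01-01", "NC", 10),
    ("2013-01-01", "NC", 12),
    ("2013-01-02", "NC", 7),
    ("2013-01-02", "VA", 9),
]

# Column layout: all values of one column are stored together.
names = ("sale_date", "state", "quantity")
columns = {name: [row[i] for row in rows] for i, name in enumerate(names)}


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# Repeated neighbouring values in a column shrink to a few pairs.
encoded_state = run_length_encode(columns["state"])

# An aggregate only has to read the one column it needs.
total_quantity = sum(columns["quantity"])

print(encoded_state)
print(total_quantity)
```

Low-cardinality columns such as dates or states tend to benefit most, because similar values sit next to each other once the data is stored by column.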
13. Row vs Columnar Databases
Source: http://dbbest.com/blog/column-oriented-database-technologies/
14. Cautions
● Traditional RDBMS
    ● Not built for conformed dimensions!
    ● Performance will become an issue
15. Inmon’s Hub and Spoke
● Combines
    ● 3NF central data warehouse
    ● Conformed dimensions
● Becomes foundation for further variants
16. Linstedt’s Data Vault Model
● Mixes 3NF and Conformed Dimensions
● Model data per business entities and their relationships
● Hubs
    ● Store unique business entity identifiers (keys)
● Links
    ● Relate hubs and other links to form relationships
● Satellites
    ● Store unique information regarding an entity or relationship
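A minimal sketch of hubs, links, and satellites, again using Python's sqlite3 module. All table and column names are hypothetical; the load_date and record_source columns follow a common Data Vault convention and are an assumption here, not something shown in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hubs: one row per unique business key.
    CREATE TABLE hub_customer (
        customer_hk INTEGER PRIMARY KEY,
        customer_number TEXT NOT NULL UNIQUE,
        load_date TEXT NOT NULL,
        record_source TEXT NOT NULL
    );
    CREATE TABLE hub_product (
        product_hk INTEGER PRIMARY KEY,
        product_number TEXT NOT NULL UNIQUE,
        load_date TEXT NOT NULL,
        record_source TEXT NOT NULL
    );

    -- Link: relates hubs to form a relationship (a purchase).
    CREATE TABLE link_purchase (
        purchase_hk INTEGER PRIMARY KEY,
        customer_hk INTEGER NOT NULL REFERENCES hub_customer(customer_hk),
        product_hk INTEGER NOT NULL REFERENCES hub_product(product_hk),
        load_date TEXT NOT NULL,
        record_source TEXT NOT NULL
    );

    -- Satellite: descriptive attributes of a hub, one row per change.
    CREATE TABLE sat_customer (
        customer_hk INTEGER NOT NULL REFERENCES hub_customer(customer_hk),
        load_date TEXT NOT NULL,
        name TEXT,
        city TEXT,
        PRIMARY KEY (customer_hk, load_date)
    );

    INSERT INTO hub_customer VALUES (1, 'C-100', '2013-01-01', 'crm');
    INSERT INTO hub_product VALUES (1, 'P-1', '2013-01-01', 'erp');
    INSERT INTO link_purchase VALUES (1, 1, 1, '2013-01-15', 'erp');
    INSERT INTO sat_customer VALUES
        (1, '2013-01-01', 'Ada', 'Raleigh'),
        (1, '2013-02-01', 'Ada', 'Durham');
""")

# Current view: newest satellite row per customer, joined through the link
# to the products that customer purchased.
current_purchases = conn.execute("""
    SELECT h.customer_number, s.city, p.product_number
    FROM hub_customer AS h
    JOIN sat_customer AS s ON s.customer_hk = h.customer_hk
    JOIN link_purchase AS l ON l.customer_hk = h.customer_hk
    JOIN hub_product AS p ON p.product_hk = l.product_hk
    WHERE s.load_date = (
        SELECT MAX(load_date) FROM sat_customer WHERE customer_hk = h.customer_hk
    )
""").fetchall()

print(current_purchases)
```

Because the satellite keeps every version of the descriptive data, history is preserved without touching the hub or the link.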
18. Cautions
● While you get the best mix of 3NF and conformed dimensions, data marts are still needed
● Issues seen with both 3NF and conformed dimensions can also appear here
19. Open Source Software
● MySQL
● PostgreSQL
● Greenplum
● Other Traditional RDBMS
● NoSQL
● Hadoop
20. Rönnbäck’s Anchor Model/6NF
● Focus is on the data and its relationships
● Anchors
    ● Model entities and events
● Attributes
    ● Model properties of anchors
● Ties
    ● Model relationships between anchors
● Knots
    ● Model shared properties (for example, states or categories)
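The four building blocks can be sketched as tables with Python's sqlite3 module. The names below (anchor_customer, attr_customer_name, knot_status, tie_customer_order) are hypothetical and simplified; they are not the naming convention of any anchor modeling tool. The knot is treated here as a small lookup of shared values, which is an assumption about how the Knots bullet maps to tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Anchors: identities only, one per entity or event.
    CREATE TABLE anchor_customer (customer_id INTEGER PRIMARY KEY);
    CREATE TABLE anchor_order (order_id INTEGER PRIMARY KEY);

    -- Attribute: one table per property of an anchor.
    CREATE TABLE attr_customer_name (
        customer_id INTEGER PRIMARY KEY REFERENCES anchor_customer(customer_id),
        name TEXT NOT NULL
    );

    -- Knot: a small set of values shared by many rows.
    CREATE TABLE knot_status (
        status_id INTEGER PRIMARY KEY,
        status TEXT NOT NULL UNIQUE
    );

    -- Knotted attribute: points at the knot instead of repeating the value.
    CREATE TABLE attr_customer_status (
        customer_id INTEGER PRIMARY KEY REFERENCES anchor_customer(customer_id),
        status_id INTEGER NOT NULL REFERENCES knot_status(status_id)
    );

    -- Tie: relationship between anchors.
    CREATE TABLE tie_customer_order (
        customer_id INTEGER NOT NULL REFERENCES anchor_customer(customer_id),
        order_id INTEGER NOT NULL REFERENCES anchor_order(order_id),
        PRIMARY KEY (customer_id, order_id)
    );

    INSERT INTO anchor_customer VALUES (1), (2);
    INSERT INTO anchor_order VALUES (10), (11);
    INSERT INTO attr_customer_name VALUES (1, 'Ada'), (2, 'Bob');
    INSERT INTO knot_status VALUES (1, 'active'), (2, 'inactive');
    INSERT INTO attr_customer_status VALUES (1, 1), (2, 2);
    INSERT INTO tie_customer_order VALUES (1, 10), (1, 11);
""")

# Rebuilding a readable customer row needs one join per attribute,
# plus the knot lookup and the tie.
customers = conn.execute("""
    SELECT n.name, k.status, COUNT(t.order_id)
    FROM anchor_customer AS a
    JOIN attr_customer_name AS n ON n.customer_id = a.customer_id
    JOIN attr_customer_status AS s ON s.customer_id = a.customer_id
    JOIN knot_status AS k ON k.status_id = s.status_id
    LEFT JOIN tie_customer_order AS t ON t.customer_id = a.customer_id
    GROUP BY a.customer_id, n.name, k.status
    ORDER BY a.customer_id
""").fetchall()

print(customers)
```

Adding a new property later means adding a new attribute table; existing tables and queries stay untouched.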
22. Cautions
● Number of joins will be an issue for some databases
● Queries will become complex
    ● Joins
    ● Finding properties/valuable information
● Every column in a traditional table becomes its own table
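To see how the join count grows, the sketch below generates the query needed to rebuild one wide row when every column lives in its own table. The helper build_reassembly_query and the "anchor_attribute" table naming are hypothetical, made up for this illustration.

```python
def build_reassembly_query(anchor, attributes):
    """Build SQL that rebuilds one wide row from per-attribute tables.

    Assumes a hypothetical layout: a table named `<anchor>` holding ids and
    one table `<anchor>_<attribute>` per column, each with a `value` column.
    """
    select_list = ", ".join(f"{attr}.value AS {attr}" for attr in attributes)
    joins = "\n".join(
        f"LEFT JOIN {anchor}_{attr} AS {attr} "
        f"ON {attr}.{anchor}_id = a.{anchor}_id"
        for attr in attributes
    )
    return f"SELECT a.{anchor}_id, {select_list}\nFROM {anchor} AS a\n{joins}"


# Four columns in a traditional table turn into four joins.
query = build_reassembly_query("customer", ["name", "email", "city", "status"])
join_count = query.count("LEFT JOIN")

print(query)
print(join_count)
```

The number of joins equals the number of attributes requested, which is why engines that handle many joins poorly can struggle with this model.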