What is a Data Warehouse? OLTP vs. OLAP, Conceptual Modeling of Data Warehouses, Data Warehousing Components, Building a Data Warehouse, Mapping the Data Warehouse to a Multiprocessor Architecture, Database Architectures for Parallel Processing
Data marts, Types of Data Marts, Multidimensional Data Model, Fact table, Dimension table, Data Warehouse Schema, Star Schema, Snowflake Schema, Fact-Constellation Schema
Chapter 4. Data Warehousing and On-Line Analytical Processing.ppt, by Subrata Kumer Paul
Jiawei Han, Micheline Kamber and Jian Pei
Data Mining: Concepts and Techniques, 3rd ed.
The Morgan Kaufmann Series in Data Management Systems
Morgan Kaufmann Publishers, July 2011. ISBN 978-0123814791
- Agile values and manifesto
- Scrum in details
- Themes, epics, and user stories
- Combining and splitting user stories.
- What could go wrong in Scrum and why?
- Overview of other Agile methodologies:
- XP Agile Methodology
- Kanban Agile Methodology.
Search engine optimization course:
- Google ranking signals (overview)
- Spam types
- Google algorithms
- Panda
- Penguin
- Google tools
- Google Webmaster Tools
- Google PageSpeed Insights
- Google analytics
- Google disavow tool
- Other famous SEO tools
- Google penalty indicators
- Understanding website traffic sources
- Crawling and indexing
- On-page SEO checklist
- Common metadata errors
- How to optimize a blog article
- Yoast plugin
- Sitemap files
- Robots.txt files
- Off-page SEO
- Link types
- Good link building strategy
- Bad link building strategy
This presentation was given to some fresh graduate developers to help them understand how to protect their web apps against some famous attacks like XSS. The presentation was part of a bigger course designed to assist them.
A concise summary of the Book of Tawhid and Tawakkul (Divine Unity and Trust in God) by Imam al-Ghazali, from the Ihya' Ulum al-Din (Revival of the Religious Sciences) series, by ABDEL RAHMAN KARIM
In it, the Imam explains the arts of relying on God (tawakkul) to attain benefit or preserve it, and to ward off harm or end it, and he describes the states of those who put their trust in God.
Techniques to optimize the PageRank algorithm usually fall into two categories. One tries to reduce the work per iteration, and the other tries to reduce the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged has the potential to save iteration time. Skipping in-identical vertices, i.e. vertices with the same in-links, helps reduce duplicate computations and thus could help reduce iteration time. Road networks often have chains which can be short-circuited before the PageRank computation to improve performance, since the final ranks of chain nodes can be easily calculated. This could reduce both the iteration time and the number of iterations. If a graph has no dangling nodes, the PageRank of each strongly connected component can be computed in topological order. This could help reduce the iteration time and the number of iterations, and also enable multi-iteration concurrency in PageRank computation. The combination of all of the above methods is the STICD algorithm [sticd]. For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
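As a concrete illustration of the first category, here is a minimal sketch of convergence skipping in a pull-based PageRank power iteration. All names, the damping factor, and the tolerance are illustrative assumptions; freezing a vertex once its rank stops moving is an approximation, since its in-neighbours may still shift slightly afterwards.

```python
def pagerank_skip_converged(out_links, d=0.85, tol=1e-10, max_iter=100):
    """Pull-based PageRank that skips work on converged vertices.

    out_links maps each vertex to the list of vertices it links to.
    Assumes no dangling nodes (every vertex has at least one out-link).
    """
    n = len(out_links)
    rank = {v: 1.0 / n for v in out_links}
    # Build the reverse adjacency (in-links) once, so each vertex can
    # pull contributions from its in-neighbours.
    in_links = {v: [] for v in out_links}
    for u, outs in out_links.items():
        for v in outs:
            in_links[v].append(u)
    converged = set()
    for _ in range(max_iter):
        new_rank = {}
        for v in out_links:
            if v in converged:
                new_rank[v] = rank[v]  # skip: rank already stable
                continue
            s = sum(rank[u] / len(out_links[u]) for u in in_links[v])
            new_rank[v] = (1 - d) / n + d * s
        # Mark vertices whose rank moved less than tol as converged.
        for v in out_links:
            if abs(new_rank[v] - rank[v]) < tol:
                converged.add(v)
        rank = new_rank
        if len(converged) == n:
            break
    return rank
```

On a 3-cycle such as `{'a': ['b'], 'b': ['c'], 'c': ['a']}` every vertex ends up with rank 1/3, and the ranks sum to 1.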
Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. For more details, visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
2. Data Analysis
Data analysis is a process of inspecting, cleansing, transforming, and modelling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.
3. Examples of information derived from the data
- When and where each product is sold
- Quantity of a certain product to order to minimize the expiration rate
- Which features will actually improve sales or customer satisfaction
4. Types of Data Analysis
1- Descriptive Analysis
The goal of descriptive analytics is to find out what happened. For example, what was the average revenue for the month of January? How many cases of Covid per country?
In other words, it generates simple summaries of the data.
5. 2. Exploratory Analysis (EDA)
Goal: examine or explore data and find relationships between variables which were previously unknown.
7. Data Analyst Role
Data analysts mostly work with an organization's structured data. They create advanced and sophisticated visualizations to provide insights using BI tools like Tableau, Metabase, Redash, and Power BI. They do this on the fly, without IT assistance.
A large percentage of those who do data analysis don't carry a title with the words "Data" or "Analysis" in it.
9. BI Tools Features
Combine multiple data sets to create a new one.
Preparing and cleaning data for analysis.
Data Visualization
Geospatial Analysis
10. BI features
- Reporting
- What-if analysis: using past data to predict potential outcomes
- Scenario analysis
- Statistical analysis using functions like mean, median, mode, and standard deviation
- Mobile dashboards
- Integrations
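The statistical functions listed above are all available in Python's standard library; the sample data here is an illustrative assumption.

```python
import statistics

data = [4, 1, 2, 2, 3]
print(statistics.mean(data))             # 2.4
print(statistics.median(data))           # 2
print(statistics.mode(data))             # 2
print(round(statistics.stdev(data), 3))  # 1.14 (sample standard deviation)
```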
11. Data Acquisition
Process of gathering data from multiple sources such as:
- Server logs
- APIs
- Databases (could be more than one)
- File systems and Excel sheets
- IoT sensor data
- Satellite images
- Emails
12. Data preparation
Data cleaning: a very time-consuming process to get rid of inconsistent data, misspelled attributes, and duplication.
Data transformation: converting data to a structured format.
14. The 3 Vs of Data to consider
Volume
Variety
Velocity
15. Types of Data
Structured data is data that has been predefined and formatted to a set structure before being placed in data storage. The best example of structured data is the relational database: the data has been formatted into precisely defined fields, to be easily queried with SQL.
Unstructured data is data stored in its native format and not processed until it is used; it spans a myriad of file formats, including email, social media posts, presentations, chats, IoT sensor data, and satellite imagery.
17. Types of Data
Transactional Data
The purpose of transactional data is to support the day-to-day operations of the business.
Analytical Data
Analytical data is used for managerial analysis and decision making.
18. Analytical workloads are characterized by:
- Scanning over a huge number of records
- Reading from only a few columns
- Calculating aggregate statistics (such as count, sum, or average) rather than just returning the raw data to the user
20. Using a Database for Both OLTP and OLAP
At first, the same databases were used for both transaction processing and analytic queries. SQL turned out to be quite flexible in this regard: it works well for OLTP-type queries as well as OLAP-type queries.
21. When OLAP is not recommended on the operational database
- Slow analytical queries
- Protecting the customer-facing database from expensive analytical queries
- Data admins won't allow business analysts to run queries on live databases
- Multiple sources of data, from the multiple systems the enterprise might be using
23. ETL tools (Extract, Transform, Load)
ETL tools collect, read, and migrate large volumes of raw data from multiple data sources and across disparate platforms. They load that data into a single database, data store, or data warehouse for easy access.
● Hand-coding
● Batch processing during off hours (not real time)
● Real-time ETL tools capture data from and deliver data to applications in real time using distributed message queues and continuous data processing. This allows analytics tools to query Internet of Things (IoT) sensors.
25. Data Warehouses
● Relational tables
● Uses SQL
● Columnar storage
● Single source of truth (SSOT)
● Less well known, because they are primarily used by business analysts, not by end users
● Handles a much lower volume of queries than OLTP systems, but each query is typically very demanding, requiring many millions of records to be scanned in a short time
26. Data Warehouse structure
● The data modeling of data warehouses is called the star schema, also called dimensional modelling
● Materialized views, not virtual views
● Data cubes or OLAP cubes
27. Elements of Dimensional Modeling (DM)
1. Facts
2. Dimensions
DM has no many-to-many relationships, only fact-to-dimension relationships, so tables get wide.
28. Facts
Facts are the measurements/metrics or facts from your business process.
For a Sales business process, a measurement would be quarterly sales
numbers.
A fact table contains:
1. Measurements/facts
2. Foreign keys to dimension tables
3. Only numerical attributes that can be used for calculations
Fact tables can grow and be huge.
29. Dimensions
Dimension provides the context surrounding a business process event. In
simple terms, they give who, what, where of a fact. In the Sales business
process, for the fact quarterly sales number, dimensions would be
● Who – Customer Names
● Where – Location
● What – Product Name
● When – Date
● With what (How) – equipment and services
● Why
33. DM Notes
- If the customer buys several different products at once, they are represented as separate rows in the fact table.
- Date and time are often represented using dimension tables, because this allows additional information about dates (such as public holidays) to be encoded, allowing queries to differentiate between sales on holidays and non-holidays.
34. DM notes
The name “star schema” comes from the fact that when the table
relationships are visualized, the fact table is in the middle, surrounded
by its dimension tables; the connections to these tables are like the rays
of a star.
Data warehouse queries often involve an aggregate function, such as
COUNT, SUM, AVG, MIN, or MAX in SQL. If the same aggregates are
used by many different queries, it can be wasteful to crunch through the
raw data every time. Why not cache some of the counts or sums that
queries use most often?
35. Materialized Views Vs Virtual Views
The difference is that a materialized view is an actual copy of the
query results, written to disk, whereas a virtual view is just a
shortcut for writing queries.
When the underlying data changes, a materialized view needs to be updated, because it is a denormalized copy of the data. The database can do that automatically.
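The contrast can be sketched in SQLite, which has virtual views built in but no native materialized views, so the "materialized" copy is simulated here with a plain table. Names and data are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE sales (product TEXT, amount REAL);
INSERT INTO sales VALUES ('a', 10.0), ('a', 5.0), ('b', 7.0);

-- Virtual view: just a stored query, recomputed on every read.
CREATE VIEW v_totals AS
  SELECT product, SUM(amount) AS total FROM sales GROUP BY product;

-- "Materialized" view: the query results copied into a real table.
CREATE TABLE m_totals AS
  SELECT product, SUM(amount) AS total FROM sales GROUP BY product;
""")

cur.execute("INSERT INTO sales VALUES ('a', 100.0)")
cur.execute("SELECT total FROM v_totals WHERE product = 'a'")
print(cur.fetchone()[0])   # 115.0 -- the virtual view sees the new row
cur.execute("SELECT total FROM m_totals WHERE product = 'a'")
print(cur.fetchone()[0])   # 15.0  -- the materialized copy is now stale
```

The stale second result is exactly why materialized views need a refresh step when the underlying data changes.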
36. Data Cubes (also called multidimensional databases)
Data is grouped or combined in multidimensional matrices called data cubes. For example, company XYZ may create a sales data warehouse to keep records of the store's sales for the dimensions time, item, branch, and location.
42. Data Cubes
In data warehousing, the data cubes are n-dimensional. The cuboid
which holds the lowest level of summarization is called a base cuboid.
For example, the 4-D cuboid in the figure is the base cuboid for the
given time, item, location, and supplier dimensions.
The topmost 0-D cuboid, which holds the highest level of summarization, is known as the
apex cuboid. In this example, this is the total sales, or dollars sold, summarized over all
four dimensions.
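The lattice of cuboids can be enumerated directly: every subset of the dimensions defines one cuboid, so an n-dimensional cube has 2**n cuboids, from the apex (empty subset) down to the base (all dimensions). The dimension names follow the example above.

```python
from itertools import combinations

dims = ["time", "item", "location", "supplier"]
# One cuboid per subset of the dimensions, smallest subsets first.
cuboids = [combo for r in range(len(dims) + 1)
           for combo in combinations(dims, r)]

print(len(cuboids))  # 16 == 2**4
print(cuboids[0])    # () -- the apex cuboid: total over all dimensions
print(cuboids[-1])   # ('time', 'item', 'location', 'supplier') -- the base cuboid
```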
44. Summary Tables
The following table is a huge table of visits with their dates, browsers, and browser versions. The needed chart or report should answer: how many visits per browser, per version, per date?
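A sketch of that summary table, using Python's built-in sqlite3. The visits table and its column names are assumptions based on the slide's description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE visits (visit_date TEXT, browser TEXT, version TEXT);
INSERT INTO visits VALUES
  ('2024-01-01', 'Chrome',  '120'),
  ('2024-01-01', 'Chrome',  '120'),
  ('2024-01-01', 'Firefox', '121'),
  ('2024-01-02', 'Chrome',  '120');

-- Pre-aggregate once; reports then read this small table instead of
-- scanning the huge visits table.
CREATE TABLE visits_summary AS
SELECT visit_date, browser, version, COUNT(*) AS visit_count
FROM visits
GROUP BY visit_date, browser, version;
""")

cur.execute("SELECT * FROM visits_summary ORDER BY visit_date, browser")
print(cur.fetchall())
# [('2024-01-01', 'Chrome', '120', 2), ('2024-01-01', 'Firefox', '121', 1),
#  ('2024-01-02', 'Chrome', '120', 1)]
```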
46. Summary Tables
● The best material about summary tables is on the official MariaDB site
● Using summary tables can dramatically improve performance for commonly run queries
● There is nothing wrong with redundancy as long as it's controlled
● If your clients are requesting information "per day", then you can create summary tables that include information per day
47. Cases for augmenting summary tables
"Augment" in this section means to add new rows into the summary table or increment the counts in existing rows.
Plan A: "While inserting" rows into the fact table, augment the summary table(s).
Plan B: "Periodically", via cron or an EVENT.
Plan C: "As needed". That is, when someone asks for a report, first check whether the raw table has been updated since the last report datetime; if a change happened, the code first updates the summary tables that will be needed, then records this report's creation datetime.
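Plan A ("augment while inserting") can be sketched with a database trigger that bumps the summary row on each insert into the fact table. This uses SQLite via Python's sqlite3; table and column names are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE visits (visit_date TEXT, browser TEXT);
CREATE TABLE visits_summary (
  visit_date  TEXT,
  browser     TEXT,
  visit_count INTEGER,
  PRIMARY KEY (visit_date, browser)
);

-- Augment the summary on every insert: create the row if missing,
-- then increment its count.
CREATE TRIGGER augment_summary AFTER INSERT ON visits
BEGIN
  INSERT OR IGNORE INTO visits_summary
    VALUES (NEW.visit_date, NEW.browser, 0);
  UPDATE visits_summary SET visit_count = visit_count + 1
   WHERE visit_date = NEW.visit_date AND browser = NEW.browser;
END;
""")

cur.execute("INSERT INTO visits VALUES ('2024-01-01', 'Chrome')")
cur.execute("INSERT INTO visits VALUES ('2024-01-01', 'Chrome')")
cur.execute("SELECT visit_count FROM visits_summary WHERE browser = 'Chrome'")
print(cur.fetchone()[0])  # 2
```

In a high-write system, Plan B or C is often preferred over a trigger like this, since the per-insert overhead adds up.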
48. Multiple summary tables
● Look at the reports you will need.
● Design a summary table for each.
● Then look at the summary tables -- you are likely to find some similarities.
● Merge similar ones.
49. Part 3 Data Analysis and SaaS
Data Analysis with SaaS
- The Problem.
- Scenario
- Hand Coding + Client side Visualization.
- Scenario
- Headless BI + Client side Visualization.
- Embedding and integration
- Tableau
- DevExpress, Telerik
- Power BI